Sampling and sampling
techniques
Dr. Moumita Pal
MBBS, DPH, MD
Dept. of Community Medicine
College of Medicine and Sagore Dutta Hospital
Sampling
It is the process of choosing a representative
sample from a target population and
collecting data from that sample in order to
understand something about the population
as a whole.
• Universe (whole population): the entire group under
study. It represents the complete set of individuals,
objects or scores in which we are interested.
• Sampling unit: each member of the whole
population
• Sampling frame: a list of all sampling units in the
whole population, from which the sample is drawn
• Sample: a small representative part of the whole
population
Why sampling?
• To cut down financial cost for data collection,
processing and reporting.
• To save time and other resources
• Information collected from a properly drawn sample
can be accurate, i.e. valid and reliable
• Target population: the entire population, e.g. the adult population
• Study population: a subset of the target population, e.g. adults in the field practice area
• Study subjects: the sample drawn from the study population, e.g. the selected adults
Sample size calculation
• For descriptive study
• For analytical study
Sampling technique
• 1. Probability
• 2. Non probability
Non probability
• The sample is selected deliberately by the researcher
at his or her own discretion
• 1. Purposive sampling (judgmental sampling):
participants are selected on the researcher's judgment
that they can best provide the required information.
• 2. Convenience sampling: participants are
selected on the basis of easy accessibility.
• 3. Self-selection sampling: participants volunteer
to take part in the research on their own initiative.
• 4. Snowball sampling (network sampling): each study
subject is asked to identify other persons with the same
exposure in question, who then become the next
subjects. Used when the target population is hidden,
e.g. people with HIV/AIDS, drug users, sex workers etc.
• 5. Quota sampling: the researcher is given quotas to fill
from different strata of the population, keeping the
quota proportions the same as those observed in the
population. E.g. if the population is 60% Hindu and 40%
Muslim, the researcher can choose participants at his or
her own discretion but in the same ratio.
Probability sampling
• Generally superior to non-probability sampling.
• Obeys the laws of probability and is based on the
concept of random selection
• Also known as random sampling or chance
sampling
Simple random sampling
• Each member of the population has an equal chance
of being chosen (this gives the best chance, though no
guarantee, that the sample is representative of the
population)
• Applicable when the population is small,
homogeneous and readily available.
• A complete list of the population must be available
as the sampling frame
• First, all sampling units are assigned numbers
• Then the sample is selected by random number
table or lottery method.
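The numbering and lottery steps above can be sketched in Python as follows. This is only an illustration: the frame of 100 numbered units, the sample size of 10 and the function name simple_random_sample are assumptions, and the standard library's random module stands in for the random number table.

```python
import random

def simple_random_sample(frame, sample_size, seed=None):
    """Draw sample_size units from the sampling frame without replacement.

    Every unit in the frame has the same chance of being chosen.
    """
    rng = random.Random(seed)  # seed is optional, only for reproducibility
    return rng.sample(frame, sample_size)

# Hypothetical sampling frame: 100 sampling units numbered 1..100
frame = list(range(1, 101))
sample = simple_random_sample(frame, 10, seed=1)
print(sorted(sample))
```

Because the draw is without replacement, no unit can appear in the sample twice.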
Systematic random sampling
• For a large, scattered, heterogeneous population
• All sampling units are assigned numbers
• A random starting point between 1 and n is chosen
first, then every nth unit is selected.
• n is the sampling interval = total population/sample
size
• The 1st unit is selected at random and the others
systematically, at every nth unit
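A minimal sketch of the steps above, assuming a hypothetical frame of 100 numbered units and a sample of 10 (so the sampling interval is 10). The function name systematic_sample is illustrative, and the interval is rounded down when the population is not an exact multiple of the sample size.

```python
import random

def systematic_sample(frame, sample_size, seed=None):
    """Pick a random start in the first interval, then every nth unit."""
    interval = len(frame) // sample_size   # sampling interval n
    rng = random.Random(seed)
    start = rng.randrange(interval)        # 0-based position within the first interval
    return [frame[start + i * interval] for i in range(sample_size)]

# Hypothetical frame: units numbered 1..100, sample of 10, so n = 10
frame = list(range(1, 101))
picked = systematic_sample(frame, 10, seed=7)
print(picked)
```

Only the first unit is random; once it is fixed, the rest of the sample is fully determined.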
Stratified random sampling
• For a heterogeneous population (when we want to
know the distribution according to a particular
variable)
• First the heterogeneous population is divided into
small homogeneous groups, called strata
• From each stratum the required number of sample
units is taken by simple or systematic random
sampling, in proportion to the stratum's original size
• Strata should be mutually exclusive and
exhaustive
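Proportional allocation can be sketched as below. The stratum labels and sizes (a population of 3000 split 2400/300/150/120/30, sample of 300) are hypothetical, and the function names are illustrative. Simple rounding is used here, so for other figures the allocated numbers may not add up exactly to the intended sample size.

```python
import random

def proportional_allocation(strata_sizes, sample_size):
    """Number of units to draw from each stratum, in proportion to its size."""
    total = sum(strata_sizes.values())
    return {name: round(sample_size * size / total)
            for name, size in strata_sizes.items()}

def stratified_sample(strata, sample_size, seed=None):
    """Simple random sample within each stratum, using proportional allocation."""
    rng = random.Random(seed)
    sizes = {name: len(units) for name, units in strata.items()}
    allocation = proportional_allocation(sizes, sample_size)
    return {name: rng.sample(units, allocation[name])
            for name, units in strata.items()}

# Hypothetical strata sizes: 3000 people in five mutually exclusive strata
sizes = {"A": 2400, "B": 300, "C": 150, "D": 120, "E": 30}
allocation = proportional_allocation(sizes, 300)
print(allocation)

# Hypothetical unit labels so the within-stratum draw can be shown
strata = {name: [f"{name}{i}" for i in range(size)] for name, size in sizes.items()}
drawn = stratified_sample(strata, 300, seed=3)
```

Each stratum contributes the same 10% sampling fraction, so even the smallest stratum is represented.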
Cluster sampling
• The population of interest is divided into geographically
distinct groups (clusters)
• Used when the units of the population form natural groups
or clusters such as blocks, wards, villages, slums etc. When
the clusters are geographical areas it is called area sampling
• The 30 cluster sampling technique: a 30*7 sample,
developed by WHO
• From the list of all clusters, select 30 clusters = 1st step
• Selection of 7 interview sites in each selected cluster = 2nd step
• This is 2-stage sampling: primary sampling unit /
secondary sampling unit
• Used to evaluate immunization coverage of
districts, attitudes of people towards
immunization, contraception, intervention
programmes etc.
• ADVANTAGES: suitable for a large geographical area
where no list of households is available; saves time
and cost; the 30*7 design needs only 210 subjects
• DISADVANTAGES: gives a higher standard error
than other sampling designs of the same sample size
• Selection of clusters from the primary sampling
units:
• 1. simple/systematic random sampling
• 2. probability proportionate to population size
(PPS)
Probability proportionate to
population size (PPS):
• A list of villages, towns or wards with their respective
population/household numbers is prepared
• Say 10 clusters have to be selected from a list of 30
• The cumulative population of the 30 clusters is
calculated and divided by 10 = sampling interval (SI)
• One random number equal to or less than SI is selected
from a random number table = random start (RS)
• The village/town whose cumulative population first
equals or exceeds RS is the 1st cluster
• Each later cluster is the village/town whose cumulative
population first equals or exceeds the value below:
• 1st cluster = RS
• 2nd cluster = RS + (1*SI)
• 3rd cluster = RS + (2*SI)
• 4th cluster = RS + (3*SI)
• ………
• 10th cluster = RS + (9*SI)
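The PPS steps can be sketched in Python as follows. The six village populations, the choice of 4 clusters and the random start of 250 are made-up numbers for illustration, and pps_select is a hypothetical helper name. The sampling interval is taken as a whole number (rounded down), which is an assumption of this sketch.

```python
import random
from bisect import bisect_left
from itertools import accumulate

def pps_select(populations, n_clusters, random_start=None, seed=None):
    """Select clusters with probability proportionate to population size.

    Returns (sampling interval, random start, indices of selected clusters).
    A very large cluster can be selected more than once.
    """
    cumulative = list(accumulate(populations))      # cumulative population list
    interval = cumulative[-1] // n_clusters         # sampling interval (SI)
    if random_start is None:
        random_start = random.Random(seed).randint(1, interval)  # RS <= SI
    targets = [random_start + k * interval for k in range(n_clusters)]
    # First cluster whose cumulative population equals or exceeds each target
    chosen = [bisect_left(cumulative, target) for target in targets]
    return interval, random_start, chosen

# Hypothetical list of six villages and their populations
populations = [100, 300, 200, 400, 500, 500]
si, rs, chosen = pps_select(populations, n_clusters=4, random_start=250)
print(si, rs, chosen)  # 500 250 [1, 3, 4, 5]
```

With cumulative populations 100, 400, 600, 1000, 1500, 2000 and targets 250, 750, 1250, 1750, the villages at positions 1, 3, 4 and 5 (counting from 0) are selected. Larger villages cover a longer stretch of the cumulative list, so they are more likely to be hit.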
Selection of individual/household
• 1. simple one-stage cluster sample: clusters are
selected, then all units in each selected cluster are included
• 2. simple two-stage cluster sample: 1st stage:
clusters selected, 2nd stage: simple/systematic random sampling
• 3. multi-stage sample: more than 2 stages are
involved, e.g.
1st stage: clusters selected
2nd stage: stratified clusters
3rd stage: simple/systematic random sampling
Immunization coverage survey
• Children aged 12-23 months are covered
in each cluster
• The survey continues in each cluster until 7 children
are found
• Total no. of children surveyed:
7*30=210
• If all 210 children are found fully immunized then
immunization coverage = 210/210*100 = 100%
• If, say, 150 are found fully immunized then
150/210*100 = 71.4%
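The coverage arithmetic can be checked with a few lines of Python. The function name coverage_percent is illustrative; the defaults of 30 clusters and 7 children per cluster follow the 30*7 design.

```python
def coverage_percent(fully_immunized, clusters=30, children_per_cluster=7):
    """Immunization coverage (%) among the children surveyed."""
    surveyed = clusters * children_per_cluster   # 30 * 7 = 210
    return 100 * fully_immunized / surveyed

print(round(coverage_percent(210), 1))  # 100.0
print(round(coverage_percent(150), 1))  # 71.4
```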
Multistage sampling
• Carried out in several stages, in large country-wide
surveys (e.g. anemia/hookworm survey)
• Any type of probability sampling technique
can be applied at each stage
• E.g. India: 5 states, then 3 districts in each state,
then 2 blocks in each district
• Reduces the workload
Multiphase sampling
• Part of the information is collected from the whole
sample and part from a subsample
• E.g. 20 fever cases: clinical examination + basic
blood tests for all; Widal/MP test only for those
with high ESR
• Less costly/less laborious
Lot quality assurance sampling (LQAS)
• The technique was developed in the 1920s to control
the quality of output in industrial production
processes.
• Used in the health sector to identify communities with
unacceptably low immunization, worrying levels of
disease prevalence etc.
• Does not give the exact prevalence but the
probability that a particular area has an inadequate
level of immunization or a high disease prevalence
• Whole district = supervision unit
• Each community = supervision area
• A minimum of 19 items is chosen from each
supervision area (gives an acceptable level of error)
• Total sample size across all supervision areas = 95 or
more
• 5-6 supervision areas are ideal
• Can be used to assess binary outcomes only
• Expressed as % of clients who received a
service in a defined period of time.
• Good= maintain program at current level,
identify best practices to help other programs
• Below average= identify reasons, develop
solutions
• Advantage/disadvantage
• SAMPLING BIAS: unless the sampling method ensures that
all members of the universe have a chance of selection into
the sample, bias is possible. The best way to avoid it is to
use probability sampling.
• DESIGN EFFECT: a coefficient which reflects how the
sampling design affects the variance of an estimate, and so
the computation of significance levels, compared to simple
random sampling.
• A design effect of 1.0 means the sampling
design is equivalent to simple random sampling.
• A design effect greater than 1.0 means the sampling
design reduces the precision of the estimate compared to
simple random sampling (e.g. cluster sampling).
• A design effect less than 1.0 means the sampling design
increases precision compared to simple random sampling
(e.g. stratified sampling).
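As a rough sketch, the design effect can be computed as the ratio of the variance of an estimate under the actual design to its variance under simple random sampling of the same size, and the sample size divided by the design effect gives an "effective" sample size. The variances and the design effect of 2 used below are hypothetical numbers, and the function names are illustrative.

```python
def design_effect(var_design, var_srs):
    """Variance under the actual design relative to simple random sampling."""
    return var_design / var_srs

def effective_sample_size(n, deff):
    """Size of a simple random sample giving the same precision."""
    return n / deff

# Hypothetical variances: the cluster design doubles the variance
deff = design_effect(var_design=4.0, var_srs=2.0)
print(deff)                               # 2.0
print(effective_sample_size(210, deff))   # 105.0
```

Under these assumed figures, a cluster sample of 210 is only as precise as a simple random sample of 105.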
True/false
• In 30*7 cluster sampling 210 children are
surveyed
• Sample is a part of universe
• Stratified random sampling is applicable in
heterogeneous population
• Sample size in cluster sampling is less than in
simple random sampling
• Simple random sampling is used for scattered
heterogeneous population
True about simple random sampling
1. Every person has an equal chance of
selection
2. Less no. of sample is obtained
3. Also known as systematic random sampling
4. Groups are not equally distributed
For a survey, a village is divided into 5
lanes and each lane is then sampled
randomly. This is an example of:
1. simple random sampling
2. Systematic random sampling
3. Stratified random sampling
4. All of the above
Which is true of cluster sampling:
1. Every nth case is chosen for the study
2. A natural group is taken as sampling unit
3. Stratification of population has been done
4. Involves use of random number
Immunization status in an area is
checked by
1. Simple random sampling
2. Systematic random sampling
3. Stratified random sampling
4. Cluster sampling
In a community of 3000 people, 80%
Hindu, 10% Muslim, 5% Sikh, 4%
Christians and 1% Jain. To select a
sample of 300 people to analyze food
habits, ideal sampling would be
1. Simple random
2. Stratified random
3. Systematic random
4. Cluster
True about simple random sampling is
1. Each person has a known and equal chance
of being selected
2. Number 2 consecutive members are selected
3. Error most frequent
4. Adjacent samples should not be chosen
The cluster sampling technique used in
evaluation of UIP coverage
1. 20 cluster 5 children
2. 30 cluster 5 children
3. 30 cluster 7 children
4. 30 cluster 10 children
In a village every 5th house was
selected for study. Which type of
sampling is this?
1. Simple random
2. Systematic random
3. Stratified random
4. Any of the above
When part of the information is collected
from the whole sample and part from a
subsample, it is called
1. Simple random
2. Cluster
3. Multiphase sampling
4. Multistage sampling
All are examples of probability sampling
except
1. Cluster
2. Convenience
3. Sequential
4. Stratified random