COMPARISON OF SAMPLING STRATEGIES
Type and Definition
Description of Steps
Advantages
Disadvantages
I. Simple random sampling: when each individual in a defined
population has an equal and independent chance of being
selected into the sample.
Description of Steps:
1. Assign each member of the population a unique number.
2. Use random numbers (random number table, dice, computer,
etc.) to select the required number of sample members.
Advantages:
1. Maximum external validity, assuming a reasonably small
refusal rate
2. Requires minimum knowledge of the population
characteristics in advance
3. Free of possible classification errors
4. Very simple to implement
5. Easy to analyze data and compute error
Disadvantages:
1. Researcher must have a complete population list (often
difficult to obtain)
2. Doesn’t use knowledge of the population that the researcher
may have
3. For the same sample size, produces larger sampling error
than stratified random sampling
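The two steps of simple random sampling can be sketched in Python. This is a minimal illustration, not a prescribed procedure: the function name `simple_random_sample`, the seed, and the population and sample sizes are assumptions chosen for the example.

```python
import random


def simple_random_sample(population_size, sample_size, seed=None):
    """Return sorted ID numbers (1..population_size) drawn without replacement."""
    rng = random.Random(seed)
    # Step 1: number every member of the population from 1 to N.
    ids = range(1, population_size + 1)
    # Step 2: draw n distinct IDs at random; each member has the same chance n/N.
    return sorted(rng.sample(ids, sample_size))


# Illustrative sizes only (hypothetical): N = 5,000 and n = 500.
sample = simple_random_sample(population_size=5000, sample_size=500, seed=1)
print(len(sample), sample[:5])
```

Fixing the seed makes the draw reproducible, which helps when the selection has to be documented or audited.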
II. Systematic random sampling: when each individual in a
defined population has an equal (but not independent) chance of
being selected into the sample
Description of Steps:
1. Compute the sampling interval r = N/n, where
N = number in population
n = number needed in sample
and round up to an integer
2. Randomly select a starting number
3. Select every rth individual
Advantages:
1. Maximum external validity, assuming no ordering in the list
or file of names
2. Very simple and quicker than simple random sampling because
there is no need for a numbered list
3. Easy to analyze data and compute error
Disadvantages:
1. If the sampling interval is related to a periodic ordering of
the list, increased variability may be introduced
2. Estimates of error are likely to be high where the list is ordered
3. May produce errors if N is miscalculated initially
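The systematic procedure (interval, random start, every rth individual) can be sketched as follows. The function name `systematic_sample`, the example list, and the choice to draw the start from the first r positions are assumptions made for illustration.

```python
import math
import random


def systematic_sample(population, sample_size, seed=None):
    """Select every r-th individual from an ordered list after a random start."""
    rng = random.Random(seed)
    # Step 1: sampling interval r = N / n, rounded up to an integer.
    r = math.ceil(len(population) / sample_size)
    # Step 2: random start among the first r positions (0-based index).
    start = rng.randrange(r)
    # Step 3: take every r-th individual from the start onward.
    return population[start::r]


# Hypothetical list of 100 names, from which 10 are needed (so r = 10).
names = [f"person_{i}" for i in range(1, 101)]
picked = systematic_sample(names, sample_size=10, seed=3)
print(picked)
```

When N/n is not a whole number, rounding r up can return slightly fewer than n individuals (for example, N = 10 and n = 3 gives r = 4 and either 2 or 3 selections, depending on the start).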
III. Multistage random sampling: when each individual in
randomly sampled units has an equal chance of being selected
into the sample.
Description of Steps:
1. Use random sampling (I or II) to select some sampling units
(companies, schools, classes, etc.)
2. Use random sampling (I or II) to select individuals from each
sampling unit.
Advantages:
1. Sampling lists, identification, and numbering are required only
for members of the selected sampling units; especially
advantageous with large or difficult-to-enumerate populations
2. If sampling units are geographically defined, this reduces
data collection costs
3. High external validity
Disadvantages:
1. Sampling error larger than I or II for the same sample size
2. Sampling error increases as the number of units sampled in
the first stage decreases.
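A two-stage version of this design can be sketched as below. It assumes simple random sampling (I) at both stages; the function name `multistage_sample`, the school data, and the sizes are hypothetical.

```python
import random


def multistage_sample(units, n_units, n_per_unit, seed=None):
    """units maps a sampling-unit name to the list of its members."""
    rng = random.Random(seed)
    # Stage 1: randomly select some sampling units.
    chosen_units = rng.sample(sorted(units), n_units)
    # Stage 2: randomly select individuals within each chosen unit;
    # only the chosen units need a member list.
    return {
        unit: rng.sample(units[unit], min(n_per_unit, len(units[unit])))
        for unit in chosen_units
    }


# Hypothetical population: 8 schools with 30 students each.
schools = {
    f"school_{s}": [f"s{s}_student_{i}" for i in range(30)] for s in range(8)
}
sample = multistage_sample(schools, n_units=3, n_per_unit=5, seed=7)
for school, students in sample.items():
    print(school, students)
```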
IV. Stratified random sampling
a. Proportionate:
when each individual in purposively defined strata has an equal
and independent chance of being selected into the sample
Description of Steps:
1. Divide the population list into strata on the basis of the
relevant characteristic(s)
2. Randomly select from each stratum a number of sample
members proportionate to the size of that stratum
Advantages:
1. Assures representativeness of the sample with respect to the
stratification variable
2. Decreases the chance of failing to have a sufficient number
of members of subgroup(s) needed for the desired analysis
3. Less extraneous variability than I-III
4. Medium-high external validity
Disadvantages:
1. Requires accurate information on the proportion of the
population in each stratum; otherwise, increased error
2. May be costly and time-consuming to obtain a stratified
population list
3. Possibility of faulty classification, which creates higher
random variance
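Proportionate allocation can be sketched as follows: each stratum contributes a share of the sample equal to its share of the population. The function name, the two strata, and their sizes are assumptions for illustration only.

```python
import random


def proportionate_stratified_sample(strata, sample_size, seed=None):
    """strata maps a stratum name to the list of its members."""
    rng = random.Random(seed)
    total = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        # Allocation proportional to stratum size; with awkward proportions,
        # rounding can shift the overall total by one or two.
        k = round(sample_size * len(members) / total)
        sample[name] = rng.sample(members, k)
    return sample


# Hypothetical strata: 600 metro and 400 non-metro households.
strata = {
    "metro": [f"m{i}" for i in range(600)],
    "non_metro": [f"r{i}" for i in range(400)],
}
sample = proportionate_stratified_sample(strata, sample_size=100, seed=5)
print({name: len(members) for name, members in sample.items()})
# prints {'metro': 60, 'non_metro': 40}
```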
b. Disproportionate:
when each individual in purposively defined strata has an
unequal but random chance of being selected into the sample
Description of Steps:
1. Same as IV.a
2. Randomly select from each stratum a number of sample
members disproportionate to the size of each stratum (i.e., one
or more strata “overrepresented”)
Advantages:
1. More efficient than IV.a for comparing across strata (fewer
total number required)
2. Assures having a sufficient number of a low-incidence
subgroup of the population
3. Medium external validity
Disadvantages:
1. Same as IV.a on strata
2. Less efficient than IV.a for point estimates for the entire
population
3. Must apply sampling weights prior to statistical analysis,
which makes the data analysis more complex
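The need for sampling weights can be illustrated with a small sketch. It assumes the common design weight N_h / n_h (stratum population size divided by stratum sample size); the function names, strata, and values are hypothetical.

```python
def design_weights(stratum_sizes, sample_sizes):
    """Weight for stratum h = N_h / n_h (population size over sample size)."""
    return {h: stratum_sizes[h] / sample_sizes[h] for h in stratum_sizes}


def weighted_mean(samples, weights):
    """Estimate the population mean from per-stratum sample values."""
    numerator = sum(weights[h] * sum(values) for h, values in samples.items())
    denominator = sum(weights[h] * len(values) for h, values in samples.items())
    return numerator / denominator


# Hypothetical design: the rare stratum (100 of 1,000 people) is oversampled.
stratum_sizes = {"common": 900, "rare": 100}
sample_sizes = {"common": 50, "rare": 50}
weights = design_weights(stratum_sizes, sample_sizes)

# Made-up measurements: every "common" value is 10, every "rare" value is 20.
samples = {"common": [10.0] * 50, "rare": [20.0] * 50}
estimate = weighted_mean(samples, weights)
unweighted = sum(sum(v) for v in samples.values()) / 100
print(weights, estimate, unweighted)
# prints {'common': 18.0, 'rare': 2.0} 11.0 15.0
```

The unweighted mean (15.0) overstates the population mean (11.0) because the overrepresented stratum counts as much as the large one; the weights restore each stratum to its population share.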
V. Cluster (or area probability) sampling
a. Simple:
when each individual in randomly selected clusters has an
equal and independent chance of being selected into the sample
Description of Steps:
1. Randomly select clusters or geographical areas (e.g., states,
counties, census tracts) by some form of random sampling (I or
II)
2. Include all members of each cluster in the sample (i.e.,
enumeration)
Advantages:
1. Has the lowest interviewer data collection costs of all
probability sampling methods
2. Requires listing of only the individuals within the sampled
clusters (or areas), which reduces time and money costs
3. Characteristics of clusters can also be used in research/data
analysis (or the cluster can be used as the unit of analysis)
Disadvantages:
1. Larger errors for comparable n than other probability samples
2. Requires unique assignment of each individual to exactly one
cluster; inability to do so results in duplication and/or omission
of individuals
3. Medium external validity
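Simple cluster sampling differs from multistage sampling in its second step: every member of a selected cluster is included. A minimal sketch, with a hypothetical function name and made-up census tracts:

```python
import random


def cluster_sample(clusters, n_clusters, seed=None):
    """clusters maps a cluster name to the list of its members."""
    rng = random.Random(seed)
    # Step 1: randomly select whole clusters.
    chosen = rng.sample(sorted(clusters), n_clusters)
    # Step 2: include ALL members of each chosen cluster (enumeration).
    return [member for name in chosen for member in clusters[name]]


# Hypothetical county: 10 census tracts with 4 households each.
tracts = {f"tract_{t}": [f"t{t}_hh_{i}" for i in range(4)] for t in range(10)}
sample = cluster_sample(tracts, n_clusters=2, seed=11)
print(sample)
```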
b. Stratified:
when each individual in randomly selected clusters within
purposively defined strata has an equal and independent
chance of being selected into the sample
Description of Steps:
1. Divide clusters into strata by stratum characteristics
2. Randomly select clusters from within each stratum
3. Include all members of each cluster in the sample (i.e.,
enumeration)
Advantages:
1. Reduced variability compared to V.a; more efficient for
comparison by strata
2. Comes closer than V.a to assuring the researcher the ability
to make relevant comparisons of clusters across the different
strata
Disadvantages:
1. Disadvantages of stratified sampling added to those of simple
cluster sampling (compounds distance from simple random sampling)
2. Cluster properties may change after characteristics are
measured
3. Medium to low external validity
VI. Quota sampling
When only a predetermined proportion of population members
with specified characteristic(s) has a chance of being selected
as a subject
Description of Steps:
1. Classify population members by some relevant variable(s)
2. Determine the proportion of the sample desired with the
relevant characteristic(s)
3. Fix a quota of subjects with the desired characteristic(s) for
each observer/data collector
Advantages:
1. Reduces costs of obtaining sample members and, perhaps, of
data collection
2. May introduce some stratification effect (but the researcher
won’t know for sure until after data collection/analysis)
3. If the third step is done randomly, this may make it more like
a stratified sample (but one still can’t really be sure it is)
Disadvantages:
1. Variability and bias of estimates can’t be measured or
adjusted for
2. Possible bias from researchers’ misclassification of subjects
3. Introduces biases of nonrandom selection by observers/data
collectors that differ by observer
4. Low external validity
VII. Judgment or purposive sampling
When only purposively selected individuals have a chance of
being selected as a subject
Description of Steps:
1. Select subgroup(s) of the defined population that, on the basis
of the best available information, are judged to be representative
of the target population
2. Enumerate, select, or recruit individuals from the subgroup(s)
Advantages:
1. Reduces costs of obtaining sample members and, perhaps, of
data collection, since this is typically done where the subgroups
are geographically proximate
2. Quick
Disadvantages:
1. Variability and bias of estimates can’t be measured or
adjusted for
2. Requires strong assumptions about the population and its
subgroup(s)
3. Violates all assumptions of all statistical techniques
4. Very low external validity
VIII. Convenience/snowball/volunteer sampling
When individuals become subjects by convenience, referral, or
by volunteering
Description of Steps:
No real method. Subjects are selected or recruited for the
researchers’ convenience and to minimize cost and/or time
Advantages:
1. Reduces costs of obtaining sample members and, perhaps, of
data collection
2. Very quick
Disadvantages:
1. Violates all assumptions of all statistical techniques
2. No external validity; very high probability that the sample is
NOT representative of any population
Adapted from Ackoff, R. L. (1953). The design of social
research. Chicago: University of Chicago Press.
CASE STUDY: Any Kind of Check Won’t Do
FACTS: In the 1990s, D. J. Rivera, a “financial advisor” and
Salvatore Guarino, a cohort of Rivera, sold John G. Talcott, Jr.,
a 93-year-old Massachusetts resident, an investment of $75,000.
The investment produced no returns. On January 10, 2000,
Rivera telephoned Talcott and talked him into sending him a
check for $10,000 made out to Guarino, which was to be used
for travel expenses to obtain a return on the original $75,000
investment. Rivera received the check on January 11. Talcott
spoke to Rivera on the morning of January 11. Rivera indicated
that $10,000 was more than what was needed for travel. He said
that $5,700 would meet the travel costs. Talcott called his bank
and stopped payment on the $10,000 check. Guarino went to
Any Kind’s Stuart, Florida, office (a place where he had
established check-cashing privileges) on January 11 and
presented the $10,000 check to Nancy Michael, a supervisor.
Guarino showed Michael his driver’s license and the Federal
Express envelope from Talcott in which he had received the
check. Based on her experience, Michael believed the check was
good; the Federal Express envelope was “very crucial” to her
decision because it indicated that the maker of the check had
sent it to the payee trying to cash the check. After deducting the
5 percent check-cashing fee, Michael cashed the check and gave
Guarino $9,500. The next day she deposited the check in the
company’s bank. On January 15, 2000, Talcott sent a check for
$5,700. On January 17, 2000, Guarino went into the Stuart Any
Kind store and presented the $5,700 check to the teller, Joanne
Kochakian. Kochakian noticed that Michael had previously
approved the $10,000 check. She called Michael and told her
about Guarino’s check. Michael instructed the cashier not to
cash the check until she had contacted the maker, Talcott, to
obtain approval. Talcott approved cashing the $5,700 check.
There was no discussion of the $10,000 check. Any Kind cashed
the second check for Guarino, from which it deducted a 3
percent fee. On January 19, Rivera called Talcott to warn him
that Guarino was a cheat and a thief. Talcott immediately called
his bank and stopped payment on the $5,700 check. Talcott’s
daughter called Any Kind and told it of the stop payment on the
$5,700 check. Any Kind filed suit against Guarino and Talcott,
claiming that it was a holder in due course. The trial court
entered judgment for Any Kind for only the $5,700 check. The
court found that the circumstances surrounding the cashing of
the $10,000 check were suspicious and should have put Any
Kind on notice of a problem and that Any Kind was not a holder
in due course of that check.
DECISION: The events and
circumstances were sufficient to put Any Kind on notice of
potential defenses. The circumstances of a person describing
himself as a broker, receiving funds in the amount of $10,000,
and negotiating the check for those funds at a $500 discount
were sufficient to put Any Kind on inquiry notice that some
confirmation or explanation should be obtained. Any Kind
should have approached the $10,000 check with additional
caution, beyond the FedEx envelope, and should have verified it
with the maker if it wanted to preserve its holder-in-due-course
status. Affirmed.
Question: Write a two-page paper on whether or not you agree
with the Court’s decision. Is it fair? In your opinion, is Any
Kind a holder in due course (HDC)?
FCS 681 Research Methods
Exercise 4: Sampling
1. A researcher plans a study of housing quality of low-income
households in Y County, CA. He needs a sample of 500
households to accomplish his purpose. He ascertains from the Y
county Housing Authority that there are 5,000 households living
in public housing (requiring low income for eligibility) in the
county. He obtains a list of these households’ names and
addresses, numbers them from 1 to 5,000, and chooses 500 of
these households using a computer-generated list of random
numbers. He tries to collect data from these 500 households via
a mailed questionnaire.
a. To what population does this researcher wish to generalize?
b. What is the sampling frame in this study?
c. What type of sampling does this researcher do (be precise)?
d. Describe the chance that each household in the sampling
frame has of ending up in the sample.
e. How well does this sampling frame reflect his stated
population? Why?
f. What could he do if he wanted to improve the external
validity of the research?
2. A researcher wants to study L.A. public university seniors’
career choices and plans to collect data from CSUN, UCLA, and
CSULA. The researcher receives permission to use the
registration records on the three campuses. CSUN has 8,000
seniors, UCLA has 10,000 seniors, and CSULA has 5,000
seniors. He needs 500 seniors in his study, so on each campus’s
list, he randomly picks a name to start with and then selects
each rth senior on the list until he has 500 seniors drawn.
a. What is the theoretical population to which the researcher
wishes to generalize?
b. What is the accessible population in this study?
c. What type of sampling plan does this study utilize (be
precise)?
d. What is the value of r that the researcher should use (show
your work)?
e. What is the number of seniors that will be obtained from each
campus?
f. Describe the chance that each senior has of ending up in this
sample.
g. What is the advantage of this sampling plan over a simple
random sampling plan?
h. What must this researcher ascertain before he can be
reasonably confident that using this sampling plan will produce
a representative sample?
3. A researcher wishes to investigate health conditions of the
elderly (aged 65+) householders in Z County, CA. Previous
research suggests that elders’ place of residence (metro (a.k.a.
urban) versus non-metro (rural)) is an important variable
affecting their health conditions. So, in 2008, from a county
map, she randomly selects census tracts [1], of
which the county has 155 (100 metro and 55 non-metro as
defined in the U.S. Census Bureau 2000 census). She selects
10% of the “metro” tracts and 10% of the “non-metro” tracts.
Data collectors are sent to each selected tract and instructed to
interview each eligible householder within each tract until they
all have been interviewed.
[1] A census tract, census area, or census district is a particular
community defined for the purpose of taking a census.
a. What is the theoretical population to which this researcher
wishes to generalize?
b. What is the sampling frame in this study?
c. What type of sampling plan does this researcher utilize (be
precise)?
d. Describe the chance each elderly householder in this county
has of ending up in the sample.
e. In your opinion, are there any problems with the sampling
plan?
4. Researchers wanted a sample of a state’s population of two-
parent families with exactly two children under 18 years old. An
important variable in their study was age of the younger child in
the family. The state has 100 counties; they randomly selected 5
of these from the list of the state’s counties. They conducted a
school census in these 5 counties, in which they measured the
number of parents in each child’s home, the number of children,
and the ages of the children in the family. They retained only
those children who had exactly two parents and one sibling.
They divided families into groups that had younger children of
five different ages: under 1, one year old, 2-5 years old, 5-11
years old, and 12-17 years old. Each of these lists had a
different number of families. They randomly selected 42 of the
families on each list (for a total sample of 210 families).
a. What was the theoretical population to which the researchers
wanted to generalize?
b. What were the sampling frames in this study?
c. What type of sampling did they use (describe it as completely
as possible)?
d. What is the age of the younger child variable called?
e. Why do you think these researchers did not plan a simple
random sampling plan?
5. A team of researchers wants to analyze the incidence of
unresolvable car repair complaints among California consumers
in the past 12 months. They contact the California Department
of Consumer Affairs and obtain a list of the problems that CA
consumers complained to their office about in the past 12
months. They divide the complaints into those that relate to
automobile repairs and all others. Then they use the automobile
repair complaints as their sample.
a. What type of sampling do these researchers use?
b. Evaluate the external validity of this sampling plan, given the
researchers’ purpose.