Sampling
Chapter 6
Introduction
Sampling is the process of selecting observations
Often not possible to collect information from all units you wish to study
Often not necessary to collect data from everyone out there
Allows researcher to make a small subset of observations and then generalize to the rest of the population
The Logic of Probability Sampling
Sample: a group of subjects selected from a population
Probability sampling: a method of selection in which each member of a population has a known chance of being selected
Enables us to generalize findings from observing cases to a larger unobserved population
Because members of a population are not completely homogeneous, a sample must be representative of the variations that exist among them
Conscious and Unconscious Sampling Bias
Be conscious of bias: a sample is biased when it is not fully representative of the larger population from which it was selected
Sampling bias is not always obvious
Use techniques to help avoid bias
Representativeness and Probability of Selection
A sample is representative of the population from which it is selected if the aggregate characteristics of the sample closely approximate the same aggregate characteristics in the population
Samples in which all members of the population have an equal chance of being included are often labeled equal probability of selection method (EPSEM) samples
Sampling Terminology 1
Sample Element: who or what we are studying (e.g., a student)
Population: whole group (college freshmen)
Population Parameter: summary description of a given variable in a population
Sample Statistic: summary description of a given variable in a sample; we use sample statistics to make estimates or inferences of population parameters
Sampling Terminology 2
Sampling distribution: a range of sample statistics we obtain if we select many samples from a population
Sampling frame: actual list of units to be selected (our school’s enrollment list)
Binomial variable: a variable with only two values
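A small simulation can make the idea of a sampling distribution concrete. The sketch below is only an illustration: it assumes a hypothetical population of 10,000 elements with a binomial variable coded 1 or 0 (split 50/50), and the function name, sample size, and number of samples are invented for the example.

```python
import random
import statistics

def sampling_distribution(population, sample_size, num_samples, seed=0):
    """Draw many simple random samples; return the sample proportion from each."""
    rng = random.Random(seed)
    proportions = []
    for _ in range(num_samples):
        sample = rng.sample(population, sample_size)  # selection without replacement
        proportions.append(sum(sample) / sample_size)  # the sample statistic
    return proportions

# Hypothetical population: 10,000 elements, half coded 1 and half coded 0.
population = [1] * 5000 + [0] * 5000
parameter = sum(population) / len(population)  # the population parameter

proportions = sampling_distribution(population, sample_size=100, num_samples=1000)
print("population parameter:", parameter)
print("mean of the sample statistics:", round(statistics.mean(proportions), 3))
print("spread of the sample statistics:", round(statistics.stdev(proportions), 3))
```

The individual sample statistics vary from sample to sample, but they cluster around the population parameter; that clustering is what the sampling distribution describes.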
Sampling Terminology 3
Standard error: a measure of sampling error; probability theory lets us estimate the degree of error to be expected
Confidence Levels and Confidence Intervals
Two key components of sampling error estimates
We express the accuracy of our sample statistics in terms of a level of confidence that the statistics fall within a specified interval from the parameter
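As a worked illustration of standard error and confidence intervals, the sketch below uses the textbook formula for a binomial variable, SE = sqrt(p(1 - p) / n), together with a normal approximation. The sample proportion (0.6), sample size (400), and function names are assumptions chosen for the example, not values from the slides.

```python
import math
from statistics import NormalDist

def standard_error(p, n):
    """Standard error of a sample proportion p based on n observations."""
    return math.sqrt(p * (1 - p) / n)

def confidence_interval(p, n, confidence_level=0.95):
    """Interval around p expected to contain the parameter at the given confidence level.

    Assumes the sampling distribution is approximately normal.
    """
    z = NormalDist().inv_cdf(0.5 + confidence_level / 2)  # about 1.96 for 95%
    margin = z * standard_error(p, n)
    return p - margin, p + margin

# Hypothetical survey: 60% of a sample of 400 give a particular answer.
se = standard_error(0.6, 400)
low, high = confidence_interval(0.6, 400, confidence_level=0.95)
print(f"standard error: {se:.4f}")
print(f"95% confidence interval: {low:.3f} to {high:.3f}")
```

Under these assumptions the standard error is about 0.0245, so we would be 95% confident that the population parameter lies between roughly 55% and 65%. A larger sample shrinks the standard error and therefore narrows the interval.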
Sampling Designs 1
Simple Random Sampling: each element in a sampling frame is assigned a number; random numbers are then generated to determine which elements are included in the sample
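The procedure just described can be sketched in a few lines. This is a minimal illustration, assuming a hypothetical enrollment list of 500 students as the sampling frame; the names and sizes are invented for the example.

```python
import random

def simple_random_sample(frame, sample_size, seed=None):
    """Number every element in the frame, then pick numbers at random."""
    rng = random.Random(seed)
    numbered = dict(enumerate(frame, start=1))  # element number -> element
    chosen_numbers = rng.sample(range(1, len(frame) + 1), sample_size)
    return [numbered[number] for number in chosen_numbers]

# Hypothetical sampling frame: an enrollment list of 500 students.
frame = [f"student_{i:04d}" for i in range(1, 501)]
sample = simple_random_sample(frame, sample_size=50, seed=42)
print(sample[:5])
```

Because every number is equally likely to be drawn, every element of the frame has the same chance of selection, which makes this an EPSEM design.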
Systematic Sampling: every kth element in the total list is chosen (systematically) for inclusion in the sample
Example: from a list of 10,000 elements, a sample of 1,000 is drawn by selecting every tenth element
Choose the first element at random from among the first ten
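The 10,000-element example above can be sketched as follows. This is an illustrative sketch only: the function name is hypothetical, and it assumes the desired sample size is no larger than the list.

```python
import random

def systematic_sample(frame, sample_size, seed=None):
    """Select every kth element after a random start within the first interval."""
    interval = len(frame) // sample_size  # sampling interval k (10 in the example)
    rng = random.Random(seed)
    start = rng.randrange(interval)       # random start: index 0 .. k-1
    return frame[start::interval][:sample_size]

# A list of 10,000 numbered elements; we want a sample of 1,000.
frame = list(range(1, 10001))
sample = systematic_sample(frame, sample_size=1000, seed=1)
print(len(sample), sample[:3])
```

One caution with this design: if the list is ordered in a cycle that coincides with the sampling interval, the sample can be badly biased, so it is worth checking how the list is arranged before sampling.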
Sampling Designs 2
Stratification: a modification of simple random and systematic sampling that ensures appropriate numbers of elements are drawn from homogeneous subsets of the population
Disproportionate stratified sampling: a way of obtaining a sufficient number of rare cases by selecting a disproportionate number
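The difference between proportionate and disproportionate stratified sampling can be sketched as below. The frame (10,000 residents, 1,000 of them in an "inner_city" stratum), the stratum labels, and the sample sizes are all hypothetical values chosen for illustration.

```python
import random
from collections import defaultdict

def group_by_stratum(frame, stratum_of):
    """Split the sampling frame into homogeneous subsets (strata)."""
    strata = defaultdict(list)
    for element in frame:
        strata[stratum_of(element)].append(element)
    return strata

def stratified_sample(frame, stratum_of, sizes, seed=None):
    """Draw sizes[stratum] elements at random from each stratum."""
    rng = random.Random(seed)
    strata = group_by_stratum(frame, stratum_of)
    return [element
            for stratum, size in sizes.items()
            for element in rng.sample(strata[stratum], size)]

def area(resident):
    return resident[1]

def count(sample, stratum):
    return sum(1 for resident in sample if area(resident) == stratum)

# Hypothetical frame: 10,000 residents, 10% of them in the inner city.
frame = [(i, "inner_city" if i < 1000 else "other") for i in range(10000)]

# Proportionate: each stratum contributes in proportion to its share (10% / 90%).
proportionate = stratified_sample(frame, area, {"inner_city": 50, "other": 450}, seed=3)
# Disproportionate: the rare stratum is deliberately oversampled.
disproportionate = stratified_sample(frame, area, {"inner_city": 250, "other": 250}, seed=3)

print(count(proportionate, "inner_city"), count(disproportionate, "inner_city"))  # prints: 50 250
```

The disproportionate design yields enough inner-city cases to analyze on their own, but because that stratum is over-represented, estimates for the whole population would need to be weighted back to the true proportions.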
Cluster sampling: compile a list of natural groups (clusters), sample the clusters, then subsample the elements within each selected cluster
This process can go on for many cluster levels
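A two-stage version of cluster sampling can be sketched as follows. The clusters (20 districts of 100 housing units each), the number of clusters drawn, and the function name are hypothetical; a real multistage design would repeat the same step at each additional level.

```python
import random

def two_stage_cluster_sample(clusters, num_clusters, units_per_cluster, seed=None):
    """Stage 1: sample clusters. Stage 2: subsample units within each chosen cluster."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), num_clusters)  # sample cluster names
    sample = []
    for name in chosen:
        sample.extend(rng.sample(clusters[name], units_per_cluster))
    return chosen, sample

# Hypothetical clusters: 20 districts, each listing 100 housing units.
clusters = {
    f"district_{d:02d}": [f"d{d:02d}_house_{h:03d}" for h in range(100)]
    for d in range(20)
}
chosen, sample = two_stage_cluster_sample(clusters, num_clusters=5,
                                          units_per_cluster=4, seed=7)
print(chosen)
print(len(sample), "housing units selected")
```

Only the selected clusters need a full list of their units, which is why this design is practical when no single list of the whole population exists.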
National Crime Victimization Survey
Seeks to represent the nationwide population of persons aged 12+ living in households (≈ 42K units, 74K occupants in 2004)
Primary sampling units (PSUs) are defined first
The largest PSUs are automatically included; smaller ones are stratified by size, population density, reported crimes, and other variables into about 150 strata
Census enumeration districts (CEDs) are then selected
Clusters of 4 housing units are selected from each CED
British Crime Survey
First stage: 289 Parliamentary constituencies, stratified by geographic area and population density
Two sample points were selected, which were divided into four segments with equal numbers of delivery addresses
One of these four segments was selected at random, then disproportionate sampling was conducted to obtain a greater number of inner-city respondents
Household residents aged 16+ were listed, and one was randomly selected by interviewers (n = 37,213 in 2004)
Nonprobability Sampling
Nonprobability sampling: the likelihood that any element will be included in the sample is unknown
Purposive sampling: selecting a sample on the basis of your judgment and the purpose of the study
Quota sampling: units are selected so that the total sample has the same distribution of characteristics as is assumed to exist in the population being studied
Snowball sampling: you interview some individuals and then ask them to identify others who will participate in the study, who in turn identify others, and so on
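The chain-referral logic of snowball sampling can be sketched as a simple traversal of who refers whom. The referral network, the starting participant, and the function name below are entirely hypothetical; in a real study the referrals are only discovered as the interviews proceed.

```python
from collections import deque

def snowball_sample(referrals, seeds, max_size):
    """Interview the seeds, then the people they name, and so on, up to max_size."""
    sample = []
    seen = set(seeds)
    queue = deque(seeds)
    while queue and len(sample) < max_size:
        person = queue.popleft()
        sample.append(person)                    # "interview" this person
        for referred in referrals.get(person, []):
            if referred not in seen:             # each person joins at most once
                seen.add(referred)
                queue.append(referred)
    return sample

# Hypothetical referral network: each person names others they know.
referrals = {
    "A": ["B", "C"],
    "B": ["D"],
    "C": ["D", "E"],
    "D": [],
    "E": ["F"],
    "F": [],
}
sample = snowball_sample(referrals, seeds=["A"], max_size=5)
print(sample)  # prints: ['A', 'B', 'C', 'D', 'E']
```

Who ends up in the sample depends entirely on the starting participants and their contacts, so the chance of selection for any given person is unknown; this is what makes it a nonprobability design.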
Example: Snowball Sampling
To study cannabis users, Hammersley and Leon (2006) gathered a snowball sample of 176 university students who had used marijuana at least once. Extensive interviews were then conducted with the university students in the sample. Their results showed that there were two types of users: those who used cannabis on a regular basis and those who used cannabis on occasion. The results also showed that users experienced both positive and negative effects from using marijuana and the patterns of use were more similar to patterns of alcohol and tobacco use than to patterns of controlled substance use.
Hammersley, R., & Leon, V. (2006). Patterns of cannabis use and positive and negative experiences of use amongst university students. Addiction Research and Theory, 14(2), 189-205.