Selection bias
• Selection bias
– Bias resulting from systematic error in
ascertainment or participation of study subjects
– People having different probabilities of being
included in the study sample based on exposure
and outcome
– Generally, when both exposure and outcome affect
selection/participation, the exposure-outcome
association in the sample is no longer
representative of that association in the source
population
• Selection bias in case-control studies
– Cases are generally more likely to participate in
case-control studies than controls – the fact that
they have a disease increases motivation to
participate
– The bias arises when exposure status also affects
probability of participation in a case-control study
– If among either the case or the control subjects,
exposure makes subjects more or less likely to
participate in the study, then selection bias is
introduced
• Illustration of case-control selection bias with 2x2 table
• Selection bias in cohort studies
– Selection bias related to differential participation by
exposure and disease status is less likely to occur in
cohort studies because subjects are selected and
enrolled prior to disease onset
– If exposure status does affect selection into the
study it is still unlikely that (future) outcome will
affect selection
– Example: a concurrent occupational cohort study
compares textile factory floor workers with workers
in less polluted parts of the factory, and the
floor workers are more likely to participate
– Selection bias is still an issue in cohort studies
– Differential loss to follow-up is a substantial concern
in cohort studies
• It is sometimes discussed as information bias,
sometimes as selection bias
• If the outcome, or some factor associated with the
outcome, affects the probability of being lost, and
exposure affects the probability of being lost as well,
then selection bias will be present
– Differential loss to follow-up has the same practical
effect as differential selection into the study—it
distorts the association of interest in the sample,
relative to the source population
– This makes the exposure-outcome association in
the study systematically different from the
association in the source population—biased
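The effect of differential loss can be sketched numerically. The cohort counts and retention fractions below are hypothetical, chosen only for illustration; they are not taken from the slides. The sketch compares the risk ratio in the full cohort with the risk ratio among subjects who remain under follow-up.

```python
# Hypothetical cohort (illustrative numbers, not from the slides).
full_cohort = {
    "exposed_cases": 100, "exposed_noncases": 900,
    "unexposed_cases": 50, "unexposed_noncases": 950,
}


def risk_ratio(cells):
    """Risk in the exposed divided by risk in the unexposed."""
    risk_exposed = cells["exposed_cases"] / (
        cells["exposed_cases"] + cells["exposed_noncases"])
    risk_unexposed = cells["unexposed_cases"] / (
        cells["unexposed_cases"] + cells["unexposed_noncases"])
    return risk_exposed / risk_unexposed


def apply_retention(cells, retained):
    """Keep the given fraction of each cell; the rest are lost to follow-up."""
    return {name: count * retained[name] for name, count in cells.items()}


# Assumed scenario 1: loss depends on exposure only (half of the exposed lost).
loss_by_exposure = {"exposed_cases": 0.5, "exposed_noncases": 0.5,
                    "unexposed_cases": 1.0, "unexposed_noncases": 1.0}
# Assumed scenario 2: loss depends on exposure AND outcome
# (only exposed subjects who develop the outcome are lost).
loss_by_both = {"exposed_cases": 0.5, "exposed_noncases": 1.0,
                "unexposed_cases": 1.0, "unexposed_noncases": 1.0}

rr_full = risk_ratio(full_cohort)
rr_exposure_only = risk_ratio(apply_retention(full_cohort, loss_by_exposure))
rr_both = risk_ratio(apply_retention(full_cohort, loss_by_both))

print(f"full cohort:                  RR = {rr_full:.2f}")
print(f"loss by exposure only:        RR = {rr_exposure_only:.2f}")
print(f"loss by exposure and outcome: RR = {rr_both:.2f}")
```

Under these assumed numbers the risk ratio is unchanged when loss depends on exposure alone, and it is pulled toward the null when loss depends on both exposure and outcome.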
• Illustration of cohort loss to follow-up with 2x2 table
• Some specific selection biases
– Berkson’s bias: occurs in case-control
studies of hospitalized patients
– If both exposure and outcome affect
hospitalization (and therefore selection into
the study) a statistical association between
exposure and outcome will be induced in the
hospitalized population
– Particular concern when the exposure and
the outcome are both health conditions that
can cause hospitalization (e.g., studies of
the effects of hypertension on skin cancer)
• Illustration of Berkson’s bias:
– Joint distribution of E and D in the general
population
– RR?
– RR = (25/50)/(25/50) = 1
• Hospitalization
– Everyone exposed is hospitalized
• Example: HTN as exposure
– Everyone with outcome is hospitalized
• Example: skin cancer as outcome
– Fewer of those with neither exposure nor
outcome are hospitalized
• Illustration of Berkson’s bias:
– Joint distribution of E and D in the hospitalized
population
– RR?
– RR = (25/50)/(25/30) = 0.6
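The two RR calculations above can be reproduced with a short sketch. The slide tables did not survive extraction, so the cell counts (25 per cell) and the hospitalization fraction for people with neither exposure nor outcome (0.2, i.e. 5 of 25) are inferred from the RR formulas and should be read as assumptions.

```python
# Cell counts inferred from the slide formulas (assumption: 25 per cell).
population = {
    ("exposed", "diseased"): 25, ("exposed", "healthy"): 25,
    ("unexposed", "diseased"): 25, ("unexposed", "healthy"): 25,
}

# Everyone exposed or diseased is hospitalized; the 0.2 for the
# unexposed/healthy cell is inferred from the 25/30 term (assumption).
hospitalized_fraction = {
    ("exposed", "diseased"): 1.0, ("exposed", "healthy"): 1.0,
    ("unexposed", "diseased"): 1.0, ("unexposed", "healthy"): 0.2,
}


def risk_ratio(table):
    """Proportion diseased among the exposed over that among the unexposed."""
    risk_exposed = table[("exposed", "diseased")] / (
        table[("exposed", "diseased")] + table[("exposed", "healthy")])
    risk_unexposed = table[("unexposed", "diseased")] / (
        table[("unexposed", "diseased")] + table[("unexposed", "healthy")])
    return risk_exposed / risk_unexposed


# Expected counts in the hospitalized population.
hospitalized = {cell: count * hospitalized_fraction[cell]
                for cell, count in population.items()}

rr_population = risk_ratio(population)
rr_hospitalized = risk_ratio(hospitalized)

print(f"general population RR = {rr_population:.2f}")
print(f"hospitalized RR       = {rr_hospitalized:.2f}")
```

Exposure and outcome are independent in the general population, yet restricting to hospitalized patients induces an inverse association between them.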
– Healthy worker effect: occurs in occupational cohort
studies or other studies comparing working to non-
working or general population groups
– Workers have lower rates of disease than
comparison cohorts from the general population
• Self-selection of hardier workers into more taxing jobs
• Attrition of sick workers from the workforce
– Comparison of occupational cohort of workers to
non-workers, general population groups, or workers
in less taxing jobs may cause the health effects of
occupational exposures to be under-estimated
– Self-selection bias: occurs in all studies in which
individuals decide whether to participate
– When people who choose to enroll in studies are
different from those who do not, selection bias may
be introduced
– This becomes more problematic as response
rate/participation rate decreases
• Numerical examples of selection bias
• Hypothetical population with true exposure and
disease status
(Szklo, Table 4-1)
• If everyone were included in a study, the sampling
fraction would be 1.0 for all cells
• OR?
• OR = (500/1800)/(500/7200) = 4
• OR = odds_e / odds_u
[Diagram: exposed and unexposed groups, with cells a, b, c, d and non-diseased totals β, δ]
• If b/β = d/δ, then the exposure
distribution in the controls represents
the exposure distribution in the total
non-diseased population
[Diagram: exposed and unexposed groups, with cells a, b, c, d and person-time totals PTe, PTu]
• If b/PTe = d/PTu, then the exposure
distribution in the controls represents
the exposure distribution in the total
cohort person-time
[Diagram: exposed and unexposed groups, with cells a, b, c, d and baseline totals Ne, Nu]
• If b/Ne = d/Nu, then the exposure
distribution in the controls represents
the exposure distribution in the total
population at baseline
• Sampling fraction 1.0 for cases, 0.1 for controls
(example modified from Szklo)
• OR?
• OR = (500/180)/(500/720) = 4
• OR = pseudo-odds_e / pseudo-odds_u
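A minimal sketch of this calculation follows. The population counts (500, 1800, 500, 7200) and the sampling fractions (1.0 for cases, 0.1 for controls) come from the formulas on the slides; the function and key names are hypothetical, and the sample holds expected counts rather than a random draw.

```python
# Source population, as used in the slide formulas.
population = {
    "exposed_cases": 500, "exposed_noncases": 1800,
    "unexposed_cases": 500, "unexposed_noncases": 7200,
}

# Sampling fractions: all cases, 10% of non-cases regardless of exposure.
sampling_fraction = {
    "exposed_cases": 1.0, "exposed_noncases": 0.1,
    "unexposed_cases": 1.0, "unexposed_noncases": 0.1,
}


def odds_ratio(cells):
    """Odds of disease in the exposed over odds of disease in the unexposed."""
    odds_exposed = cells["exposed_cases"] / cells["exposed_noncases"]
    odds_unexposed = cells["unexposed_cases"] / cells["unexposed_noncases"]
    return odds_exposed / odds_unexposed


def expected_sample(cells, fractions):
    """Expected cell counts after applying a sampling fraction to each cell."""
    return {name: count * fractions[name] for name, count in cells.items()}


sample = expected_sample(population, sampling_fraction)
or_population = odds_ratio(population)
or_sample = odds_ratio(sample)

print(f"population OR = {or_population:.2f}")
print(f"sample OR     = {or_sample:.2f}")
```

Because controls are sampled at the same fraction in the exposed and the unexposed, the sample OR matches the population OR.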
• Selection bias will be present in the sample if
exposure distribution in the cases or controls is
not the same as it is in the diseased and non-
diseased in the study base
• This occurs if the sampling fraction is different
depending on exposure for cases and/or
controls
• Sampling fraction different by exposure for
controls
• OR?
• OR = (500/324)/(500/576) = 1.8
• Sampling fraction different by exposure for
cases
• OR?
• OR = (500/180)/(400/720) = 5.0
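The two biased scenarios can be checked with a short sketch. The sampling fractions are not stated on the slides; the values below (0.18 and 0.08 for exposed and unexposed controls, 0.8 for unexposed cases) are inferred from the counts in the OR formulas and should be treated as assumptions. Names are hypothetical.

```python
# Source population, as used in the slide formulas.
population = {
    "exposed_cases": 500, "exposed_noncases": 1800,
    "unexposed_cases": 500, "unexposed_noncases": 7200,
}


def odds_ratio(cells):
    """Odds of disease in the exposed over odds of disease in the unexposed."""
    return ((cells["exposed_cases"] / cells["exposed_noncases"])
            / (cells["unexposed_cases"] / cells["unexposed_noncases"]))


def expected_sample(cells, fractions):
    """Expected cell counts after applying a sampling fraction to each cell."""
    return {name: count * fractions[name] for name, count in cells.items()}


# Controls sampled differently by exposure (fractions inferred: 324/1800, 576/7200).
differential_controls = {
    "exposed_cases": 1.0, "exposed_noncases": 0.18,
    "unexposed_cases": 1.0, "unexposed_noncases": 0.08,
}
# Cases sampled differently by exposure (fraction inferred: 400/500).
differential_cases = {
    "exposed_cases": 1.0, "exposed_noncases": 0.1,
    "unexposed_cases": 0.8, "unexposed_noncases": 0.1,
}

or_true = odds_ratio(population)
or_controls = odds_ratio(expected_sample(population, differential_controls))
or_cases = odds_ratio(expected_sample(population, differential_cases))

print(f"true OR                        = {or_true:.1f}")
print(f"differential control selection = {or_controls:.1f}")
print(f"differential case selection    = {or_cases:.1f}")
```

Over-sampling exposed controls pulls the OR toward the null, while under-sampling unexposed cases pushes it away from the null.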
• Different selection biases can, in theory, cancel
each other out; in practice it is impossible to know
whether they do
• The biases cancel if the bias in the selection of
cases has the same magnitude as the bias in the
selection of controls, that is, if the ratio of exposed
to unexposed sampling fractions is the same for both
• Sampling fraction differential by exposure for
cases and controls – same magnitude
• OR?
• OR = (500/180)/(400/576) = 4.0
• Selection fractions cancel in OR calculation
• OR = [(a*1)/(β*0.1)]/[(c*0.8)/(δ*0.08)]
• OR = [10(a/β)]/[10(c/δ)]
• OR = (a/β)/(c/δ)
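The cancellation above can be written for arbitrary sampling fractions. The symbols f_a, f_c, f_β, f_δ are introduced here only for illustration (they do not appear on the slides) and denote the sampling fractions for exposed cases, unexposed cases, exposed non-cases, and unexposed non-cases.

```latex
\mathrm{OR}_{\text{sample}}
  = \frac{(f_a\, a)/(f_\beta\, \beta)}{(f_c\, c)/(f_\delta\, \delta)}
  = \frac{a/\beta}{c/\delta} \times \frac{f_a\, f_\delta}{f_c\, f_\beta}
```

With f_a = 1, f_c = 0.8, f_β = 0.1, and f_δ = 0.08, the second factor is (1 × 0.08)/(0.8 × 0.1) = 1, so the sample OR equals the population OR.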
• If selection/participation/loss is differential only
with respect to disease
– Biased:
• Measures of disease
• AR, CIR, IDR
– Unbiased:
• OR
• If selection/participation/loss is differential with
respect to exposure and disease, all measures
of disease and association will be biased
– Excepting the rare situations in which the biases cancel
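The first point can be illustrated with the population used in the numerical examples (500, 1800, 500, 7200) and selection that differs only by disease status (1.0 for cases, 0.1 for non-cases). This is a sketch with hypothetical names; the risk ratio is used here as a stand-in for the ratio measures that the slide lists as biased.

```python
# Source population, as used in the slide formulas.
population = {
    "exposed_cases": 500, "exposed_noncases": 1800,
    "unexposed_cases": 500, "unexposed_noncases": 7200,
}
# Selection differs by disease status only, not by exposure.
fraction = {"exposed_cases": 1.0, "exposed_noncases": 0.1,
            "unexposed_cases": 1.0, "unexposed_noncases": 0.1}

# Expected counts in the selected sample.
sample = {name: count * fraction[name] for name, count in population.items()}


def odds_ratio(cells):
    """Odds of disease in the exposed over odds of disease in the unexposed."""
    return ((cells["exposed_cases"] / cells["exposed_noncases"])
            / (cells["unexposed_cases"] / cells["unexposed_noncases"]))


def risk_ratio(cells):
    """Proportion diseased among the exposed over that among the unexposed."""
    risk_exposed = cells["exposed_cases"] / (
        cells["exposed_cases"] + cells["exposed_noncases"])
    risk_unexposed = cells["unexposed_cases"] / (
        cells["unexposed_cases"] + cells["unexposed_noncases"])
    return risk_exposed / risk_unexposed


or_population, or_sample = odds_ratio(population), odds_ratio(sample)
rr_population, rr_sample = risk_ratio(population), risk_ratio(sample)

print(f"OR: population {or_population:.2f}, sample {or_sample:.2f}")
print(f"RR: population {rr_population:.2f}, sample {rr_sample:.2f}")
```

The OR is the same in the population and the sample, while the risk ratio computed from the sample differs from the population value.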
• Summary of selection bias types
– Self selection bias
– Case-control studies
• Berkson’s bias
• People with particular exposure/disease
combinations more or less likely to participate
– Cohort studies
• Loss to follow-up
• Healthy worker effect
• People with particular exposure/(future)disease
combinations more or less likely to participate or
to be lost to follow-up
• A case-control study of postmenopausal
hormone therapy (HRT) and endometrial cancer
• Cases and controls are recruited from the same
medical practice
• Women with endometrial cancer are more likely
to have symptoms that lead them to visit a doctor
• Women taking postmenopausal hormone
therapy are more likely to have symptoms that
lead them to visit a doctor
• Women are enrolled in the study from the
medical practice
• Selection factor is visiting the medical practice
where study recruitment is being carried out
• Alters any true association between HRT and
endometrial cancer
[Diagram: HRT → visit medical practice, recruited for study ← endometrial cancer]

3.5.2 selection bias
