This document provides an overview of bias, confounding, and interaction in epidemiological studies. It defines bias as systematic error that results in a mistaken estimate of an exposure's effect. Two types of bias discussed are selection bias, from non-random selection of study groups, and information bias, from incorrect measurement or classification of data. Confounding occurs when a third factor is associated with both the exposure and outcome independently. The document outlines methods to address confounding including stratification and Cochran-Mantel-Haenszel calculations to provide adjusted estimates of risk or odds ratios.
2. Disclaimer
This presentation is based on Chapter 15 of the book ‘Epidemiology, sixth edition’ by
Leon Gordis. Information extracted from any other sources is acknowledged in the
respective slides accordingly.
3. Learning objectives
● To review some possible biases in epidemiologic studies, including selection
bias and information bias.
● To define confounding and to discuss possible ways to deal with
confounding in the design and/or analysis of an observational
(nonrandomized) study.
● To define interaction and to present a framework for detecting whether (and
to what extent) two factors interact to influence the risk of a disease.
4. Bias
Bias has been defined as “any systematic error in the design, conduct or
analysis of a study that results in a mistaken estimate of an exposure’s effect on
the risk of disease.”
Types of Bias:
● Selection Bias
● Information Bias
5. Selection Bias
● Selection bias arises when the way in which cases and controls, or exposed and nonexposed
individuals, were selected produces an apparent association even if, in reality, exposure
and disease are not associated.
● It is an error in selecting a study group or groups within the study, and it can have a major
impact on the internal validity of the study and the legitimacy of its conclusions.
● One form of selection bias results from nonresponse of potential study subjects.
● If no information is obtained from the nonresponders, nonresponse may introduce a
bias that may be difficult to assess.
7. Surveillance bias
● May occur when subjects in the exposure group are more likely to have the study
outcome detected because they receive increased surveillance, screening, or
testing as a result of having some other medical condition for which they are
being followed.
● For example, obese patients are more likely to undergo medical examinations,
blood tests, and imaging studies than non-obese people. If obese subjects were
being compared to non-obese subjects for risk of certain cancers, early cancers
would be more likely to be found in the obese group, causing an overestimate of
the true association.
8. Neyman bias
Bias that arises when fatal cases, mild cases, or cases of short duration are missed.
There are two ways in which this bias can affect the results of a study:
1. If extremely sick individuals are excluded from the study because they have died,
then the disease will appear less severe.
2. If extremely healthy individuals are excluded from the study because they have
recovered and been sent home, then the disease will appear more severe.
1. Zach. Neyman Bias: Definition & Examples [Internet]. Statology. 2020 [cited 2023 Jun 17]. Available from:
https://www.statology.org/neyman-bias/
9. Referral bias/Volunteer bias
● Volunteer bias occurs because people who volunteer to participate in a study
are often different from non-volunteers.
● This bias usually favours the treatment group, as volunteers tend to be
more motivated and concerned about their health.
● Volunteers tend to be more educated, to come from a higher social class, and
to be more approval-motivated.
1. Catalogue of Bias Collaboration, Brassey J, Mahtani KR, Spencer EA, Heneghan C. Volunteer bias. Catalogue Of
Bias 2017: http://www.catalogofbias.org/biases/volunteer-bias
10. Attrition bias
● Attrition bias is a type of selection bias due to systematic differences between
study groups in the number and the way participants are lost from a study
● Attrition can introduce systematic bias if the characteristics of study participants
who are lost to follow-up differ between the randomized treatment groups or
observational study cohorts
● Over-recruitment can help prevent important attrition bias.
1. Nunan D, Aronson JK, Bankhead C. Catalogue of bias: attrition bias. 2018 Jan 24 [cited 2023 Jun 17];23(1):21–2. Available from:
https://pubmed.ncbi.nlm.nih.gov/29367321/
2. Vetter T, Mascha EJ. Bias, Confounding, and Interaction. 2017 Sep 1 [cited 2023 Jun 17];125(3):1042–8. Available from:
https://pubmed.ncbi.nlm.nih.gov/28817531/
11. Berksonian bias
● Also known as admission rate bias.
● Berksonian bias results from the greater probability of hospital admission for people
with two or more diseases than for people with one disease.
● It mostly occurs in hospital-based case-control studies.
● To prevent Berksonian bias, collect a simple random sample from a population.
1. Westreich D. Berkson’s Bias, Selection Bias, and Missing Data. 2012 Jan 1 [cited 2023 Jun 17];23(1):159–64. Available from:
https://journals.lww.com/epidem/Fulltext/2012/01000/Berkson_s_Bias,_Selection_Bias,_and_Missing_Data.24.aspx
2. Perrine Juillion. What is Berksonian bias? - Studybuff.com [Internet]. Studybuff.com. 2020 [cited 2023 Jun 17]. Available from:
https://studybuff.com/what-is-berksonian-bias/
12. Information Bias
● Information bias is a type of error that occurs when key study variables are
incorrectly measured or classified.
● Information bias can affect the findings of observational or experimental
studies due to systematic differences in how data is obtained from
various study groups.
● Affects the validity of health research
14. Response bias
● Response bias is a general term describing situations where people do not
answer questions truthfully for some reason.
● For example, if a respondent was asked how often they consume alcohol and
the options were: ‘frequently, sometimes and infrequently’, they’re more
likely to choose sometimes or infrequently so they’re perceived positively.
1. Kassiani Nikolopoulou. What Is Response Bias? | Definition & Examples [Internet]. Scribbr. 2022 [cited 2023 Jun 17]. Available from:
https://www.scribbr.com/research-bias/response-bias/
15. Recall bias
● Recall bias refers to a systematic difference in the ability of participant groups
to accurately recall information.
● It is a type of information bias common in case-control studies, where the cases (or
their families) are more likely to recall a prior exposure than the controls.
● For example, cases with multiple episodes of major depression as an adult may
be more likely to recall and report childhood abuse than controls with no history
of mental health problems.
1. Jacob ME, Ganguli M. Epidemiology for the clinical neurologist. 2016 Jan 1 [cited 2023 Jun 17];3–16. Available from:
https://www.sciencedirect.com/topics/neuroscience/recall-bias
16. Interviewer bias
If an interviewer has a preconceived notion about the hypothesis being tested, he or she
might consciously or unconsciously interview case subjects differently than control
subjects.
Interviewer bias is a form of information bias due to:
1. Lack of equal probing for exposure history between cases and controls (exposure
suspicion bias); or
2. Lack of equal measurement of health outcome status between exposed and unexposed
(diagnostic suspicion bias)
1. Sources of Systematic Error or Bias: Information Bias [Internet]. [cited 2023 Jun 17]. Available from: https://sph.unc.edu/wp-content/uploads/sites/112/2015/07/nciph_ERIC14.pdf#:~:text=Interviewer%20bias%20is%20a%20form%20of%20information%20bias
2. Information Bias [Internet]. Bu.edu. 2021 [cited 2023 Jun 17]. Available from: https://sphweb.bumc.bu.edu/otlt/MPH-Modules/PH717-QuantCore/PH717-Module10-Bias/PH717-Module10-Bias6.html
17. Solutions
1. "Blind" the interviewers if possible, i.e., don't tell them the research hypothesis or keep
them from knowing whether subjects are cases or controls (This is not always possible).
2. Use standardized questionnaires with close-ended, easy to understand questions and
response options.
3. For socially sensitive questions, e.g., alcohol & drug use or sexual behaviors, use a self-
administered questionnaire instead of an interviewer.
4. Train interviewers to adhere strictly to the question and answer format, with the same
degree of questioning for both cases and controls.
5. Verify the accuracy of data by examining pre-existing records (e.g., medical records or
employment records) or assessing biomarkers.
1. Information Bias [Internet]. Bu.edu. 2021 [cited 2023 Jun 17]. Available from: https://sphweb.bumc.bu.edu/otlt/MPH-Modules/PH717-QuantCore/PH717-Module10-Bias/PH717-Module10-Bias6.html
18. Hawthorne bias
The Hawthorne effect refers to people’s tendency to behave differently when they become aware
that they are being observed. As a result, what is observed may not represent “normal” behavior,
threatening the validity of research.
Solutions:
● Invest in interpersonal relationships at the study site. Sustaining contact with participants over
time reduces participant reactivity and improves the quality of data collection.
● Give participants tasks unrelated to the purposes of the study. This can mask the research
objective from the participants. However, be sure to consider whether this is ethical to do.
1. Kassiani Nikolopoulou. What Is the Hawthorne Effect? | Definition & Examples [Internet]. Scribbr. 2022 [cited 2023 Jun 17].
Available from: https://www.scribbr.com/research-bias/hawthorne-effect/#:~:text=However%2C%20there%20are%20a%20few%20things%20you%20can,tasks%20unrelated%20to%20the%20purposes%20of%20the%20study.
19. Misclassification bias
Misclassification (or classification error) happens when a participant is placed into the wrong
population subgroup or category because of some kind of observational or measurement error.
People might be placed into the wrong groups because of:
● Incomplete medical records.
● Recording errors in records.
● Misinterpretation of records.
● Errors in records, like incorrect disease codes, or patients completing questionnaires
incorrectly.
20. Non-Differential Misclassification
● Non-differential misclassification means that the
percentage of errors is about equal in the two
groups being compared.
● The information is incorrect, but is the same across
groups.
● If there really is an association, non-differential
misclassification tends to make the groups appear
more similar than they really are, and it causes an
underestimate of the association, i.e., "bias toward
the null".
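The "bias toward the null" described above can be illustrated with a small calculation. This is only a sketch under stated assumptions: the true counts, the sensitivity and specificity values, and the helper names `odds_ratio` and `misclassify` are hypothetical and chosen for illustration, not taken from the source chapter.

```python
def odds_ratio(exp_cases, exp_controls, unexp_cases, unexp_controls):
    """Odds ratio (cross-product ratio) for a 2x2 case-control table."""
    return (exp_cases * unexp_controls) / (exp_controls * unexp_cases)


def misclassify(exposed, unexposed, sensitivity, specificity):
    """Expected counts classified as exposed / unexposed when exposure is
    measured with the given sensitivity and specificity."""
    seen_exposed = exposed * sensitivity + unexposed * (1 - specificity)
    seen_unexposed = exposed * (1 - sensitivity) + unexposed * specificity
    return seen_exposed, seen_unexposed


# Hypothetical true exposure counts (assumed for this sketch).
true_cases = (60, 40)      # exposed, unexposed
true_controls = (30, 70)   # exposed, unexposed
true_or = odds_ratio(true_cases[0], true_controls[0],
                     true_cases[1], true_controls[1])

# Non-differential: the SAME error rates apply to cases and controls.
sens, spec = 0.8, 0.9
obs_cases = misclassify(*true_cases, sens, spec)
obs_controls = misclassify(*true_controls, sens, spec)
observed_or = odds_ratio(obs_cases[0], obs_controls[0],
                         obs_cases[1], obs_controls[1])

print(f"true OR = {true_or:.2f}, observed OR = {observed_or:.2f}")
# The observed OR (about 2.41) lies between 1 and the true OR (3.5).
```

With equal error rates in both groups, the observed odds ratio moves toward 1 but does not cross it in this example.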
21. Differential Misclassification
● Differential misclassification occurs when data is more accurate in one of the
comparison groups.
● Depending on the circumstances, differential misclassification can cause either an
under-estimate or an over-estimate of the association.
For example, emphysema is diagnosed more frequently in smokers than in non-smokers.
However, smokers may visit the doctor more often for other conditions (e.g., bronchitis)
than non-smokers, which means that one reason smokers could be diagnosed with emphysema
more often is simply that they go to the doctor more often, not that they actually
have higher odds of getting the disease.
1. Information Bias [Internet]. Bu.edu. 2021 [cited 2023 Jun 17]. Available from: https://sphweb.bumc.bu.edu/otlt/MPH-Modules/PH717-QuantCore/PH717-Module10-Bias/PH717-Module10-Bias6.html
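The claim that differential misclassification can push the estimate in either direction can be sketched numerically. The example below assumes a recall-bias-like setting in which truly exposed subjects are identified with different sensitivity in cases and controls; all counts, sensitivities, and function names are hypothetical and serve only as an illustration.

```python
def odds_ratio(exp_cases, exp_controls, unexp_cases, unexp_controls):
    """Odds ratio (cross-product ratio) for a 2x2 case-control table."""
    return (exp_cases * unexp_controls) / (exp_controls * unexp_cases)


def observed_or(case_sensitivity, control_sensitivity):
    """Observed OR when exposure is truly unrelated to disease (true OR = 1)
    but exposed subjects are detected with a different sensitivity in each
    group. Specificity is assumed perfect to keep the sketch short."""
    exposed, unexposed = 40, 60  # same true split in cases and controls

    case_exp = exposed * case_sensitivity
    case_unexp = exposed * (1 - case_sensitivity) + unexposed
    ctrl_exp = exposed * control_sensitivity
    ctrl_unexp = exposed * (1 - control_sensitivity) + unexposed
    return odds_ratio(case_exp, ctrl_exp, case_unexp, ctrl_unexp)


# Cases report exposure more completely than controls: over-estimate.
over = observed_or(case_sensitivity=0.9, control_sensitivity=0.6)
# Controls report exposure more completely than cases: under-estimate.
under = observed_or(case_sensitivity=0.6, control_sensitivity=0.9)

print(f"over-estimate: {over:.2f}, under-estimate: {under:.2f}")
# about 1.78 and 0.56, although the true OR is 1.0 in both scenarios
```

Because the error rates differ between the groups, the direction of the bias depends on which group is measured more accurately.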
22. Types of bias
● Subject bias: error introduced by the study subject, e.g., recall bias, Hawthorne bias
● Investigator bias: error introduced by the investigator, e.g., interviewer bias,
selection bias
● Analyzer bias: error introduced by the analyzer, e.g., data fabrication
23. Methods to eliminate bias
1. Blinding: Preventing those involved in a study from knowing certain information about it. It is
a basic tool for preventing conscious and subconscious bias in a study.
2. Randomization: The act of randomly assigning subjects in a study to different treatment groups.
It helps to eliminate selection or investigator bias.
Type | Description | Minimizes
Single blinding | Study subjects are not aware of the treatment they are receiving | Subject bias
Double blinding | Study subjects as well as the investigator are not aware of the treatment that study subjects are receiving | Subject bias + investigator bias
Triple blinding | Study subjects, the investigator, and the analyzer are not aware of the treatment that study subjects are receiving | Subject bias + investigator bias + analyzer bias
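Randomization as described on this slide can be sketched as a simple shuffle-and-split. This is a minimal illustration of simple (unstratified, unblocked) randomization into two equal arms; the function name `randomize`, the arm labels, and the subject IDs are assumptions made for the example.

```python
import random


def randomize(subject_ids, seed=None):
    """Randomly assign subjects to two arms of (nearly) equal size.

    A seed can be passed so that the allocation is reproducible for auditing.
    """
    rng = random.Random(seed)
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"treatment": shuffled[:half], "control": shuffled[half:]}


# Hypothetical subject IDs 1..20.
groups = randomize(range(1, 21), seed=42)
print(groups["treatment"])
print(groups["control"])
```

Because the assignment depends only on the random shuffle, neither the investigator nor the subject can influence who ends up in which group.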
24. Confounder
● Any factor that is associated with both exposure and outcome and has an independent
effect in causation of the outcome.
● Unequally distributed between the study and control groups
● Is associated with both exposure and outcome
● Has an independent effect in causation of the outcome (and is thus a risk factor itself)
● In a study of whether factor A is a cause of disease B, we say that a third factor, factor X,
is a confounder if the following are true:
1. Factor X is a known risk factor for disease B.
2. Factor X is associated with factor A, but is not a result of factor A.
25. Confounder contd…
Smoking was a confounder, because although we were interested in a possible relationship
between coffee drinking (factor A) and pancreatic cancer (disease B), the following are
true of smoking (factor X):
1. Smoking is a known risk factor for pancreatic cancer.
2. Smoking is associated with coffee drinking, but is not a result of coffee drinking.
So if an association is observed between coffee drinking and cancer of the pancreas, it may
be (1) that coffee actually causes cancer of the pancreas, or (2) that the observed
association of coffee drinking and cancer of the pancreas may be a result of confounding
by cigarette smoking.
26. The first question to ask in addressing this issue is whether age is related to being a case or a
control.
Is the observed relationship confounded by age?
27. 80% of the controls are younger than 40 years of age, compared with only 50% of the cases.
The second question is whether age is related to whether or not a person has been exposed.
28. Is the association of exposure and disease causal, or could we be seeing an
association of exposure with disease only because there is an age difference between
cases and controls, and older age is also related to exposure?
29. Hypothetical Example of Confounding in an Unmatched Case-Control Study:
IV. Calculations of Odds Ratios after Stratifying by Age
Age < 40 Years
            | Case | Control | Total
Exposed     | 5    | 8       | 13
Not exposed | 45   | 72      | 117
Total       | 50   | 80      | 130
Age > 40 Years
            | Case | Control | Total
Exposed     | 25   | 10      | 35
Not exposed | 25   | 10      | 35
Total       | 50   | 20      | 70
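The odds ratios implied by the two age strata above can be worked out directly. The sketch below uses the counts from this slide; the crude table is obtained here by adding the two strata together, and the names `odds_ratio` and `strata` are illustrative.

```python
def odds_ratio(exp_cases, exp_controls, unexp_cases, unexp_controls):
    """Odds ratio (cross-product ratio) for a 2x2 case-control table."""
    return (exp_cases * unexp_controls) / (exp_controls * unexp_cases)


# (exposed cases, exposed controls, unexposed cases, unexposed controls)
strata = {
    "age < 40": (5, 8, 45, 72),
    "age > 40": (25, 10, 25, 10),
}

# Stratum-specific odds ratios.
stratum_ors = {name: odds_ratio(*cells) for name, cells in strata.items()}

# Crude table: pool the strata cell by cell, ignoring age.
crude_cells = tuple(sum(column) for column in zip(*strata.values()))
crude_or = odds_ratio(*crude_cells)

print(stratum_ors)          # both stratum-specific ORs equal 1.0
print(crude_cells)          # (30, 18, 70, 82)
print(round(crude_or, 2))   # 1.95
```

The crude odds ratio of about 1.95 disappears within each age stratum, which is the pattern expected when age confounds the exposure-disease association.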
30. Cochran-Mantel-Haenszel Equations
● The Cochran-Mantel-Haenszel method is a technique that generates an estimate of
an association between an exposure and an outcome after adjusting for or taking
into account confounding.
● The method is used with a dichotomous outcome variable and a dichotomous risk
factor.
● To explore and adjust for confounding, we can use a stratified analysis, with
one 2×2 table for each stratum (category) of the confounding variable.
● We can compute a weighted average of the estimates of the risk ratios or odds ratios
across the strata.
● The weighted average provides a measure of association that is adjusted for
confounding.
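As a sketch of the weighted average described above, the standard Mantel-Haenszel pooled odds ratio is the sum over strata of (a × d / n) divided by the sum over strata of (b × c / n), where a and b are exposed cases and controls, c and d are unexposed cases and controls, and n is the stratum total. The code applies it to the age-stratified counts from the hypothetical case-control example in this deck; the function name `mantel_haenszel_or` is an illustrative choice.

```python
def mantel_haenszel_or(strata):
    """Mantel-Haenszel pooled odds ratio.

    Each stratum is (a, b, c, d):
      a = exposed cases,   b = exposed controls,
      c = unexposed cases, d = unexposed controls.
    """
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        numerator += a * d / n
        denominator += b * c / n
    return numerator / denominator


# Counts from the age-stratified hypothetical example.
age_strata = [
    (5, 8, 45, 72),    # age < 40 years
    (25, 10, 25, 10),  # age > 40 years
]

adjusted_or = mantel_haenszel_or(age_strata)
print(round(adjusted_or, 2))  # 1.0
```

The age-adjusted odds ratio of 1.0 contrasts with the crude odds ratio of about 1.95 obtained when the two strata are pooled without adjustment.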
34. Controlling of confounder
1. Randomization: Controls for both known and unknown confounders
2. Restriction: Limiting the study to people who have particular
characteristics
3. Matching: Mostly useful in case-control studies
4. Statistical modeling: Used when many confounding variables exist
simultaneously
5. Stratification: Mostly used in large studies
35. Confounding: good or bad?
● Confounding is not an error in the study, but rather is a true phenomenon that
is identified in a study
● Bias is a result of an error in the way that the study has been carried out, but
confounding is a valid finding that describes the nature of the relationship
among several factors and the risk of disease.
● Failure to take confounding into account in interpreting the results of a study
is indeed an error in the conduct of the study and can bias the conclusions of
the study
36. Interaction Effect
● To this point, our discussion has generally assumed the presence of a single
causal factor in the etiology of a disease. Although this approach is useful for
discussion purposes, in real life we rarely deal with a single cause.
● Mostly, more than one factor is involved in disease etiology.
37. Thus, the question now is, “How do multiple factors interact in causing a disease?”
[Figure: screenshot from the book: Beaglehole R, Bonita R. Public health at the crossroads: achievements and prospects. 2004.]
38. What do we mean by interaction?
According to MacMahon, there is interaction “When the incidence rate of disease
in the presence of two or more risk factors differs from the incidence rate
expected to result from their individual effects.”
40. Types of Interaction: Multiplicative effect
Perhaps a second exposure does not add to the effect of the first exposure but instead
multiplies the effect of the first exposure.
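[Table residue from the slide: values 9, 15, 3, 21 and 6, 12, 0, 18, with two cells marked “?”; the original table layout is not recoverable.]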
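The difference between an exposure that adds to an effect and one that multiplies it can be made concrete with a short calculation. This is a sketch only: the three incidence rates passed in (3, 9, and 15 per some common population unit) are illustrative inputs assumed for the example, and the function name is hypothetical.

```python
def expected_joint_incidence(baseline, a_only, b_only):
    """Expected incidence when both factors are present.

    Additive model: the excess risks (attributable risks) of A and B add.
    Multiplicative model: the relative risks of A and B multiply.
    """
    additive = baseline + (a_only - baseline) + (b_only - baseline)
    multiplicative = baseline * (a_only / baseline) * (b_only / baseline)
    return additive, multiplicative


# Illustrative incidence rates: neither factor, factor A only, factor B only.
additive, multiplicative = expected_joint_incidence(3.0, 9.0, 15.0)

print(additive)        # 21.0 -> 3 + 6 + 12
print(multiplicative)  # 45.0 -> 3 x 3 x 5
```

Comparing the incidence actually observed in people with both exposures against these two expected values indicates which model, if either, describes the data.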
41. A real example of interaction: Eg. 1
Which model fits here?
- Additive
- Multiplicative
- Neither
Answer: This approximates a multiplicative model.
42. Eg. 2
We can see multiplicative interaction in uranium workers and additive interaction in
atomic bomb survivors.
43. Eg. 3
● The RR value of 15.50, or even 4.46, exceeds the value expected from multiplying
the risks from both factors, so interaction is present.
● An interaction greater than would be expected is also called synergism.
44. Implication of interaction: an example
● The finding of interaction or synergism may also have practical policy implications
involving issues such as who is responsible for a disease and who should pay
compensation to the victims.
● Asbestos manufacturers were sued over the association of asbestos with cancer, and in
the 1970s the courts made large awards against them.
● By 1998, legal actions against tobacco companies were increasing. Then,……
45. ● Then, some of the victims of asbestos exposure formed a coalition with
asbestos manufacturers and demanded that compensation be paid
by the tobacco companies.
● Those who objected to this demand said that these claims were freeing
the asbestos manufacturers from their obligation to compensate and
that the claimants were doing so only because it would be easier to receive
compensation from tobacco companies than from asbestos manufacturers.
● The basis for all this was the synergistic relationship of these
exposures.
47. Confounding
(We want to know if a 3rd variable, X, is a confounder or not.)
On crude analysis: A and B are associated.
On stratified analysis:
● If, among all the strata, we find that A and B are not associated: X is a confounder.
● If, among all the strata, we find that A and B are associated: X is not a confounder.
48. Effect modification
(We want to know if a 3rd variable, X, is an effect modifier or not.)
On crude analysis: A and B are associated.
On stratified analysis:
● If we find different results of association, i.e., Stratum 1: A and B not associated;
Stratum 2: A and B strongly associated: X is an effect modifier.
● If, among all the strata, we find that A and B are associated but the strength of
association is not different: X is not an effect modifier.
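The decision logic on the confounding and effect-modification slides can be sketched as a rough rule of thumb. This is not a formal statistical test: the function name, the use of a simple average of the stratum-specific odds ratios, and the 10% tolerance are all assumptions made for illustration, and the input odds ratios are hypothetical.

```python
def classify_third_variable(crude_or, stratum_ors, tolerance=0.10):
    """Heuristic reading of a stratified analysis for a third variable X.

    - If the stratum-specific ORs differ from one another by more than the
      tolerance, X is treated as an effect modifier.
    - Otherwise, if the crude OR differs from the (similar) stratum ORs by
      more than the tolerance, X is treated as a confounder.
    - Otherwise X is treated as neither.
    All odds ratios are assumed to be positive.
    """
    lowest, highest = min(stratum_ors), max(stratum_ors)
    if highest / lowest > 1 + tolerance:
        return "effect modifier"
    typical = sum(stratum_ors) / len(stratum_ors)
    if abs(crude_or / typical - 1) > tolerance:
        return "confounder"
    return "neither"


# Hypothetical inputs: crude OR, then one OR per stratum of X.
print(classify_third_variable(1.95, [1.0, 1.0]))  # confounder
print(classify_third_variable(2.0, [1.0, 4.0]))   # effect modifier
print(classify_third_variable(2.0, [2.0, 2.1]))   # neither
```

In practice the comparison would also account for sampling variability, which this sketch deliberately ignores.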
52. Some terms and their use in writing
● Confounding: a phenomenon
● Confounder: a confounding variable
● Interaction effect/Effect modification: a phenomenon
● Effect modifier: a third variable
53. There are in every place and epoch those who value the truth, who record the evidence faithfully. Future
generations are in their debt.
- Carl Sagan
Thank You