2. Major Types of Clinical Epidemiologic Research
Type of research question | Descriptive/Causal | Aim
Diagnostic research | Descriptive | Predict the probability of presence of target disease from clinical and non-clinical profile
Prognostic research | Descriptive | Predict the course of disease from clinical and non-clinical profile
Etiologic research | Causal | Causally explain occurrence of target disease from determinant
Intervention research | Causal & Descriptive | (1) Causally explain the course of disease as influenced by treatment; (2) Predict the course of disease given treatment (options) and clinical and non-clinical profile
3. Etiologic research
The research question:
• Is there a relation between a determinant (risk factor) and a disease outcome?
→ Research question for causal relation!
4. Etiologic research
Characteristics
• To demonstrate causality (cause-effect)
• Cause comes before effect
  - Exposure or determinant occurs before the disease outcome occurs
• Determinant-outcome relation is not explained by other factors
• Explanatory research
  - versus descriptive research
5. Hill's Criteria
• Temporal relationship, where the cause precedes the outcome
• Strong association (OR, RR)
• Dose-response relationship
• Biological plausibility
6. Etiologic research
What study design?
• Experimental
  - Exposure or determinant assigned by investigator
versus
• Observational
  - Exposure or determinant not assigned by investigator
This lecture: observational research
7. Etiologic research
What study design?
Design of two observational studies to
distinguish between cause and effect:
1. Cohort study
2. Case-control study
8. Cohort study
• Also called follow-up study
• Definition
  - Study in which persons, selected on the basis of their exposure or determinant and free of the disease outcome at the start of the study, are followed in time to assess the occurrence of the disease outcome.
9. Cohort study
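[Figure: cohort study design. A cohort without the disease outcome at the start of the study is divided into determinant + and determinant - and followed over time; at the end of follow-up each group is classified as disease + or disease -.]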
10. Framingham Heart Study
• 1948, Framingham, MA
• 5200 persons 30-62 years old
• Aim: identification of risk factors for cardiovascular diseases
• Remeasured every 2 years
Example of a research question:
Is hypertension a risk factor for MI?
12. Cohort study
determinant-outcome relation
                 MI +   MI -
hypertension +    a      b
hypertension -    c      d
a/(a+b) = probability of MI for hypertension + = incidence +
c/(c+d) = probability of MI for hypertension - = incidence -
relative risk = incidence + / incidence -
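The calculation above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original lecture: the function name `relative_risk` and the counts are hypothetical and are not Framingham data.

```python
def relative_risk(a, b, c, d):
    """Relative risk from a cohort 2x2 table.

    a: determinant +, disease +    b: determinant +, disease -
    c: determinant -, disease +    d: determinant -, disease -
    """
    incidence_pos = a / (a + b)  # probability of disease when determinant is present
    incidence_neg = c / (c + d)  # probability of disease when determinant is absent
    return incidence_pos, incidence_neg, incidence_pos / incidence_neg


# Hypothetical counts for illustration only (not Framingham data).
inc_pos, inc_neg, rr = relative_risk(a=30, b=970, c=10, d=990)
print(f"incidence + = {inc_pos:.3f}, incidence - = {inc_neg:.3f}, RR = {rr:.1f}")
```

A relative risk above 1 means the disease outcome occurs more often in the group with the determinant.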
14. Cohort study
How do you get a cohort?
• Geographical data (Framingham Heart Study)
• Birth cohort (British 1946 birth cohort)
• Occupational cohort (Whitehall study)
16. Cohort study
How do you follow the cohort?
How do you find the disease outcome?
• After a certain time interval, send out a questionnaire or invite for interview or medical examination
• Record disease outcomes via medical files or registries
18. Case-control study
• Also called patient-control study
• Definition
  - Study in which patients with the disease outcome and a control group without the disease outcome are selected, and in which it is determined how many people in both groups have been exposed to the determinant
21. Creutzfeldt-Jakob Disease
• Rapidly progressive form of dementia
• In the 1990s a new variant of Creutzfeldt-Jakob disease was discovered in Europe after an epidemic of mad cow disease
• Caused by eating beef?
What research question?
Why case control?
25. Case-control study
How do you find patients?
• GP; hospital; cancer registry
How to select a control group?
• GP; hospital; general population
→ Patients and controls have to come from the same "source" population.
26. Selection of Cases
• Ideally, investigator identifies & enrolls all incident cases in a defined population in a specified time period
• Select cases from registries or hospitals, clinics
• When all incident cases in a population are included, the study is representative; otherwise there is potential for bias (e.g. referral bias)
• Use of prevalent vs incident cases
27. Essence case-control studies
1. Detection of cases
2. Sampling of controls
3. Assess exposure in cases and controls
4. Calculate measure of association
(usually, for etiology: odds ratio with 95% CI)
NOTE
Study of cases and controls instead of a census
(census: entire population, as in cohort studies and RCTs)
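Step 4 can be illustrated with a short Python sketch. It assumes the common log-based (Woolf) approximation for the confidence interval, which the slides do not specify; the function name `odds_ratio_ci` and the counts are hypothetical.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio with an approximate 95% CI (Woolf / log method).

    a: exposed cases      b: exposed controls
    c: unexposed cases    d: unexposed controls
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of ln(OR) under the Woolf approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts for illustration only.
or_, lower, upper = odds_ratio_ci(a=40, b=20, c=60, d=80)
print(f"OR = {or_:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

The approximation needs all four cells to be non-zero and works best with reasonably large counts; with sparse tables an exact method would be a safer choice.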
31. Validity and bias
• Validity:
  - absence of systematic errors (free from bias) in design, conduct or data analysis of the research
• Bias:
  - degree of disruption of the determinant-outcome relation caused by systematic errors; leads to reduced validity
• 3 types of bias in etiologic research:
  - selection bias, information bias, confounding
32. What is Bias?
Any trend in the collection, analysis, interpretation, publication or review of data that can lead to conclusions that are systematically different from the truth (Last, 2001)
A process at any stage of inference tending to produce results that depart systematically from the true values (Fletcher et al, 1988)
Systematic error in design or conduct of a study (Szklo et al, 2000)
33. 1. Selection bias
definition
• Distortion of the determinant-outcome relation caused by systematic errors in the selection of study participants (cases and/or controls)
34. Selection Bias
Selective differences between comparison groups that affect the relationship between exposure and outcome
Usually results from comparison groups not coming from the same study base and not being representative of the populations they come from
35. Selection bias
example 1
Patients: women with DVT admitted to hospital.
Controls: healthy women aged 25-45 years.
Patients turned out to use oral contraceptives more often.
Conclusion drawn: oral contraceptives must be the cause of DVT.
How could selection bias play a role here?
Oral contraceptives and probability of DVT?
36. Selection bias
example 1
• Medical circuit: "oral contraceptives could lead to DVT"
• Women with DVT complaints who use oral contraceptives will be referred more often than those who do not use oral contraceptives
• Because of this selective referral, oral contraceptive users will have a higher probability of entering the study as a case, and the effect of oral contraceptives on DVT will be overestimated
37. Selection bias
example 2
• Patients from hospital, control group from hospital:
  - In the hospital, co-morbidity and unhealthy lifestyles occur more often than in the population
  - Relation between smoking and cancer can be underestimated due to over-representation of controls who smoke
38. 2. Information bias
definition
• Distortion of the determinant-outcome relation caused by systematic errors in the measurement of the determinant and/or outcome.
• Who knows an example?
39. Information / Measurement /
Misclassification Bias
Sources of information bias:
Subject variation
Observer variation
Deficiency of tools
Technical errors in measurement
40. Information bias
examples
• Misclassification of determinant
  - Self-reporting more accurate for cases than controls (or the other way around)
• Misclassification of outcome
  - Disease better diagnosed in people with determinant
• In what cases can this play a role?
• Can this also play a role in cohort research?
41. Information / Measurement /
Misclassification Bias
Reporting bias:
Individuals with severe disease tend to have more complete records, and therefore more complete information about exposures, so a stronger association is found
Individuals who are aware of being participants in a study behave differently (Hawthorne effect)
42. Controlling for Information Bias
- Blinding
prevents investigators and interviewers from
knowing case/control or exposed/non-exposed
status of a given participant
- Form of survey
mail may impose less "white coat tension" than a phone or face-to-face interview
- Questionnaire
use multiple questions that ask for the same information; this acts as a built-in double-check
- Accuracy
multiple checks in medical records
gathering diagnosis data from multiple sources
43. 3. Confounding
definition
• Determinant-disease outcome relation is disturbed by the effect of another factor (the confounder) ("mixing of effects")
• Can you think of an example?
45. Confounding
[Diagram: determinant (birth order) → disease outcome (Down syndrome), with the confounder (age of mother) linked to both]
1. Confounder is a determinant of the disease outcome
2. Confounder is associated with the determinant
3. Confounder is not a factor in the causal chain
46. Confounding
[Diagram: Birth Order → Down Syndrome, with Maternal Age linked to both]
Maternal age is correlated with birth order and is a risk factor even if birth order is low
52. Confounding
• A third factor which is related to both exposure and outcome, and which accounts for some/all of the observed relationship between the two
• Confounder not a result of the exposure
  - e.g., association between child's birth rank (exposure) and Down syndrome (outcome); mother's age a confounder?
  - e.g., association between mother's age (exposure) and Down syndrome (outcome); birth rank a confounder?
53. Confounding
Imagine you try to repeat, in another sample, a positive finding of a birth order association in Down syndrome or of an association of coffee drinking with CHD. Would you be able to replicate it? If not, why?
Imagine you have included only non-smokers in a study and
examined association of alcohol with lung cancer. Would you
find an association?
Imagine you have stratified your dataset for smoking status in
the alcohol - lung cancer association study. Would the odds
ratios differ in the two strata?
Imagine you have tried to adjust your alcohol association for
smoking status (in a statistical model). Would you see an
association?
54. Confounding
Imagine you try to repeat, in another sample, a positive finding of a birth order association in Down syndrome or of an association of coffee drinking with CHD. Would you be able to replicate it? If not, why?
You would not necessarily be able to replicate the
original finding because it was a spurious association
due to confounding.
In another sample where all mothers are below 30 yr,
there would be no association with birth order.
In another sample in which there are few smokers,
the coffee association with CHD would not be
replicated.
55. Confounding
Imagine you have included only non-smokers in a study and
examined association of alcohol with lung cancer. Would you
find an association?
No, because the first study was confounded. The
association with alcohol was actually due to smoking.
By restricting the study to non-smokers, we have
found the truth. Restriction is one way of preventing
confounding at the time of study design.
56. Confounding
Imagine you have tried to adjust your alcohol association for smoking status (in a statistical model). Would you see an association?
If smoking is included in the statistical model, the alcohol association would lose its statistical significance. Adjustment by multivariable modelling is another method to identify and control for confounders at the time of data analysis.
57. Confounding
For confounding to occur, the confounders should be
differentially represented in the comparison groups.
Randomisation is an attempt to evenly distribute
potential (unknown) confounders in study groups. It
does not guarantee control of confounding.
Matching is another way of achieving the same. It
ensures equal representation of subjects with known
confounders in study groups. It has to be coupled with
matched analysis.
Restriction for potential confounders in design also
prevents confounding but causes loss of statistical
power (instead stratified analysis may be tried).
58. Controlling confounding
In the design
• Restriction of the study
• Matching
In the analysis
• Restriction of the analysis
• Stratification
• Multivariable methods
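Stratification in the analysis can be sketched as follows. This is a minimal Python illustration using the Mantel-Haenszel pooled odds ratio, a standard technique that the slides do not name explicitly. The counts are invented to mimic the alcohol, smoking and lung cancer example, and all function and variable names are assumptions.

```python
def odds_ratio(a, b, c, d):
    """a: exposed cases, b: exposed controls, c: unexposed cases, d: unexposed controls."""
    return (a * d) / (b * c)


def mantel_haenszel_or(strata):
    """Pooled odds ratio across strata of a confounder (Mantel-Haenszel estimator)."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator


# Hypothetical counts (alcohol exposure vs lung cancer), stratified by smoking.
strata = {
    "smokers": (80, 40, 20, 10),
    "non-smokers": (10, 40, 20, 80),
}

# Collapsing the strata cell by cell gives the crude (unadjusted) table.
crude_table = tuple(sum(cells) for cells in zip(*strata.values()))

print(f"crude OR = {odds_ratio(*crude_table):.2f}")
for label, table in strata.items():
    print(f"OR in {label} = {odds_ratio(*table):.2f}")
print(f"Mantel-Haenszel adjusted OR = {mantel_haenszel_or(strata.values()):.2f}")
```

In this constructed example the crude odds ratio is well above 1 while the stratum-specific and adjusted odds ratios equal 1, which is the pattern expected when the crude association is entirely due to the confounder.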
59. How to prevent bias?
• Confounding: cannot be prevented
  - Measure and adjust in data analysis
• Information bias: prevent during design
  - Assess disease status blind to determinant status
  - Medical files instead of self-reporting
  - Same way of reporting for cases and controls
• Selection bias: prevent during design
  - Control selection independent of determinant status
  - Good definition of source population
60. Cohort study
Advantages and disadvantages
• What are the advantages of a cohort study?
• What are the disadvantages of a cohort study?
61. Cohort study
• Advantages
  - Cause is measured before effect
  - Not very sensitive to selection and information bias
  - Appropriate for rare determinant
  - Can study several outcomes
• Disadvantages
  - Selective withdrawal / loss to follow-up
  - Expensive and time consuming
  - Not appropriate for rare outcome
62. Case-control study
Advantages and disadvantages
• What are the advantages of a case-control study?
• What are the disadvantages of a case-control study?
63. Case-control study
• Advantages
  - Efficient and relatively cheap
  - Appropriate for rare outcome
  - Can study several determinants
• Disadvantages
  - Cause is measured after effect
  - Very sensitive to selection and information bias
  - Not appropriate to study several outcomes
65. Effect modification
• Definition: The association between exposure and disease differs between strata of the population
  - Example: Tetracycline discolours teeth in children, but not in adults
  - Example: Measles vaccine protects in children > 15 months, but not in children < 15 months
• Rare occurrence
71. Selection Bias Examples
Case-control study:
Controls have less potential for exposure than cases
Outcome = brain tumour; exposure = overhead
high voltage power lines
Cases chosen from province wide cancer registry
Controls chosen from rural areas
Systematic differences between cases and controls
73. Selection Bias Examples
Cohort study:
Differential loss to follow-up
Especially problematic in cohort studies
Subjects in follow-up study of multiple sclerosis may
differentially drop out due to disease severity
Differential attrition → selection bias
74. Selection Bias Examples
Self-selection bias:
- You want to determine the prevalence of HIV infection
- You ask for volunteers for testing
- You find no HIV
- Is it correct to conclude that there is no HIV in this
location?
75. Selection Bias Examples
Healthy worker effect:
Another form of self-selection bias
"Self-screening" process: people who are unhealthy "screen" themselves out of the active worker population
Example:
- Course of recovery from low back injuries in 25-45 year olds
- Data captured on workers' compensation records
- But prior to identifying subjects for study, self-selection
has already taken place
76. Information / Measurement /
Misclassification Bias
Method of gathering information is inappropriate and
yields systematic errors in measurement of exposures
or outcomes
If misclassification of exposure (or disease) is
unrelated to disease (or exposure) then the
misclassification is non-differential
If misclassification of exposure (or disease) is related
to disease (or exposure) then the misclassification is
differential
Distorts the true strength of association
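The difference between the two kinds of misclassification can be made concrete with a small Python sketch. It computes expected observed counts when exposure is measured with imperfect sensitivity and specificity. All counts, sensitivities, specificities and function names are hypothetical, chosen only for illustration.

```python
def odds_ratio(a, b, c, d):
    """a: exposed cases, b: exposed controls, c: unexposed cases, d: unexposed controls."""
    return (a * d) / (b * c)


def misclassify_exposure(exposed, unexposed, sensitivity, specificity):
    """Expected observed (exposed, unexposed) counts after imperfect exposure measurement."""
    observed_exposed = sensitivity * exposed + (1 - specificity) * unexposed
    observed_unexposed = (1 - sensitivity) * exposed + specificity * unexposed
    return observed_exposed, observed_unexposed


# Hypothetical true exposure distribution.
cases = (60, 40)      # (truly exposed, truly unexposed)
controls = (30, 70)
true_or = odds_ratio(cases[0], controls[0], cases[1], controls[1])

# Non-differential: the same measurement error in cases and controls.
ca = misclassify_exposure(*cases, sensitivity=0.8, specificity=0.9)
co = misclassify_exposure(*controls, sensitivity=0.8, specificity=0.9)
nondifferential_or = odds_ratio(ca[0], co[0], ca[1], co[1])

# Differential: cases recall exposure better than controls (assumed values).
ca = misclassify_exposure(*cases, sensitivity=0.95, specificity=0.9)
co = misclassify_exposure(*controls, sensitivity=0.7, specificity=0.9)
differential_or = odds_ratio(ca[0], co[0], ca[1], co[1])

print(f"true OR = {true_or:.2f}")
print(f"non-differential OR = {nondifferential_or:.2f}")
print(f"differential OR = {differential_or:.2f}")
```

With these assumed values the non-differential error pulls the odds ratio toward 1, while the differential error pushes it away from 1; differential misclassification can bias in either direction depending on who is misclassified.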
77. Information / Measurement /
Misclassification Bias
Recall bias:
Those exposed have a greater sensitivity for recalling
exposure (reduced specificity)
- specifically important in case-control studies
- when exposure history is obtained retrospectively
cases may more closely scrutinize their past history
looking for ways to explain their illness
- controls, not feeling a burden of disease, may less
closely examine their past history
Those who develop a cold are more likely to identify the exposure than those who do not → differential misclassification
- Case: Yes, I was sneezed on
- Control: No, can't remember any sneezing
78. Confounding
[Diagram: Exposure → Outcome, with a third variable linked to both]
To be a confounding factor, two conditions must be met:
Be associated with exposure
- without being the consequence of exposure
Be associated with outcome
- independently of exposure (not an intermediary)
79. Confounding?
[Diagram: Maternal Age → Down Syndrome, with Birth Order as the candidate third variable]
Birth order is correlated with maternal age but is not a risk factor in younger mothers
81. Confounding
[Diagram: Obesity → Mastitis, with Age linked to both]
In cows, older ones are heavier and older age increases the risk for mastitis. This association may appear as an obesity association
82. Confounding
If each case is matched with a same-age control, there will be no
association (OR for old age = 2.6, P = 0.0001)
84. Confounding or Effect Modification
[Diagram: Birth Weight → Leukaemia, with Sex as the third variable]
Can sex be responsible for the birth weight
association in leukaemia?
- Is it correlated with birth weight?
- Is it correlated with leukaemia independently of
birth weight?
- Is it on the causal pathway?
- Can it be associated with leukaemia even if birth
weight is low?
- Is sex distribution uneven in comparison groups?
85. Confounding or Effect Modification
Does the birth weight association differ in strength according to sex?
[Figure: Birth Weight → Leukaemia shown overall and separately for BOYS and GIRLS; odds ratios shown in the figure: OR = 1.8, OR = 0.9, OR = 1.5]
86. Effect Modification
In an association study, if the strength of the
association varies over different categories of a third
variable, this is called effect modification. The third
variable is changing the effect of the exposure.
The effect modifier may be sex, age, an environmental
exposure or a genetic effect.
Effect modification is similar to interaction in statistics.
There is no adjustment for effect modification. Once it
is detected, stratified analysis can be used to obtain
stratum-specific odds ratios.
87. Effect modifier
- Belongs to nature
- Different effects in different strata
- Simple
- Useful
- Increases knowledge of biological mechanism
- Allows targeting of public health action
Confounding factor
- Belongs to study
- Adjusted OR/RR different from crude OR/RR
- Distortion of effect
- Creates confusion in data
- Prevent (design)
- Control (analysis)
88. Modification-1
• Present when the measure of association between a given determinant and outcome is not constant across levels of a subject characteristic
• Descriptive modification may easily occur due to differences in prevalence of the disease across populations or population subgroups
• The presence or absence of modification has a bearing on the domain and the generalizability of research findings
• Modifiers point to subdomains, which implies that results from a study should be generalized differently to populations with or without the (particular level of the) modifier
89. Modification-2
• In etiologic research, analysis of modifiers may help the investigator to understand the complexity of multicausality and causally explain why a particular disease may be more common in certain individuals despite an apparently similar exposure to the determinant
90. Statistical Interaction
• Definition
  - when the magnitude of a measure of association (between exposure and disease) meaningfully differs according to the value of some third variable
• Synonyms
  - Effect modification
  - Effect-measure modification
  - Heterogeneity of effect
• Proper terminology
  - e.g. Smoking, caffeine use, and delayed conception
    • Caffeine use modifies the effect of smoking on the risk for delayed conception.
    • There is interaction between caffeine use and smoking in the risk for delayed conception.
    • Caffeine is an effect modifier in the relationship between smoking and delayed conception.
91. No Multiplicative Interaction vs Multiplicative Interaction
[Figure: two panels plotting risk of disease (log scale) for unexposed vs exposed, with one line for third variable present and one for third variable absent]
No Multiplicative Interaction
  Risk 0.05 → 0.15: RR = 3.0
  Risk 0.15 → 0.45: RR = 3.0
Multiplicative Interaction
  Risk 0.05 → 0.15: RR = 3.0
  Risk 0.08 → 0.9: RR = 11.2
93. Interaction is likely everywhere
• Susceptibility to infectious diseases
  - e.g.,
    • exposure: sexual activity
    • disease: HIV infection
    • effect modifier: chemokine receptor phenotype
• Susceptibility to non-infectious diseases
  - e.g.,
    • exposure: smoking
    • disease: lung cancer
    • effect modifier: genetic susceptibility to smoke
• Susceptibility to drugs (efficacy and side effects)
    • effect modifier: genetic susceptibility to drug
• But in practice to date, difficult to document
  - Genomics may change this
94. Additive vs Multiplicative Interaction
• Assessment of whether interaction is present depends upon the measure of association
  - ratio measure (multiplicative interaction) or difference measure (additive interaction)
  - Hence, the term effect-measure modification
• Absence of multiplicative interaction typically implies presence of additive interaction
[Figure: risk of disease (log scale) for unexposed vs exposed, in two strata of a third variable]
  Risk 0.15 → 0.45: RR = 3.0, RD = 0.3
  Risk 0.05 → 0.15: RR = 3.0, RD = 0.1
  Multiplicative interaction absent; additive interaction present
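The dependence on the scale can be checked numerically. The Python sketch below is purely descriptive: it compares the risk ratio and risk difference between two strata, using the risks shown on this slide and, for contrast, those on the next slide (95). The function names and the tolerance are assumptions, and no statistical testing is involved.

```python
def effect_measures(risk_unexposed, risk_exposed):
    """Return (risk ratio, risk difference) for one stratum."""
    return risk_exposed / risk_unexposed, risk_exposed - risk_unexposed


def interaction_summary(stratum_a, stratum_b, tol=1e-9):
    """Flag whether the RR (multiplicative) or RD (additive) differs between strata.

    Each stratum is given as (risk in unexposed, risk in exposed).
    tol only absorbs floating-point noise; it is not a statistical test.
    """
    rr_a, rd_a = effect_measures(*stratum_a)
    rr_b, rd_b = effect_measures(*stratum_b)
    return {
        "multiplicative": abs(rr_a - rr_b) > tol,
        "additive": abs(rd_a - rd_b) > tol,
    }


# Risks read from slide 94: same RR in both strata, different RD.
slide_94 = interaction_summary((0.05, 0.15), (0.15, 0.45))
# Risks read from slide 95: same RD in both strata, different RR.
slide_95 = interaction_summary((0.05, 0.15), (0.15, 0.25))

print("slide 94:", slide_94)
print("slide 95:", slide_95)
```

With real data the stratum-specific estimates carry sampling error, so a difference between strata would be assessed with confidence intervals or a formal test of heterogeneity rather than a fixed tolerance.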
95. Additive vs Multiplicative Interaction
• Absence of additive interaction typically implies presence of multiplicative interaction
[Figure: risk of disease (log scale) for unexposed vs exposed, in two strata of a third variable]
  Risk 0.05 → 0.15: RR = 3.0, RD = 0.1
  Risk 0.15 → 0.25: RR = 1.7, RD = 0.1
  Multiplicative interaction present; additive interaction absent
96. Additive vs Multiplicative Interaction
• Presence of multiplicative interaction may or may not be accompanied by additive interaction
[Figure: two panels plotting risk of disease (log scale) for unexposed vs exposed, in two strata of a third variable]
Additive interaction present
  Risk 0.1 → 0.2: RR = 2.0, RD = 0.1
  Risk 0.2 → 0.6: RR = 3.0, RD = 0.4
No additive interaction
  Risk 0.1 → 0.2: RR = 2.0, RD = 0.1
  Risk 0.05 → 0.15: RR = 3.0, RD = 0.1
97. Additive vs Multiplicative Interaction
• Presence of additive interaction may or may not be accompanied by multiplicative interaction
[Figure: two panels plotting risk of disease (log scale) for unexposed vs exposed, in two strata of a third variable]
Multiplicative interaction present
  Risk 0.1 → 0.2: RR = 2.0, RD = 0.1
  Risk 0.2 → 0.6: RR = 3.0, RD = 0.4
Multiplicative interaction absent
  Risk 0.1 → 0.3: RR = 3.0, RD = 0.2
  Risk 0.05 → 0.15: RR = 3.0, RD = 0.1
98. Additive vs Multiplicative Interaction
• Presence of qualitative multiplicative interaction is always accompanied by qualitative additive interaction
[Figure: Qualitative Interaction. Risk of disease (log scale) for unexposed vs exposed, with one line for third variable present and one for third variable absent]
  Risk 0.18 → 0.13 in one stratum (risk falls with exposure)
  Risk 0.08 → 0.2 in the other stratum (risk rises with exposure)
  Multiplicative and additive interaction both present
99. Additive vs Multiplicative Scales
• Additive measures (e.g., risk difference):
  - readily translated into impact of an exposure (or intervention) in terms of number of outcomes prevented
    • e.g. 1/risk difference = no. needed to treat to prevent (or avert) one case of disease
      - or no. of exposed persons one needs to take the exposure away from to avert one case of disease
  - gives "public health impact" of the exposure
• Multiplicative measures (e.g., risk ratio)
  - favored measure when looking for causal association (etiologic research)
100. Additive vs Multiplicative Scales
• Causally related but minor public health importance
  - Risk ratio = 2
  - Risk difference = 0.0001 - 0.00005 = 0.00005
  - Need to eliminate exposure in 20,000 persons to avert one case of disease
              Disease   No Disease
  Exposed       10        99990
  Unexposed      5        99995
• Causally related and major public health importance
  - RR = 2
  - RD = 0.2 - 0.1 = 0.1
  - Need to eliminate exposure in 10 persons to avert one case of disease
              Disease   No Disease
  Exposed       20          80
  Unexposed     10          90
101. Smoking, Family History and Cancer: Additive vs Multiplicative Interaction
Crude
               Cancer   No Cancer
  Smoking        50        150
  No Smoking     25        175
Stratified
Family History Absent
               Cancer   No Cancer
  Smoking        10         90
  No Smoking      5         95
  Risk ratio (no family history) = 2.0
  RD (no family history) = 0.05
Family History Present
               Cancer   No Cancer
  Smoking        40         60
  No Smoking     20         80
  Risk ratio (family history) = 2.0
  RD (family history) = 0.20
• No multiplicative interaction but presence of additive interaction
• If etiology is the goal, risk ratios may be sufficient
• If the goal is to define sub-groups of persons to target: rather than ignoring it, it is worth reporting that only 5 persons with a family history have to be prevented from smoking to avert one case of cancer
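The figures on this slide can be reproduced with a short Python sketch. The counts are those in the tables above; the function name `risk_measures` and the label wording are assumptions made for illustration.

```python
def risk_measures(exposed_cases, exposed_noncases, unexposed_cases, unexposed_noncases):
    """Return (risk ratio, risk difference, 1 / risk difference) for one 2x2 table."""
    risk_exposed = exposed_cases / (exposed_cases + exposed_noncases)
    risk_unexposed = unexposed_cases / (unexposed_cases + unexposed_noncases)
    risk_ratio = risk_exposed / risk_unexposed
    risk_difference = risk_exposed - risk_unexposed
    # 1 / RD: number of exposed persons whose exposure must be removed to avert one case.
    return risk_ratio, risk_difference, 1 / risk_difference


# Counts as (smoking & cancer, smoking & no cancer, no smoking & cancer, no smoking & no cancer).
tables = {
    "crude": (50, 150, 25, 175),
    "family history absent": (10, 90, 5, 95),
    "family history present": (40, 60, 20, 80),
}

for label, counts in tables.items():
    rr, rd, n_to_avert = risk_measures(*counts)
    print(f"{label}: RR = {rr:.1f}, RD = {rd:.3f}, persons per case averted = {n_to_avert:.0f}")
```

The risk ratio is the same in both strata, while the risk difference, and therefore the number of persons per case averted, differs fourfold between them.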
102. Confounding vs Interaction
• Confounding
  - An extraneous or nuisance pathway that an investigator hopes to prevent or rule out
• Interaction
  - A more detailed description of the relationship between the exposure and disease
  - A richer description of the biologic or behavioral system under study
  - A finding to be reported, not a bias to be eliminated