Association and Causation
Ravi M R
Dept. of Community Medicine
JSS Medical College
Moderator- Dr Prakash B
JSS Medical College
Epidemiological principles stand on two basic assumptions
Human disease does not occur at random
The disease and its causal as well as preventive factors can be
identified by a thorough investigation of population
• Identification of causal relationship between a disease and
suspected risk factors forms part of epidemiological research.
• It endeavours to suggest an etiological hypothesis.
Analytical and experimental studies
• Test the hypotheses derived from descriptive studies.
If two attributes say A and B are found to co-exit more often
than an ordinary chance.
It is useful to consider the concept of correlation.
Correlation indicates the degree of association between two
• Correlation co-efficent ranges from -1.0 to + 1.0.
• Value of 1.0 means that the two variables exhibit a perfect
• Causation implies correlation, but correlation doesnot imply
• If one of these attributes say A is the suspected cause and the
other say B is a disease then we have a reason to suspect that A
has caused B
Environmental exposure or
Disease or other health outcome
APPROACHES FOR STUDYING DISEASE
Randomized control studies
Case control studies
• Conceptually, a two-step process is followed in carrying out
studies and evaluating evidence.
• 1. We determine whether there is an association between an
exposure or characteristic and the risk of a disease. To do
so, we use:
• a. Studies of group characteristics: ecologic studies
• b. Studies of individual characteristics: case-control and
• 2. If an association is demonstrated, we determine whether
the observed association is likely to be a causal one
• Ecologic Studies
• The first approach in determining whether an association
exists might be to conduct studies of group characteristics,
called ecologic studies
• Eg.relationship between breast cancer incidence and average
dietary fat consumption in each country
• Recognizing the limitations discussed above of ecologic
studies that use only group data, we turn next to studies of
individual characteristics: case-control and cohort studies.
• In case-control or cohort studies, for each subject we have
information on both exposure (whether or not and, often, how
much exposure occurred) and disease outcome (whether or not
the person developed the disease in question). In ecologic
studies, we only have data on groups
Types Of Association
1. Spurious association
2. Indirect association
3. Direct association
One-to-one causal association
This is an association which appears due to improper
Observed association between a disease and suspected factor
may not be real.
E.g.; Neonatal mortality was observed to be more in the
newborns born in a hospital than those born at home. This is
likely to lead to a conclusion that home delivery is better for
the health of newborn.
However, this conclusion was not drawn in the study
because the proportion of “high risk” deliveries was found to
be higher in the hospital than in home.
It is a statistical association between a characteristic of interest
and a disease due to the presence of another factor i.e.
common factor (confounding variable).
E.g.: Neonatal mortality (A) was found to be associated with
maternal age above 30 years (B) and with birth order 4 and
It was also shown that the attribute B and C were
associated with each other.
The association between the two attributes is not through the
When the study reveals it is not a spurious association.
When the disease is present, the factor must also be present.
Direct Association Is Classified Into Two Types
One-to-one Casual Relationship:
The variables are stated to be casual related (AB) if a change
in A is followed by a change in B.
When the disease is present, the factor must also be present.
A single factor or cause may lead to more than one outcome.
Alternative causal factors each acting independently.
E.g. In lung cancer more than one factor (e.g. air
pollution, smoking, heredity) can produce the disease
Measures of association
measures of association in which relative differences
between groups being compared
Difference measures are measures of association in
which absolute differences between groups being compared .
• Main goal is often an absolute reduction in the risk of an
• When outcome of interest is continuous, the assessment of
mean absolute differences between exposed and unexposed
individuals may be an appropriate method for the
determination of association
• Can be assessed for discrete outcomes.
• To assess causal associations
Type Example Usual application
Absolute difference AR (Attributable Risk) Primary prevention impact:
search for causes.
Primary prevention impact
Efficacy Impact of intervention on
recurrences, case fatality
Search for determinants
Relative difference Relative risk/rate Search for causes
Search for causes
Incidence among exposed
Relative risk = ________________
Incidence among unexposed
It is direct measure of the strength of association.
RR = 1 No association
RR > 1 Positive association
RR < 1 Negative association
• Odd‟s ratio(Relative odds)
Cases(with disease) Controls
H/O of expos a b
No H/O expos c d
product of the two cells
that support the
hypothesis of an
product of the two cells
that negate the
hypothesis of an
ATTRIBUTABLE RISK (AR)
• How much of the disease that occurs can be attributed to a
• It is often used to imply a cause-effect relationship and should
be interpreted as a true etiologic fraction only when there is a
reasonable certainty of a causal connection between exposure
• When causality has not been firmly established then the AR is
termed as excess fraction
POPULATION ATTRIBUTABLE RISK
• What proportion of the disease incidence in a total population
can be attributed to a specific exposure?
• To know the PAR , we need to know
• incidence in total population =a
• incidence in unexposed group(background risk)=b
• PAR= a-b ÷ a
What is cause
The word cause is the one in general usage in connection with
matters considered in this study, and it is capable of conveying
the notion of a significant, effectual relationship between an
agent and an associated disorder or disease in the host.”
1964 Surgeon General Report
If a relationship is causal, four types of causal relationships are
(1) necessary and sufficient
(2) necessary, but not sufficient
(3) sufficient, but not necessary
(4) neither sufficient nor necessary
Necessary and Sufficient
• a factor is both necessary and sufficient for producing the
• Without that factor, the disease never develops (the factor is
necessary), and in the presence of that factor, the disease
always develops (the factor is sufficient)
Necessary, But Not Sufficient
• Each factor is necessary, but not, in itself, sufficient to cause
• Thus, multiple factors are required, often in a specific temporal
• For example, carcinogenesis is considered to be a multistage
process involving both initiation and promotion. For cancer to
result, a promoter must act after an initiator has acted. Action
of an initiator or a promoter alone will not produce a cancer.
Types of causal relationships: II. Each factor is
necessary, but not sufficient
Sufficient But Not Necessary
• the factor alone can produce the disease, but so can other factors that
are acting alone
• Thus, either radiation exposure or benzene exposure can each
produce leukemia without the presence of the other.
• Even in this situation, however, cancer does not develop in everyone
who has experienced radiation or benzene exposure, so although
both factors are not needed, other cofactors probably are. Thus, the
criterion of sufficient is rarely met by a single factor
Types of causal relationships: III. Each factor is
sufficient, but not necessary
Neither Sufficient Nor Necessary
• a factor, by itself, is neither sufficient nor necessary to produce
• This is a more complex model, which probably most
accurately represents the causal relationships that operate in
most chronic diseases
Types of causal relationships: IV. Each factor is neither sufficient
EVIDENCE FOR A CAUSAL RELATIONSHIP
• In 1840, Henle proposed postulates for causation that were
expanded by Koch in the 1880s. The postulates for causation
were as follows:
• 1. The organism is always found with the disease.
• 2. The organism is not found with any other disease.
• 3. The organism, isolated from one who has the disease, and
cultured through several generations, produces the disease (in
• Koch added that “Even when an infectious disease cannot be
transmitted to animals, the „regular‟ and „exclusive‟ presence
of the organism [postulates 1 and 2] proves a causal
• These postulates, though not perfect, proved very useful for
• However, as apparently noninfectious diseases assumed
increasing importance toward the middle of the 20th century,
• the issue arose as to what would represent strong evidence of
causation in diseases that were generally not of infectious
• the U.S. Surgeon General appointed an expert committee to
review the evidence. The committee developed a set of
guidelines, which have been revised over the years.
Guidelines for Judging Whether an Association Is
Causal (Leon Gordis)
• 1. Temporal relationship
• 2. Strength of the association
• 3. Dose-response relationship
• 4. Replication of the findings
• 5. Biologic plausibility
• 6. Consideration of alternate explanations
• 7. Cessation of exposure
• 8. Consistency with other knowledge
• 9. Specificity of the association
The causal attribute must precede the disease or unfavorable
Exposure to the factor must have occurred before the disease
Length of interval between exposure and disease very
If the disease develops in a period of time too soon after
exposure, the causal relationship is called into question
Strength Of The Association
Relationship between cause and outcome could be strong or
With increasing level of exposure to the risk factor an increase
in incidence of the disease is found.
There are statistical methods to quantify the strength of
association viz; calculation of relative risk, attributable risk
• As the dose of exposure increases, the risk of disease also
• If a dose-response relationship is present, it is strong evidence
for a causal relationship.
• However, the absence of a dose-response relationship does not
necessarily rule out a causal relationship.
• In some cases in which a threshold may exist, no disease may
develop up to a certain level of exposure (a threshold); above
this level, disease may develop
Replication of the Findings
• If the relationship is causal, we would expect to find it
consistently in different studies and in different populations
• Replication of findings is particularly important in
• If an association is observed, we would also expect it to be
seen consistently within subgroups of the population and in
different populations, unless there is a clear reason to expect
Biologic Plausibility Of The Association
The association must be consistent with the other knowledge
(viz mechanism of action, evidence from animal experiments
Sometimes the lack of plausibility may simply be due to the
lack of sufficient knowledge about the pathogenesis of a
Consideration of Alternate Explanations
• We have discussed the problem in interpreting an observed
association in regard to whether a relationship is causal or is
the result of confounding
• In judging whether a reported association is causal, the extent
to which the investigators have taken other possible
explanations into account and the extent to which they have
ruled out such explanations are important considerations.
Cessation of Exposure
• If a factor is a cause of a disease, we would expect the risk of
the disease to decline when exposure to the factor is reduced or
Consistency Of The Association
Consistency is the occurrence of the association at some other
time and place repeatedly.
If a relationship is causal, the findings should be consistent
with other data.
If lung cancer incidence increased as cigarette use was on the
decline, we would have to be able to explain how this was
consistent with a causal relationship.
If there is no consistency it will weaken a causal interpretation.
The causal association between smoking and lung cancer due
to its consistency.
Specificity Of The Association
The weakest of the criteria (should probably be eliminated)
Specific exposure is associated with only one disease.
This is used by tobacco companies to argue that smoking is not
causal in lung cancer.
Smoking is associated with many diseases.
Specificity implies a one to one relationship between the cause
DERIVING CAUSAL INFERENCES: EXAMPLE
• Until the 1980s, the major causes of peptic ulcer disease were
considered to be stress and lifestyle factors, including smoking
• In 1984, Australian physicians Drs. Barry J. Marshall and J.
Robin Warren reported that they had observed small curved
bacteria colonizing the lower part of the stomach in patients
with gastritis and peptic ulcers
• After several attempts, Marshall succeeded in cultivating a
hitherto unknown bacterial species (later named Helicobacter
pylori) from several of these biopsies
• Together they found that the organism was present in almost
all patients with gastric inflammation or peptic ulcer
• Many of these patients had biopsies performed which showed
evidence of inflammation present in the gastric mucosa close
to where the bacteria were seen
• Based on these results, they proposed that Helicobacter pylori
is involved in the etiology of these diseases. It was
subsequently shown that the ulcer was often not cured until
Helicobacter pylori had been eliminated.
Assessment of the Evidence Suggesting Helicobacter
pylori as a Causative Agent of Duodenal Ulcers
1. Temporal relationship.
• Helicobacter pylori is clearly linked to chronic gastritis.
About 11% of chronic gastritis patients will go on to have
duodenal ulcers over a 10-year period.
2. Strength of the relationship.
• Helicobacter pylori is found in at least 90% of patients with
In at least one population reported to lack duodenal ulcers, a
northern Australian aboriginal tribe that is isolated from other
people, it has never been found.
3. Dose-response relationship.
• Density of Helicobacter pylori per square millimeter of gastric
mucosa is higher in patients with duodenal ulcer than in patients
without duodenal ulcer
4. Replication of the findings.
• Many of the observations regarding Helicobacter pylori have been
5. Consideration of alternate explanations.
• Data suggest that smoking can increase the risk of duodenal ulcer in
Helicobacter pylori-infected patients but is not a risk factor in
patients in whom Helicobacter pylori has been eradicated
6. Biologic plausibility.
• Although originally it was difficult to envision a bacterium that
infects the stomach antrum causing ulcers in the duodenum, it is now
recognized that Helicobacter pylori has binding sites on antral cells
and can follow these cells into the duodenum.
• Helicobacter pylori also induces mediators of inflammation.
• Helicobacter pylori-infected mucosa is weakened and is susceptible
to the damaging effects of acid.
7. Cessation of exposure.
• Eradication of Helicobacter pylori heals duodenal ulcers at the same
rate as histamine receptor antagonists.
• Long-term ulcer recurrence rates were zero after Helicobacter pylori
was eradicated using triple-antimicrobial therapy, compared with a
60% to 80% relapse rate often found in patients with duodenal ulcers
treated with histamine receptor antagonists.
8. Specificity of the association.
• Prevalence of Helicobacter pylori in patients with duodenal ulcers in
90% to 100%. However, it is found in some patients with gastric ulcer
and even in asymptomatic individuals.
9. Consistency with other knowledge.
• Prevalence of Helicobacter pylori infection is the same in men as in
women. The incidence of duodenal ulcer, which in earlier years was
believed to be higher in men than in women, has been equal in recent
• The prevalence of ulcer disease is believed to have peaked in the
latter part of the 19th century, and the prevalence of Helicobacter
pylori may have been much higher at that time because of poor living
Modified Guidelines for Evaluating the Evidence of a
Causal Relationship. (In each category, studies are
listed in descending priority order.) 1990
1. Major criteria
a. Temporal relationship: An intervention can be considered
evidence of a reduction in risk of disease or abnormality only
if the intervention was applied before the time the disease or
abnormality would have developed.
b. Biological plausibility: A biologically plausible mechanism
should be able to explain why such a relationship would be
expected to occur.
Single studies are rarely definitive. Study findings that are
replicated in different populations and by different
investigators carry more weight than those that are not. If the
findings of studies are inconsistent, the inconsistency must be
d. Alternative explanations (confounding):
The extent to which alternative explanations have been
explored is an important criterion in judging causality
2. Other considerations
a. Dose-response relationship:
If a factor is indeed the cause of a disease, usually (but not invariably)
the greater the exposure to the factor, the greater the risk of the
disease. Such a dose-response relationship may not always be seen
because many important biologic relationships are dichotomous, and
reach a threshold level for observed effects.
b. Strength of the association:
The strength of the association is usually measured by the extent to
which the relative risk or odds depart from unity, either above 1 (in
the case of disease-causing exposures) or below 1 (in the case of
c. Cessation effects:
If an intervention has a beneficial effect, then the benefit should cease
when it is removed from a population (unless carryover effect is
• The lesson has reviewed some of the considerations necessary in
making causal inferences relevant to public health
• Results from epidemiological studies are often used as inputs for
policy and judicial decisions
eg., studies on link between smoking and lung cancer have led to
labelling of cigarette packs.
• It is thus important for public health and policy makers to understand
the fundamentals of causal inference.
• Association does not imply causation
• Apart from outbreak investigations, no single study is capable
of establishing a causal relation or fully informing either
individual or policy decisions.
• Those decisions should be based on a carefull consideration of
the entire relevant scientific and policy literature
•  Park K. Textbook of Preventive and Social Medicine. 21st
ed. Jabalpur(India): Banarsidas Bhanot Publishers; 2011
• Gordis, Leon Epidemiology / Leon Gordis.—4th ed.
Philadelphia: Saunders Elsevier; 2009
• Roger Detels et al. Oxford Text Book of Public Health. 5th
ed. New york(U.S.A): Oxford University Press; 2011
• K R Sundaram. Medical statistics principles & methods. 1st
ed. New Delhi(India): BI Publications Pvt Ltd; 2010