AND ERRORS IN RESEARCH
All analytic studies must begin with
a clearly formulated hypothesis. The
hypothesis must be quantitative and
specific. It must predict a
relationship of a specific size.
• For example:
“Babies who are breast-fed have less
illness than babies who are bottle-fed.”
Which illnesses? How is feeding type
defined? How large a difference in risk?
• A better example:
“Babies who are exclusively breast-fed for
three months or more will have a
reduction in the incidence of hospital
admissions for gastroenteritis of at least
30% over the first year of life.”
Only a specific prediction allows one to
draw legitimate conclusions from a
study which tests a hypothesis. But
even with the best formulated
hypothesis, two types of errors can occur:
• Type 1 - observing a difference when
in truth there is none.
• Type 2 - failing to observe a difference
when there is one.
These errors are generally produced
by one or more of the following:
• RANDOM ERROR
• RANDOM MISCLASSIFICATION
• BIAS
• CONFOUNDING
RANDOM ERROR
Deviation of results and inferences
from the truth, occurring only as a
result of the operation of chance.
Can produce type 1 or type 2 errors.
RANDOM (OR NON-DIFFERENTIAL)
MISCLASSIFICATION
Random error applied to the
measurement of an exposure or
outcome. Errors in classification can
only produce type 2 errors, except if
applied to a confounder or to an
BIAS
Systematic, non-random deviation of
results and inferences from the truth, or
processes leading to such deviation. Any
trend in the collection, analysis,
interpretation, publication or review of
data that can lead to conclusions which
are systematically different from the truth.
(Dictionary of Epidemiology, 3rd ed.)
MORE ON BIAS
Note that in bias, the focus is on an
artifact of some part of the research
process (assembling subjects,
collecting data, analyzing data) that
produces a spurious result. Bias can
produce either a type 1 or a type 2
error, but we usually focus on type 1
errors due to bias.
MORE ON BIAS
Bias can be either conscious or
unconscious. In epidemiology, the
word bias does not imply, as in
common usage, prejudice or
deliberate deviation from the truth.
CONFOUNDING
A problem resulting from the fact that one
feature of study subjects has not been
separated from a second feature, and has
thus been confounded with it, producing a
spurious result. The spuriousness arises
from the effect of the first feature being
mistakenly attributed to the second feature.
Confounding can produce either a type 1
or a type 2 error, but we usually focus on
type 1 errors.
THE DIFFERENCE BETWEEN
BIAS AND CONFOUNDING
Bias creates an association that is
not true, but confounding describes
an association that is true, but
EXAMPLES OF RANDOM ERROR,
BIAS, MISCLASSIFICATION AND
CONFOUNDING IN THE SAME
STUDY:
In a cohort study, babies of
women who bottle feed and women who
breast feed are compared, and it is
found that the incidence of
gastroenteritis, as recorded in medical
records, is lower in the babies who are
breast-fed.
EXAMPLE OF RANDOM ERROR
By chance, there are more episodes of
gastroenteritis in the bottle-fed group in
the study sample, producing a type 1
error (when in truth breast feeding is
not protective against gastroenteritis).
Or, also by chance, no difference in risk
is found, producing a type 2 error
(when in truth breast feeding is
protective against gastroenteritis).
EXAMPLE OF RANDOM
MISCLASSIFICATION
Lack of good information on feeding
history results in some breast-feeding
mothers being randomly classified as
bottle-feeding, and vice versa. If this
happens, the study finding
underestimates the true relative risk (RR),
whichever feeding modality is
associated with higher disease
incidence, producing a type 2 error.
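The attenuation can be illustrated with a small calculation. This is a minimal sketch under assumed numbers: the group sizes, the true risks (10% and 20%) and the 20% misclassification rate are hypothetical and chosen only for illustration, and the function name observed_rr is not from the source.

```python
def observed_rr(n_breast, n_bottle, risk_breast, risk_bottle, misclass):
    """Expected RR (bottle-fed vs breast-fed) when a fraction `misclass` of each
    group is recorded under the wrong feeding type, regardless of outcome."""
    # Expected cases among truly breast-fed and truly bottle-fed babies
    cases_breast = n_breast * risk_breast
    cases_bottle = n_bottle * risk_bottle
    # Sizes of the groups as recorded, after swapping a fraction of each
    rec_bottle_n = n_bottle * (1 - misclass) + n_breast * misclass
    rec_breast_n = n_breast * (1 - misclass) + n_bottle * misclass
    # Cases move with the babies they belong to
    rec_bottle_cases = cases_bottle * (1 - misclass) + cases_breast * misclass
    rec_breast_cases = cases_breast * (1 - misclass) + cases_bottle * misclass
    return (rec_bottle_cases / rec_bottle_n) / (rec_breast_cases / rec_breast_n)

# Hypothetical cohort: 1000 babies per group, true risks 10% and 20%
true_rr = observed_rr(1000, 1000, 0.10, 0.20, misclass=0.0)
biased_rr = observed_rr(1000, 1000, 0.10, 0.20, misclass=0.2)
print(round(true_rr, 2), round(biased_rr, 2))
```

With these assumed inputs the recorded RR falls between 1 and the true RR: the association is diluted toward no effect, not reversed.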
EXAMPLE OF BIAS
The medical records of bottle-fed babies,
and only theirs, are less complete than
those of breast-fed babies (perhaps
bottle-fed babies go to the doctor less),
and thus record fewer episodes of
gastroenteritis in the bottle-fed group only.
This is called bias because the
observation itself is in error.
EXAMPLE OF CONFOUNDING
The mothers of breast-fed babies are of
higher social class, and the babies thus
have better hygiene, less crowding and
perhaps other factors that protect against
gastroenteritis. Crowding and hygiene
are truly protective against
gastroenteritis, but we mistakenly
attribute their effects to breast feeding.
This is called confounding because the
observation is correct, but its explanation
is wrong.
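One way to see this is to compare the crude relative risk with the relative risk within each social class. The sketch below uses invented counts (not data from any study) in which feeding type has no effect inside either stratum; the names strata, crude_rr and stratum_rrs are illustrative.

```python
# Hypothetical counts: (cases of gastroenteritis, number of babies)
strata = {
    "higher social class": {"breast": (40, 800), "bottle": (10, 200)},
    "lower social class": {"breast": (30, 200), "bottle": (120, 800)},
}

def crude_rr(strata):
    """RR (bottle-fed vs breast-fed) ignoring social class."""
    totals = {"breast": [0, 0], "bottle": [0, 0]}
    for groups in strata.values():
        for feeding, (cases, babies) in groups.items():
            totals[feeding][0] += cases
            totals[feeding][1] += babies
    risk_bottle = totals["bottle"][0] / totals["bottle"][1]
    risk_breast = totals["breast"][0] / totals["breast"][1]
    return risk_bottle / risk_breast

def stratum_rrs(strata):
    """RR (bottle-fed vs breast-fed) computed separately within each stratum."""
    return {
        name: (g["bottle"][0] / g["bottle"][1]) / (g["breast"][0] / g["breast"][1])
        for name, g in strata.items()
    }

print("crude RR:", round(crude_rr(strata), 2))
print("stratum RRs:", stratum_rrs(strata))
```

In this constructed example the crude comparison shows bottle-fed babies at higher risk, yet within each social class the risks are identical. The crude association is real as an observation, but it is explained by social class rather than by feeding type.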
PROTECTION AGAINST RANDOM
ERROR AND RANDOM
MISCLASSIFICATION
Random error can work to falsely
produce an association (type 1 error) or
falsely not produce an association (type
2 error).
We protect ourselves against random
misclassification producing a type 2
error by choosing the most precise and
accurate measures of exposure and
outcome.
TYPE 1 ERRORS
We protect our study against random
type 1 errors by establishing that the
result must be unlikely to have
occurred by chance (e.g. p < .05).
P-values are established entirely
to protect against type 1 errors due
to chance, and do not guarantee
protection against type 1 errors due
to bias or confounding. This is the
reason we say statistics demonstrate
association but not causation.
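What a threshold such as p < .05 does, and does not, control can be sketched by simulation. The code below is an illustration under assumptions not in the source: both groups share the same true risk (10%), each simulated study has 200 babies per group, and a pooled two-proportion z-test stands in for whatever test a real study would use.

```python
import random
from statistics import NormalDist

def two_proportion_p_value(cases_a, n_a, cases_b, n_b):
    """Two-sided p-value from a pooled two-proportion z-test."""
    pooled = (cases_a + cases_b) / (n_a + n_b)
    se = (pooled * (1 - pooled) * (1 / n_a + 1 / n_b)) ** 0.5
    if se == 0:  # no cases (or only cases) in both groups: nothing to compare
        return 1.0
    z = (cases_a / n_a - cases_b / n_b) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))

def false_positive_rate(risk, n_per_group, n_studies, alpha=0.05, seed=1):
    """Share of simulated studies with p < alpha when there is truly no difference."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_studies):
        a = sum(rng.random() < risk for _ in range(n_per_group))
        b = sum(rng.random() < risk for _ in range(n_per_group))
        if two_proportion_p_value(a, n_per_group, b, n_per_group) < alpha:
            hits += 1
    return hits / n_studies

rate = false_positive_rate(risk=0.10, n_per_group=200, n_studies=2000)
print(f"Share of null studies with p < 0.05: {rate:.3f}")
```

The share of "significant" results should come out near the chosen threshold, which is the chance-driven type 1 error rate the p-value is designed to limit. Nothing in this calculation would detect incomplete records or an unmeasured confounder.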
TYPE 2 ERRORS
We protect our study against random type
2 errors by
• providing adequate sample size, and
• hypothesizing large differences.
The larger the sample size, the easier it
will be to detect a true difference, and the
largest differences will be the easiest to
detect. (Imagine how hard it would be to
detect a 1% increase in the risk of
gastroenteritis with bottle-feeding).
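The effect of the hypothesized difference on sample size can be sketched with the usual normal-approximation formula for comparing two proportions. Everything numeric here is an assumption for illustration: a 10% baseline risk, a two-sided alpha of 0.05, 80% power, and reading the "1% increase" as a relative increase (10% to 10.1%).

```python
from math import ceil
from statistics import NormalDist

def n_per_group(p1, p2, alpha=0.05, power=0.80):
    """Approximate babies needed per group to detect risks p1 vs p2
    (two-sided test, normal approximation, equal group sizes)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2)

baseline = 0.10  # hypothetical risk of gastroenteritis in the comparison group

n_large = n_per_group(baseline, baseline * 0.70)  # 30% reduction in risk
n_small = n_per_group(baseline, baseline * 1.01)  # 1% relative increase in risk
print("30% reduction:", n_large, "per group")
print("1% increase:  ", n_small, "per group")
```

Under these assumptions the 30% reduction needs on the order of 1,400 babies per group, while the 1% increase needs more than a million per group, which is why small hypothesized differences are so hard to test.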
TWO WAYS TO INCREASE POWER
The probability that a study will detect a
true difference of a given size is called
the power of the study.
1. Choosing the most precise and accurate
measures of exposure and outcome has
the effect of increasing the power of our
study, because the variances of the
outcome measures, which enter into
statistical testing, are decreased.
2. Having an adequately sized sample of
study subjects.
KEY PRINCIPLE IN BIAS AND
CONFOUNDING
The factor that creates the bias, or
the confounding variable, must be
associated with both the
independent and dependent
variables (i.e. with the exposure and
the disease). Association of the bias
or confounder with just one of the
two variables is not enough to
produce a spurious result.
In the example just given:
The BIAS, namely incomplete chart
recording, has to be associated with
feeding type (the independent variable)
and also with recording of
gastroenteritis (the dependent variable)
to produce the false result.
The CONFOUNDING VARIABLE (or
CONFOUNDER) better hygiene, has to be
associated with feeding type and also
with gastroenteritis to produce the
false result.
Were the bias or the confounder
associated with just the independent
variable or just the dependent variable,
they would not produce bias or
confounding.
This gives a useful rule:
If you can show that a potential
confounder is NOT associated with
either one of the two variables under
study (exposure or outcome),
confounding can be ruled out.
GOOD STUDY DESIGN
PROTECTS AGAINST ALL
FORMS OF ERROR
SOME TYPES OF BIAS
1. SELECTION BIAS
Any aspect of the way subjects are
assembled in the study that creates a
systematic difference between the
compared populations that is not
due to the association under study.
2. INFORMATION BIAS
Any aspect of the way information is collected in
the study that creates a systematic difference
between the compared populations that is not due
to the association under study. (Some call this
measurement bias.) The incomplete chart
recording in the baby feeding example would be a
form of information bias.
Other examples -
• Diagnostic suspicion bias
• Recall bias
Sometimes biases apply to a population of
studies, rather than to one study, as in publication
bias (tendency to publish papers which show