Biases and errors in Epidemiology
Anchita Khatri
Stratification
• The process of or the result of separating a
sample into several sub-samples according
to specified crite...
Example…Stratification (Fletcher)
Pre-op risk   Pts    Deaths   %
High          500      30     6
Medium        400      16     4
Low           300       2     0.67
Total        1200      48     4
(remainder truncated in source)
Example…Stratification
(Table, truncated in source: dead / total / rate for Pinellas county vs. Dade county, overall and by stratum, with relative rates; Pinellas overall: 5726 deaths in 374,665, rate 15.3…)
Standardization
A set of techniques used to remove as far as
possible the effects of differences in age or
other confoundi...
Standard population
A population in which the age and sex
composition is known precisely, as a result
of a census or by an...
Types of standardization
Direct: the specific rates in a study population
are averaged using as weights the
distribution o...
Indirect: used to compare the study populations
for which the specific rates are either
statistically unstable or unknown....
Standardized mortality ratio (SMR)
SMR = (No. of deaths observed in the study group or population ÷ No. of deaths expected if the study group had the specific rates of the standard population) × 100
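The ratio can be sketched in code; all figures below are hypothetical, chosen only to illustrate the arithmetic:

```python
# SMR sketch: observed deaths in the study group, divided by the deaths
# expected if the study group experienced the standard population's
# age-specific death rates, times 100. All figures are hypothetical.

# Standard-population death rates per 1,000 person-years, by age band
standard_rates = {"<40": 1.0, "40-59": 5.0, "60+": 20.0}

# Study group person-years by age band, and its observed death count
study_person_years = {"<40": 10_000, "40-59": 5_000, "60+": 1_000}
observed_deaths = 70

# Expected deaths: apply the standard rates to the study group's structure
expected_deaths = sum(
    standard_rates[age] / 1_000 * py for age, py in study_person_years.items()
)  # 10 + 25 + 20 = 55

smr = observed_deaths / expected_deaths * 100  # ≈ 127
```

An SMR of 100 means mortality equal to that expected from the standard rates; values above 100 indicate excess mortality.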
Example … direct standardization
Age     Pop    Deaths   Rate
0       4000     60     15.0
1-4     4500     20      4.4
5-14    4000     12      3.0
15-19   5000     15      3.0
20-…    (remainder truncated in source)
Example … direct standardization
Pre-op risk   Pts    Deaths   %
High          500      30     6
Medium        400      16     4
Low           300       2     0.67
Total        1200      48     4
(second table truncated in source)
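The direct method can be sketched with the pre-op risk figures above for one hospital; the second hospital's figures are hypothetical, chosen so that its stratum-specific rates equal the first's:

```python
# Direct standardization: stratum-specific death rates are averaged using
# a common (standard) set of weights, so that differences in case mix
# cannot distort a comparison. Hospital A's figures are the pre-op risk
# data from the slide; hospital B's figures are hypothetical.

hospital_a = {"high": (500, 30), "medium": (400, 16), "low": (300, 2)}
hospital_b = {"high": (200, 12), "medium": (400, 16), "low": (600, 4)}

# Standard population: combined patients of both hospitals, per stratum
standard = {s: hospital_a[s][0] + hospital_b[s][0] for s in hospital_a}
total_weight = sum(standard.values())

def crude_rate(hospital):
    """Deaths over patients, ignoring strata."""
    return sum(d for _, d in hospital.values()) / sum(n for n, _ in hospital.values())

def standardized_rate(hospital):
    """Stratum rates averaged with the standard-population weights."""
    return sum(d / n * standard[s] for s, (n, d) in hospital.items()) / total_weight
```

Here the crude rates differ (4.0% vs. 2.7%) purely because of case mix, while the directly standardized rates are identical: standardization has removed the effect of the confounding factor.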
Stratification vs. Standardization
• Standardization removes the effect
• Stratification controls for the effect of factor...
Multivariate adjustment
• Simultaneously controlling the effects of
many variables to determine the independent
effects of...
Examples… Multivariate
adjustment
• CHD is the joint result of lipid
abnormalities, HT, smoking, family history,
DM, exerc...
Example…Multivariate adjustment
• Multivariable modeling, i.e. developing a mathematical expression of the effects of many ...
Sensitivity analysis
• When data on important prognostic factors
is not available, it is possible to estimate the
potentia...
Example… best/worst case analysis
• Study: effect of gastro-gastrostomy on
morbid obesity
• Subjects: cohort of 123 morbid...
Example…. (contd.)
• Success rate: 60/103 (58%)
• Best case: all 20 lost to follow up had
“success”
Best success rate: (60...
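The bounds can be computed directly from the figures above (123 patients, 20 lost to follow-up, 60 successes among the 103 observed):

```python
# Best case / worst case analysis for loss to follow-up: recompute the
# success rate assuming all, then none, of the lost patients succeeded.
cohort, lost, successes = 123, 20, 60
observed = cohort - lost  # 103 actually followed up

observed_rate = successes / observed       # 60/103 ≈ 0.58
best_case = (successes + lost) / cohort    # 80/123 ≈ 0.65: all lost succeeded
worst_case = successes / cohort            # 60/123 ≈ 0.49: none of the lost succeeded
```

If the conclusion would change between the best-case and worst-case bounds, the loss to follow-up matters.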
Randomization
• The only way to equalize all extraneous
factors, or ‘everything else’ is to assign
patients to groups rand...
Overall strategy
• Except for randomization, all ways of dealing with extraneous differences b/w groups are effective aga...
Example…
• Study: effect of presence of VPCs on survival of
patients after acute MI
• Strategies:
- Restriction: not too y...
Dealing with measurement bias
1. Blinding
- Subject
- Observer / interviewer
- Analyser
2. Strict definition / standard de...
Controlling confounding
• Similar to controlling for selection bias
• Use randomization, restriction, matching,
stratifica...
Lead time bias
• Lead time is the period of time b/w the
detection of a medical condition by screening
and when it ordinar...
How lead time affects survival time
(Figure: timelines marking the point of diagnosis and subsequent survival for an unscreened patient, a screened patient whose early treatment is not effective, and a screened patient whose early treatment is e...)
Controlling lead time bias
• Compare the screened group with a control group using age-specific mortality rates...
Length time bias
• Can affect studies of screening
• Because the proportion of slow-growing tumors diagnosed during screenin...
Compliance bias
• Compliant patients tend to have better
prognoses regardless of the screening
• If a study compares disea...
Types of studies & related biases
Prevalence study: • uncertainty about temporal sequences • bias from studying ‘old’/prevalent ca...
Random error
• Divergence on the basis of chance alone of
an observation on a sample from the
population from the true pop...
Sources of random error
1. Individual biological variation
2. Measurement error
3. Sampling error ( the part of the total
...
Sampling variation
Because research must ordinarily be
conducted on a sample of patients and not
on all the patients with ...
Sampling variation - definition
Since inclusion of individuals in a sample is
determined by chance, the results of
analysi...
Assessing the role of chance
1. Hypothesis testing
2. Estimation
Hypothesis testing
Start off with the Null Hypothesis (H0):
the statistical hypothesis that one variable has no associatio...
Statistical tests – errors (Fletcher)
                                 TRUE DIFFERENCE
Conclusion of test         Present (H0 false)    Absent (H0 true)
  Significant              Correct               Type I (α) error
  Not significant          Type II (β) error     Correct
Statistical tests - errors
• Type I (α) error: error of rejecting a true null hypothesis, i.e. declaring a difference exi...
p - value
• Probability of an α error.
• Quantitative estimate of probability that
observed difference in b/w the groups i...
p – value – Remember!!
• Usually P < 0.05 is considered statistically
significant (i.e. probability of 1 in 20 that
observ...
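As a sketch of where such a p-value comes from, a two-proportion z-test can be written with the standard library alone; the 140 vs. 100 events in 30,000 figures are reused from the alcohol–MI table elsewhere in this deck:

```python
import math

# Two-proportion z-test: is the observed difference in risks larger than
# chance alone would plausibly produce under H0 (no true difference)?
def two_proportion_p(a, n1, b, n2):
    p1, p2 = a / n1, b / n2
    pooled = (a + b) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided p-value from the normal CDF (via the error function)
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

p = two_proportion_p(140, 30_000, 100, 30_000)
# p comes out below 0.05, so the difference would usually be called
# statistically significant at the conventional threshold
```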
Statistical significance vs.
clinical significance
Large RCT called GUSTO (41,021 pts of acute MI)
• Study: Streptokinase vs....
Estimation
• Effect size observed in a particular study is
called ‘Point estimate’
• True effect is unlikely to be exactly...
Confidence intervals
(Fletcher) If the study is unbiased, there is a 95% chance that the interval includes the true e...
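A sketch of a 95% confidence interval for a risk ratio, using the common large-sample log method; figures reused from the crude alcohol–MI table elsewhere in this deck:

```python
import math

# 95% CI for a risk ratio: take logs, add/subtract z standard errors of
# log(RR), exponentiate back. a/n1 = risk in exposed, b/n0 = in unexposed.
def rr_confidence_interval(a, n1, b, n0, z=1.96):
    rr = (a / n1) / (b / n0)
    se_log = math.sqrt(1 / a - 1 / n1 + 1 / b - 1 / n0)
    lo = math.exp(math.log(rr) - z * se_log)
    hi = math.exp(math.log(rr) + z * se_log)
    return rr, lo, hi

rr, lo, hi = rr_confidence_interval(140, 30_000, 100, 30_000)
# point estimate 1.4; interval roughly 1.08 to 1.81, excluding 1.0
```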
Multiple comparison problem
• If a no. of comparisons are made, (e.g. in a
large study, the effect of treatment assessed
s...
“If you dredge the data sufficiently deeply,
and sufficiently often, you will find
something odd. Many of these bizarre
fi...
Dealing with random error
• Increasing the sample size: sample size
depends upon
- level of statistical significance (α er...
References
1. Fletcher RH et al. Clinical Epidemiology: The Essentials. 3rd ed.
2. Beaglehole R et al. Basic Epidemiology...
  1. 1. Biases and errors in Epidemiology Anchita Khatri
  2. 2. Definitions ERROR: 1. A false or mistaken result obtained in a study or experiment 2. Random error is the portion of variation in measurement that has no apparent connection to any other measurement or variable, generally regarded as due to chance 3. Systematic error which often has a recognizable source, e.g., a faulty measuring instrument, or pattern, e.g., it is consistently wrong in a particular direction (Last)
  3. 3. Relationship b/w Bias and Chance (Figure: distributions of diastolic blood pressure readings, mm Hg; the true BP, by intra-arterial cannula, lies at about 80 while sphygmomanometer readings are distributed around 90; the shift of the distribution away from the true value is bias, the spread of the observations is chance)
  4. 4. Validity • Validity: The degree to which a measurement measures what it purports to measure (Last) Degree to which the data measure what they were intended to measure – that is, the results of a measurement correspond to the true state of the phenomenon being measured (Fletcher) • also known as ‘Accuracy’
  5. 5. Reliability • The degree of stability expected when a measurement is repeated under identical conditions; degree to which the results obtained from a measurement procedure can be replicated (Last) • Extent to which repeated measurements of a stable phenomenon – by different people and instruments, at different times and places – get similar results (Fletcher) • Also known as ‘Reproducibility’ and ‘Precision’
  6. 6. Validity and Reliability (Figure: 2×2 grid illustrating the four combinations of high and low validity with high and low reliability)
  7. 7. Bias • Deviation of results or inferences from the truth, or processes leading to such deviation. Any trend in the collection, analysis, interpretation, publication, or review of data that can lead to conclusions that are systematically different from the truth. (Last) • A process at any stage of inference tending to produce results that depart systematically from true values (Fletcher)
  8. 8. Types of biases 1. Selection bias 2. Measurement / (mis)classification bias 3. Confounding bias
  9. 9. Selection bias • Errors due to systematic differences in characteristics between those who are selected for study and those who are not. (Last; Beaglehole) • When comparisons are made between groups of patients that differ in ways other than the main factors under study, that affect the outcome under study. (Fletcher)
  10. 10. Examples of Selection bias • Subjects: hospital cases under the care of a physician • Excluded: 1. Die before admission – acute/severe disease. 2. Not sick enough to require hospital care 3. Do not have access due to cost, distance etc. • Result: conclusions cannot be generalized • Also known as ‘Ascertainment Bias’
  11. 11. Ascertainment Bias • Systematic failure to represent equally all classes of cases or persons supposed to be represented in a sample. This bias may arise because of the nature of the sources from which the persons come (e.g., a specialized clinic), or from a diagnostic process influenced by culture, custom, or idiosyncrasy. (Last)
  12. 12. Selection bias with ‘volunteers’ • Also known as ‘response bias’ • Systematic error due to differences in characteristics b/w those who choose or volunteer to take part in a study and those who do not
  13. 13. Examples …response bias • Volunteer either because they are unwell, or worried about an exposure • Respondents to ‘effects of smoking’ usually not as heavy smokers as non-respondents. • In a cohort study of newborn children, the proportion successfully followed up for 12 months varied according to the income level of the parents
  14. 14. Examples…. (Assembly bias) • Study: ? association b/w reserpine and breast cancer in women • Design: Case Control • Cases: Women with breast cancer Controls: Women without breast cancer who were not suffering from any cardio-vascular disease (frequently associated with HT) • Result: Controls likely to be on reserpine were systematically excluded → an association between reserpine and breast cancer was observed
  15. 15. Examples…. (Assembly bias) • Study: effectiveness of OCP1 vs. OCP2 • Subjects: on OCP1 – women who had given birth at least once (hence able to conceive); on OCP2 – women who had never become pregnant • Result: if OCP2 is found to be better, is the inference correct?
  16. 16. Susceptibility Bias • Groups being compared are not equally susceptible to the outcome of interest, for reasons other than the factors under study • Comparable to ‘Assembly Bias’ • In prognosis studies; cohorts may differ in one or more ways – extent of disease, presence of other diseases, the point of time in the course of disease, prior treatment etc.
  17. 17. Examples…..(Susceptibility Bias) • Background: for colorectal cancer, - CEA levels correlated with extent of disease (Duke’s classification) - Duke’s classification and CEA levels strongly predicted disease relapse • Question: Does CEA level predict relapse independent of Duke’s classification, or was susceptibility to relapse explained by Duke’s classification alone?
  18. 18. Example… CEA levels (contd.) • Answer: on stratification, the association of pre-op levels of CEA with disease relapse was observed within each category of Duke’s classification
  19. 19. Disease-free survival according to CEA levels in colorectal cancer pts. with similar pathological staging (Duke’s B) (Figure: % disease-free plotted against months of follow-up, 3 to 24, as separate curves for CEA levels <2.5, 2.5 – 10.0, and >10.0 ng)
  20. 20. Selection bias with ‘Survival Cohorts’ • Patients are included in study because they are available, and currently have the disease • For lethal diseases patients in survival cohort are the ones who are fortunate to have survived, and so are available for observation • For remitting diseases patients are those who are unfortunate enough to have persistent disease • Also known as ‘Available patient cohorts’
  21. 21. Example… bias with ‘survival cohort’
True cohort: assemble cohort (N=150) → measure outcome: improved 75, not improved 75 → true improvement 50%
Survival cohort: 100 of the 150 are never observed (dropouts: improved 35, not improved 65); follow-up begins with the available N=50 → measure outcome: improved 40, not improved 10 → observed improvement 80%, vs. a true improvement of 50%
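The slide's figures make the distortion easy to verify:

```python
# Survival-cohort bias: outcome is measured only on the 50 patients still
# available, not on the full cohort of 150 that was originally assembled.
improved_followed, not_improved_followed = 40, 10
improved_dropouts, not_improved_dropouts = 35, 65

observed_improvement = improved_followed / (improved_followed + not_improved_followed)
true_improvement = (improved_followed + improved_dropouts) / 150
# 0.8 vs 0.5: the survival cohort looks far better than the true cohort
```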
  22. 22. Selection bias due to ‘Loss to Follow-up’ • Also known as ‘Migration Bias’ • In nearly all large studies some members of the original cohort drop out of the study • If drop-outs occur randomly, such that characteristics of lost subjects in one group are on an average similar to those who remain in the group, no bias is introduced • But ordinarily the characteristics of the lost subjects are not the same
  23. 23. Example of ‘lost to follow-up’ (exposure: irradiation; disease: cataract)
Full cohort:
             Exposed   Unexposed   Total
  Cataract        50         100     150
  Total        10000       20000   30000
  RR = (50/10000) / (100/20000) = 1
After loss to follow-up:
             Exposed   Unexposed   Total
  Cataract        30          30      60
  Total         4000        8000   12000
  RR = (30/4000) / (30/8000) = 2
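The two risk ratios in the slide can be checked directly:

```python
# Risk ratio before and after differential loss to follow-up
# (irradiation-cataract figures from the slide).
def risk_ratio(cases_exposed, total_exposed, cases_unexposed, total_unexposed):
    risk_exposed = cases_exposed / total_exposed
    risk_unexposed = cases_unexposed / total_unexposed
    return risk_exposed / risk_unexposed

rr_full_cohort = risk_ratio(50, 10_000, 100, 20_000)  # 1.0: no association
rr_after_loss = risk_ratio(30, 4_000, 30, 8_000)      # 2.0: spurious association
```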
  24. 24. Migration bias • A form of Selection Bias • Can occur when patients in one group leave their original group, dropping out of the study altogether or moving to one of the other groups under study (Fletcher) • If occur on a large scale, can affect validity of conclusions. • Bias due to crossover more often a problem in risk studies, than in prognosis studies, because risk studies go on for many years
  25. 25. Example of migration • Question: relationship between lifestyle and mortality • Subjects: 10,269 Harvard College alumni - classified according to physical activity, smoking, weight, BP - In 1966 and 1977 • Mortality rates observed from 1977 to 1985
  26. 26. Example of migration (contd.) • Problem: original classification of ‘lifestyle’ might change (migration b/w groups) • Solution: defined four categories - Men who maintained high risk lifestyles - Men who crossed over from low to high risk - Men who crossed over from high to low risk - Men who maintained low risk lifestyles
  27. 27. Example of migration (contd.) • Result: after controlling for other risk factors - those who maintained or adopted high risk characteristics had highest mortality - Those who changed from high to low had lesser mortality than above - Those who never had any high risk behavior had least mortality
  28. 28. Healthy worker effect • A phenomenon observed initially in studies of occupational diseases: workers usually exhibit lower overall death rates than the general population, because the severely ill and chronically disabled are ordinarily excluded from employment. Death rates in the general population may be inappropriate for comparison if this effect is not taken into account. (Last)
  29. 29. Example…. ‘healthy worker effect’ • Question: association b/w formaldehyde exposure and eye irritation • Subjects: factory workers exposed to formaldehyde • Bias: those who suffer most from eye irritation are likely to leave the job at their own request or on medical advice • Result: remaining workers are less affected; the observed association is diluted
  30. 30. Measurement bias • Systematic error arising from inaccurate measurements (or classification) of subjects or study variables. (Last) • Occurs when individual measurements or classifications of disease or exposure are inaccurate (i.e. they do not measure correctly what they are supposed to measure) (Beaglehole) • If patients in one group stand a better chance of having their outcomes detected than those in another group. (Fletcher)
  31. 31. Measurement / (Mis) classification • Exposure misclassification occurs when exposed subjects are incorrectly classified as unexposed, or vice versa • Disease misclassification occurs when diseased subjects are incorrectly classified as non-diseased, or vice versa (Norell)
  32. 32. Causes of misclassification 1. Measurement gap: gap between the measured and the true value of a variable - Observer / interviewer bias - Recall bias - Reporting bias 2. Gap b/w the theoretical and empirical definition of exposure / disease
  33. 33. Sources of misclassification (Diagram: theoretical definition → empirical definition → measurement results; misclassification arises from the gap b/w the theoretical and empirical definitions, and from measurement errors)
  34. 34. Example… ‘gap b/w definitions’ Theoretical definition • Exposure: passive smoking – inhalation of tobacco smoke from other people’s smoking • Disease: Myocardial infarction – necrosis of the heart muscle tissue Empirical definition • Exposure: passive smoking – time spent with smokers (having smokers as room-mates) • Disease: Myocardial infarction – certain diagnostic criteria (chest pain, enzyme levels, signs on ECG)
  35. 35. Exposure misclassification – Non-differential • Misclassification does not differ between cases and non-cases • Generally leads to dilution of effect, i.e. bias towards RR=1 (no association)
  36. 36. Example…Non-differential Exposure Misclassification (exposure: x-rays; disease: breast cancer)
Correct classification:
                 Exposed   Unexposed   Total
  Breast cancer       40          80     120
  Total            10000       40000   50000
  RR = (40/10000) / (80/40000) = 2
Misclassified:
                 Exposed   Unexposed   Total
  Breast cancer       60          60     120
  Total            20000       30000   50000
  RR = (60/20000) / (60/30000) = 1.5
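Checking the dilution with the slide's figures:

```python
# Non-differential misclassification moves the risk ratio toward 1
# (x-ray exposure and breast cancer figures from the slide).
def risk_ratio(a, n1, b, n0):
    """RR = (a/n1) / (b/n0): risk in exposed over risk in unexposed."""
    return (a / n1) / (b / n0)

rr_true = risk_ratio(40, 10_000, 80, 40_000)     # 2.0 with correct classification
rr_diluted = risk_ratio(60, 20_000, 60, 30_000)  # 1.5 after misclassification
```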
  37. 37. Exposure misclassification - Differential • Misclassification differs between cases and non-cases • Introduces a bias towards RR = 0 (negative / protective association), or RR = ∞ (infinity) (strong positive association)
  38. 38. Example…Differential Exposure Misclassification (exposure: x-rays; disease: breast cancer)
Correct classification:
                    Exposed   Unexposed   Total
  Breast cancer          40          80     120
  No breast cancer     9960       39920   49880
  Total               10000       40000   50000
  RR = (40/10000) / (80/40000) = 2
Misclassified (non-cases only):
                    Exposed   Unexposed   Total
  Breast cancer          40          80     120
  No breast cancer    19940       29940   49880
  Total               19980       30020   50000
  RR = (40/19980) / (80/30020) = 0.75
  39. 39. Implications of Differential exposure misclassification • An improvement in accuracy of exposure information (i.e. no misclassification among those who had breast cancer), actually reduced accuracy of results • Non-differential misclassification is ‘better’ than differential misclassification • So, epidemiologists are more concerned with comparability of information than with improving accuracy of information
  40. 40. Causes of Differential Exposure Misclassification • Recall Bias: Systematic error due to differences in accuracy or completeness of recall to memory of past events or experience. E.g., patients suffering from MI are more likely to recall and report ‘lack of exercise’ in the past than controls
  41. 41. Causes of Differential Exposure Misclassification • Measurement bias: e.g. analysis of Hb by different methods (cyanmethemoglobin and Sahli's) in cases and controls; e.g. biochemical analysis of the two groups by two different laboratories, which give consistently different results
  42. 42. Causes of Differential Exposure Misclassification • Interviewer / observer bias: systematic error due to observer variation (failure of the observer to measure or identify a phenomenon correctly) e.g. in patients of thrombo-embolism, look for h/o OCP use more aggressively
  43. 43. Measurement bias in treatment effects • Hawthorne effect: effect (usually positive / beneficial) of being under study upon the persons being studied; their knowledge of being studied influences their behavior • Placebo effect: (usually, but not necessarily beneficial) expectation that regimen will have effect, i.e. the effect is due to the power of suggestion.
  44. 44. Total effects of treatment are the sum of spontaneous improvement, non-specific responses, and the effects of specific treatments (Figure: total improvement shown as stacked components: natural history, Hawthorne effect, placebo effect, and the effect specific to treatment)
  45. 45. Confounding 1. A situation in which the effects of two processes are not separated. The distortion of the apparent effect of an exposure on risk brought about by the association with other factors that can influence the outcome 2. A relationship b/w the effects of two or more causal factors as observed in a set of data such that it is not logically possible to separate the contribution that any single causal factor has made to an effect (Last)
  46. 46. Confounding When another exposure exists in the study population (besides the one being studied) and is associated both with disease and the exposure being studied. If this extraneous factor – itself a determinant of or risk factor for health outcome is unequally distributed b/w the exposure subgroups, it can lead to confounding (Beaglehole)
  47. 47. Confounder … must be 1. Risk factor among the unexposed (itself a determinant of disease) 2. Associated with the exposure under study 3. Unequally distributed among the exposed and the unexposed groups
  48. 48. Examples … confounding SMOKING LUNG CANCER AGE (If the average ages of the smoking and non-smoking groups are very different) (As age advances chances of lung cancer increase)
  49. 49. Examples … confounding COFFEE DRINKING HEART DISEASE SMOKING (Coffee drinkers are more likely to smoke) (Smoking increases the risk of heart ds)
  50. 50. Examples … confounding ALCOHOL INTAKE MYOCARDIAL INFARCTION SEX (Men are more at risk for MI) (Men are more likely to consume alcohol than women)
  51. 51. Examples … confounding (exposure: alcohol; disease: MI)
Crude:
          Exposed   Unexposed
  MI          140         100
  Total     30000       30000
  RR = (140/30000) / (100/30000) = 1.4
Stratified by sex:
          Exposed(M)   Exposed(F)   Unexposed(M)   Unexposed(F)
  MI            120           20             60             40
  Total       20000        10000          10000          20000
  RR (men)   = (120/20000) / (60/10000) = 1
  RR (women) = (20/10000) / (40/20000) = 1
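Stratifying by the confounder (sex) makes the crude association disappear, which can be verified with the slide's figures:

```python
# Crude vs. sex-stratified risk ratios for alcohol and MI (slide figures).
def risk_ratio(cases_exp, n_exp, cases_unexp, n_unexp):
    return (cases_exp / n_exp) / (cases_unexp / n_unexp)

rr_crude = risk_ratio(140, 30_000, 100, 30_000)  # 1.4: apparent association
rr_men = risk_ratio(120, 20_000, 60, 10_000)     # 1.0: none within the stratum
rr_women = risk_ratio(20, 10_000, 40, 20_000)    # 1.0: none within the stratum
```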
  52. 52. Example … multiple biases • Study: ?? Association b/w regular exercise and risk of CHD • Methodology: employees of a plant offered an exercise program; some volunteered, others did not; coronary events detected by regular voluntary check-ups, including a careful history, ECG, and checking of routine health records • Result: the group that exercised had lower CHD rates
  53. 53. Biases operating • Selection: volunteers might have had initial lower risk (e.g. lower lipids etc.) • Measurement: exercise group had a better chance of having a coronary event detected since more likely to be examined more frequently • Confounding: if exercise group smoked cigarettes less, a known risk factor for CHD
  54. 54. Dealing with Selection Bias Ideally, To judge the effect of an exposure / factor on the risk / prognosis of disease, we should compare groups with and without that factor, everything else being equal But in real life ‘everything else’ is usually not equal
  55. 55. Methods for controlling Selection Bias During Study Design 1. Randomization 2. Restriction 3. Matching During analysis 1. Stratification 2. Adjustment a) Simple / standardization b) Multiple / multivariate adjustment c) Best case / worst case analysis
  56. 56. Restriction • Subjects chosen for study are restricted to only those possessing a narrow range of characteristics, to equalize important extraneous factors • Limitation: generalisability is compromised; by excluding potential subjects, cohorts / groups selected may be unusual and not representative of most patients or people with condition
  57. 57. Example… restriction • Study: effect of age on prognosis of MI • Restriction: Male / White / Uncomplicated anterior wall MI • Important extraneous factors controlled for: sex / race / severity of disease • Limitation: results not generalizable to females, people of non-white community, those with complicated MI
  58. 58. Example… restriction • OCP example restrict study to women having at least one child • Colorectal cancer example restrict patients to a particular staging of Duke’s classification
  59. 59. Matching - definition • The process of making a study group and a comparison group comparable with respect to extraneous factors (Last) • For each patient in one group there are one or more patients in the comparison group with same characteristics, except for the factor of interest (Fletcher)
  60. 60. Types of Matching • Caliper matching: process of matching comparison group to study group within a specific distance for a continuous variable (e.g., matching age to within 2 years) • Frequency matching: frequency distributions of the matched variable(s) be similar in study and comparison groups • Category matching: matching the groups in broad classes such as relatively wide age ranges or occupational groups
  61. 61. Types of Matching … (contd.) • Individual matching: identifying individual subjects for comparison, each resembling a study subject on the matched variable(s) • Pair matching: individual matching in which the study and comparison subjects are paired (Last)
  62. 62. • Matching is often done for age, sex, race, place of residence, severity of disease, rate of progression of disease, previous treatment received etc. • Limitations: - controls for bias only for those factors involved in the match - Usually not possible to match for more than a few factors, because of the practical difficulties of finding patients who meet all matching criteria - If categories for matching are relatively crude, there may be room for substantial differences b/w matched groups
  63. 63. Example… Matching • Study: ? Association of Sickle cell trait (HbAS) with defects in physical growth and cognitive development • Other potential biasing factors: race, sex, birth date, birth weight, gestational age, 5-min Apgar score, socio economic status • Solution: matching – for each child with HbAS selected a child with HbAA who was similar with respect to the seven other factors (50+50=100) • Result: no difference in growth and development
  64. 64. Overmatching A situation that may arise when groups are being matched. Several varieties: 1. The matching procedure partially or completely obscures evidence of a true causal association b/w the independent and dependent variables. Overmatching may occur if the matching variable is involved in, or is closely connected with, the mechanism whereby the independent variable affects the dependent variable. The matching variable may be an intermediate cause in the causal chain, or it may be strongly affected by, or a consequence of, such an intermediate cause
  65. 65. 2. The matching procedure uses one or more unnecessary matching variables, e.g., variables that have no causal effect or influence on the dependent variable, and hence cannot confound the relationship b/w the independent and dependent variables. 3. The matching process is unduly elaborate, involving the use of numerous matching variables and/or insisting on a very close similarity with respect to specific matching variables. This leads to difficulty in finding suitable controls (Last)
  66. 66. Stratification • The process of or the result of separating a sample into several sub-samples according to specified criteria such as age groups, socio-economic status etc. (Last) • The effect of confounding variables may be controlled by stratifying the analysis of results • After data are collected, they can be analyzed and results presented according to subgroups of patients, or strata, of similar characteristics (Fletcher)
  67. 67. Example…Stratification (Fletcher)
HOSPITAL 'A'
Pre-op risk    Pts    Deaths    %
Total         1200      48      4
High           500      30      6
Medium         400      16      4
Low            300       2     0.67
HOSPITAL 'B'
Pre-op risk    Pts    Deaths    %
Total         2400      64     2.67
High           400      24      6
Medium         800      32      4
Low           1200       8     0.67
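The point of the table above is that the crude death rates differ between the two hospitals while the stratum-specific rates are identical, so the difference is entirely due to case mix. A minimal sketch of that stratified analysis (variable names are illustrative; figures are from the slide):

```python
# Stratified analysis of operative mortality, figures from the slide.
# (stratum: (patients, deaths)) for each hospital.
hospital_a = {"High": (500, 30), "Medium": (400, 16), "Low": (300, 2)}
hospital_b = {"High": (400, 24), "Medium": (800, 32), "Low": (1200, 8)}

def crude_rate(strata):
    """Overall deaths per 100 patients, ignoring strata."""
    pts = sum(n for n, _ in strata.values())
    deaths = sum(d for _, d in strata.values())
    return 100 * deaths / pts

def stratum_rates(strata):
    """Deaths per 100 patients within each pre-op risk stratum."""
    return {s: 100 * d / n for s, (n, d) in strata.items()}

print(crude_rate(hospital_a))    # 4.0
print(crude_rate(hospital_b))    # ~2.67
print(stratum_rates(hospital_a)) # identical to hospital_b, stratum by stratum
```

Crude rates say hospital B does better; the stratum-specific rates show the prognosis is the same in both.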
  68. 68. Example…Stratification
Age-wise stratification of death rates (per 1000):
                 Pinellas county              Dade county             Relative
                 Dead     Total     Rate     Dead     Total     Rate    rate
Overall          5726    374,665    15.3     8332    935,047     8.9     1.7
Birth – 54 yrs    737    229,198     3.2     2463    748,035     3.3     1.0
> 55 yrs         4989    145,147    34.4     5898    187,985    31.2     1.1
  69. 69. Standardization A set of techniques used to remove as far as possible the effects of differences in age or other confounding variables when comparing two or more populations The method uses weighted averaging of rates specific for age, sex, or some other potentially confounding variable(s), according to some specified distribution of these variables (Last)
  70. 70. Standard population A population in which the age and sex composition is known precisely, as a result of a census or by an arbitrary means – e.g. an imaginary population, the “standard million” in which the age and sex composition is arbitrary. A standard population is used as comparison group in the actuarial procedure of standardization of mortality rates. (e.g. Segi world population, European standard population) (Last)
  71. 71. Types of standardization Direct: the specific rates in a study population are averaged using as weights the distribution of a specified standard population. The standardized rate so obtained represents what the rate would have been in the study population if that population had the same distribution as the standard population w.r.t. the variables for which the adjustment or standardization was carried out.
  72. 72. Indirect: used to compare the study populations for which the specific rates are either statistically unstable or unknown. The specific rates are averaged using as weights the distribution of the study population. The ratio of the crude rate for the study population to the weighted average so obtained is known as standardized mortality (or morbidity) ratio, or SMR. (Last) [represents what the rate would have been in the study population if that population had the same specific rates as the standard population]
  73. 73. Standardized mortality ratio (SMR)
SMR = (No. of deaths observed in the study group or population ÷ No. of deaths expected if the study population had the same specific rates as the standard population) × 100
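The SMR formula above can be sketched as follows; the numbers used here are hypothetical, purely to illustrate the ratio:

```python
# SMR = observed deaths / expected deaths x 100, where "expected" means
# the deaths the study population would have had at the standard
# population's specific rates. Numbers below are hypothetical.

def smr(observed_deaths, expected_deaths):
    return 100 * observed_deaths / expected_deaths

# e.g. 90 deaths observed where 75 were expected -> SMR = 120,
# i.e. 20% more deaths than expected.
print(smr(90, 75))  # 120.0
```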
  74. 74. Example … direct standardization
Study population                        Standard population
Age      Pop      Deaths    Rate        Std. pop    Expected deaths
0         4,000      60     15.0          2,400        36
1-4       4,500      20      4.4          9,600        42.24
5-14      4,000      12      3.0         19,000        57
15-19     5,000      15      3.0          9,000        27
20-24     4,000      16      4.0          8,000        32
25-34     8,000      25      3.1         14,000        43.4
35-44     9,000      48      5.3         12,000        63.6
45-54     8,000     100     12.5         11,000       137.5
55-64     7,000     150     21.4          8,000       171.2
Total    53,500     446      8.3         93,000       609.94 (rate 6.56)
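The direct standardization in the worked example can be reproduced in a few lines: apply the study population's age-specific rates (per 1000) to the standard population's age distribution, then divide the total expected deaths by the standard population. Figures are taken from the slide:

```python
# Direct standardization: age group -> (study rate per 1000, standard pop).
data = {
    "0":     (15.0,  2400),
    "1-4":   ( 4.4,  9600),
    "5-14":  ( 3.0, 19000),
    "15-19": ( 3.0,  9000),
    "20-24": ( 4.0,  8000),
    "25-34": ( 3.1, 14000),
    "35-44": ( 5.3, 12000),
    "45-54": (12.5, 11000),
    "55-64": (21.4,  8000),
}

# Expected deaths in each age group = study rate x standard population.
expected = {age: rate * pop / 1000 for age, (rate, pop) in data.items()}
total_expected = sum(expected.values())              # 609.94
std_pop = sum(pop for _, pop in data.values())       # 93,000
standardized_rate = 1000 * total_expected / std_pop  # ~6.56 per 1000
print(total_expected, standardized_rate)
```

Note the crude rate is 8.3 per 1000, but the standardized rate is only about 6.56, because the standard population is younger than the study population.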
  75. 75. Example … direct standardization
HOSPITAL 'A'                           HOSPITAL 'Std'
Pre-op risk   Pts   Deaths    %        Pts    Rate (%)   Exp. deaths
High          500     30      6        400       6          24
Medium        400     16      4        400       4          16
Low           300      2     0.67      400       0.67        2.68
Total        1200     48      4       1200                  42.68 (3.6%)
  76. 76. Stratification vs. Standardization • Standardization removes the effect of the extraneous factor • Stratification controls for the effect of the factor, but the effect can still be seen • E.g. in the 'hospital example', with standardization we found that patients had a similar prognosis in both hospitals; with stratification we also learned the mortality rates among the different risk strata • Similar to the difference b/w age-standardized mortality rates and age-specific mortality rates
  77. 77. Multivariate adjustment • Simultaneously controlling the effects of many variables to determine the independent effects of one • Can select from a large no. of variables a smaller subset that independently and significantly contributes to the overall variation in outcome, and can arrange variables in order of the strength of their contribution • Only feasible way to deal with many variables at one time during the analysis phase
  78. 78. Examples… Multivariate adjustment • CHD is the joint result of lipid abnormalities, HT, smoking, family history, DM, exercise, personality type • Start with 2x2 tables using one variable at a time • Contingency tables, i.e. stratified analyses, examining the effect of one variable in the presence/absence of one or more other variables
  79. 79. Example…Multivariate adjustment • Multivariable modeling, i.e. developing a mathematical expression of the effects of many variables taken together • Basic structure of a multivariate model: Outcome variable = constant + (β1 × variable1) + (β2 × variable2) + … • β1, β2, … are coefficients determined from the data; variable1, variable2, … are the predictor variables that might be related to outcome
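The model structure above can be illustrated with a minimal ordinary-least-squares fit. This is a sketch only (pure Python, no statistics library; the data are synthetic and the variable names hypothetical): we generate outcomes from known coefficients and check that fitting recovers them.

```python
# Multivariable linear model: outcome = constant + b1*var1 + b2*var2,
# fitted by ordinary least squares via the normal equations.

def fit_ols(X, y):
    """Solve (X'X) b = X'y by Gaussian elimination with pivoting."""
    n, k = len(X), len(X[0])
    A = [[sum(X[r][i] * X[r][j] for r in range(n)) for j in range(k)]
         for i in range(k)]
    b = [sum(X[r][i] * y[r] for r in range(n)) for i in range(k)]
    for i in range(k):                       # forward elimination
        p = max(range(i, k), key=lambda r: abs(A[r][i]))
        A[i], A[p] = A[p], A[i]
        b[i], b[p] = b[p], b[i]
        for r in range(i + 1, k):
            f = A[r][i] / A[i][i]
            for c in range(i, k):
                A[r][c] -= f * A[i][c]
            b[r] -= f * b[i]
    coef = [0.0] * k
    for i in reversed(range(k)):             # back substitution
        coef[i] = (b[i] - sum(A[i][j] * coef[j]
                              for j in range(i + 1, k))) / A[i][i]
    return coef

# Synthetic, noise-free data: outcome = 2 + 3*var1 - 1*var2,
# so the fit should recover the coefficients 2, 3 and -1.
X = [[1.0, x1, x2] for x1 in range(5) for x2 in range(5)]  # leading 1 = constant
y = [2 + 3 * row[1] - 1 * row[2] for row in X]
const, b1, b2 = fit_ols(X, y)
print(round(const, 6), round(b1, 6), round(b2, 6))  # 2.0 3.0 -1.0
```

In practice one would use a regression routine from a statistics package and add noise terms; the point here is only the structure constant + β1·var1 + β2·var2.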
  80. 80. Sensitivity analysis • When data on important prognostic factors is not available, it is possible to estimate the potential effects on the study by assuming various degrees of mal-distribution of the factors b/w the groups being compared and seeing how that would affect the results • Best case / worst case analysis is a special type of sensitivity analysis – assuming the best and worst type of mal-distribution
  81. 81. Example… best/worst case analysis • Study: effect of gastro-gastrostomy on morbid obesity • Subjects: cohort of 123 morbidly obese patients who underwent gastro-gastrostomy, followed 19 to 47 months after surgery • Success: losing >30% of excess weight • Follow-up: 103 (84%) patients; 20 patients lost to follow-up
  82. 82. Example…. (contd.) • Success rate: 60/103 (58%) • Best case: all 20 lost to follow-up were "successes" Best success rate: (60+20)/123 (65%) • Worst case: all 20 lost to follow-up were "failures" Worst success rate: 60/123 (49%) • Result: true success rate b/w 49% and 65%; probably closer to 58%! (because pts. lost to follow-up are unlikely to be all successes or all failures)
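The best case / worst case arithmetic above is short enough to sketch directly (figures from the slide; variable names are illustrative):

```python
# Best/worst case bounds for the gastro-gastrostomy cohort:
# 60 successes among the 103 followed up; 20 lost; 123 enrolled.
followed, successes, lost = 103, 60, 20
enrolled = followed + lost                 # 123

observed = successes / followed            # ~58%, among those followed
best = (successes + lost) / enrolled       # ~65%, if all lost were successes
worst = successes / enrolled               # ~49%, if all lost were failures

print(f"{worst:.0%} <= true success rate <= {best:.0%} (observed {observed:.0%})")
```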
  83. 83. Randomization • The only way to equalize all extraneous factors, or 'everything else', is to assign patients to groups randomly so that each has an equal chance of falling into the exposed or unexposed group • Equalizes even those factors which we might not know about! • But it is not always possible
  84. 84. Overall strategy • Except for randomization, all ways of dealing with extraneous differences b/w groups are effective only against those factors singled out for consideration • Ordinarily one uses several methods layered one upon another
  85. 85. Example… • Study: effect of presence of VPCs on survival of patients after acute MI • Strategies: - Restriction: not too young/old; no unusual causes (e.g. mycotic aneurysm) of infarction - Matching: for age (an important prognostic factor, but not the factor under study) - Stratification: examine results for different strata of clinical severity - Multivariate analysis: adjust crude rates for the effects of all other variables, except VPCs, taken together
  86. 86. Dealing with measurement bias 1. Blinding - Subject - Observer / interviewer - Analyser 2. Strict definition / standard definition for exposure / disease / outcome 3. Equal efforts to discover events equally in all the groups
  87. 87. Controlling confounding • Similar to controlling for selection bias • Use randomization, restriction, matching, stratification, standardization, multivariate analysis etc.
  88. 88. Lead time bias • Lead time is the period of time b/w the detection of a medical condition by screening and when it would ordinarily be diagnosed because the pt. experiences symptoms and seeks medical care • As a result of screening, on average, pts will survive longer from the time of diagnosis than those diagnosed after symptoms appear, even if T/t is not effective • Not more 'survival time', but more 'disease time'
  89. 89. How lead time affects survival time [Figure: timelines from onset of disease to death for three scenarios – unscreened; screened with early T/t that is not effective (diagnosis moved earlier, death unchanged, so 'survival after diagnosis' lengthens); screened with early T/t that is effective (diagnosis earlier and death delayed)]
  90. 90. Controlling lead time bias • Compare age-specific mortality rates in a screened group and a control group, rather than survival times from the time of diagnosis • E.g. early diagnosis and T/t for colorectal cancer is effective because mortality rates of screened people are lower than those of a comparable group of unscreened people
  91. 91. Length time bias • Can affect studies of screening • Because the proportion of slow-growing tumors diagnosed during screening programs is greater than the proportion diagnosed during usual medical care • Because slow-growing tumors are present for a longer period before they cause symptoms; fast-growing tumors are likely to cause symptoms leading to interval diagnosis • Screening tends to find tumors with inherently better prognoses
  92. 92. Compliance bias • Compliant patients tend to have better prognoses regardless of the screening • If a study compares disease outcomes among volunteers for a screening program with outcomes in a group of people who did not volunteer, better results for the volunteers might not be due to T/t but due to factors related to compliance • Compliance bias and length-time bias can both be avoided by relying on RCTs
  93. 93. Types of studies & related biases
• Prevalence study: uncertainty about temporal sequences; bias from studying 'old'/prevalent cases
• Case control: selection bias in selecting cases/controls; measurement bias
• Cohort study: susceptibility bias; survival cohort vs. true cohort; migration bias
• Randomized control trials: consider natural h/o disease, Hawthorne effect, placebo effect etc.; compliance problems; effect of co-interventions
  94. 94. Random error • Divergence, on the basis of chance alone, of an observation on a sample from the true population value • 'Random' because on average it is as likely to result in observed values falling on one side of the true value as on the other • Inherent in all observations • Can be minimized, but never avoided altogether
  95. 95. Sources of random error 1. Individual biological variation 2. Measurement error 3. Sampling error ( the part of the total estimation of error of a parameter caused by the random nature of the sample)
  96. 96. Sampling variation Because research must ordinarily be conducted on a sample of patients, and not on all patients with the condition under study, there is always a possibility that the particular sample of patients in a study, even though selected in an unbiased way, might not be similar to the population of patients as a whole
  97. 97. Sampling variation - definition Since inclusion of individuals in a sample is determined by chance, the results of analysis on two or more samples will differ purely by chance. (Last)
  98. 98. Assessing the role of chance 1. Hypothesis testing 2. Estimation
  99. 99. Hypothesis testing • Start off with the null hypothesis (H0): the statistical hypothesis that one variable has no association with another variable or set of variables, or that two or more population distributions do not differ from one another • In simpler terms, the null hypothesis states that the results observed in a study, experiment or test are no different from what might have occurred as a result of the operation of chance alone
  100. 100. Statistical tests – errors (Fletcher)
Conclusion of statistical test    True difference present (H0 false)    True difference absent (H0 true)
Significant (H0 rejected)         Correct (power)                       Type I (α) error
Not significant (H0 accepted)     Type II (β) error                     Correct
  101. 101. Statistical tests - errors • Type I (α) error: the error of rejecting a true null hypothesis, i.e. declaring that a difference exists when it does not • Type II (β) error: the error of failing to reject a false null hypothesis, i.e. declaring that a difference does not exist when in fact it does • Power of a study: the ability of a study to demonstrate an association if one exists; Power = 1 − β
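Power = 1 − β can be made concrete with a standard normal-approximation calculation for comparing two proportions. This is an illustrative sketch (function name and sample sizes are hypothetical; the 7.2% vs. 6.3% rates are the GUSTO figures used later in these slides):

```python
# Approximate power of a two-sample comparison of proportions,
# using the normal approximation and a two-sided 5% significance level.
from math import erf, sqrt

def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + erf(z / sqrt(2)))

def power_two_proportions(p1, p2, n_per_group, z_alpha=1.96):
    """Power = 1 - beta for detecting a true difference p1 vs p2."""
    se = sqrt(p1 * (1 - p1) / n_per_group + p2 * (1 - p2) / n_per_group)
    z = abs(p1 - p2) / se
    return norm_cdf(z - z_alpha)

# A small absolute difference (7.2% vs 6.3%) gives low power with
# 1,000 patients per group, but high power with 20,000 per group.
print(power_two_proportions(0.072, 0.063, 1000))
print(power_two_proportions(0.072, 0.063, 20000))
```

This is why a very large trial can detect a difference that a modest trial would miss (a Type II error).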
  102. 102. p - value • Probability of an α error • A quantitative estimate of the probability that the observed difference b/w the groups in the study could have happened by chance alone, assuming that there is no real difference b/w the groups OR • If there were no difference b/w the groups and the trial were repeated many times, the proportion of trials that would show a difference b/w the groups the same as, or bigger than, that found in the study
  103. 103. p – value – Remember!! • Usually p < 0.05 is considered statistically significant (i.e. less than a 1 in 20 probability that the observed difference is due to chance alone) • 0.05 is an arbitrary cut-off; it can change according to requirements • A statistically significant result might not be clinically significant, and vice-versa
  104. 104. Statistical significance vs. clinical significance • Large RCT called GUSTO (41,021 pts of acute MI) • Study: Streptokinase vs. tPA • Result: death rate at 30 days – streptokinase 7.2% vs. tPA 6.3% (p < 0.001) • But, need to treat ~100 patients with tPA instead of streptokinase to prevent 1 death! • tPA costly – $250 thousand to prevent one death ??? Clinically significant
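The "treat ~100 patients to prevent 1 death" figure comes from the number needed to treat, NNT = 1 / absolute risk reduction. A quick check with the slide's rates (the exact NNT works out to about 111, consistent with the slide's rounded "100"):

```python
# NNT arithmetic behind the GUSTO example on the slide.
death_sk, death_tpa = 0.072, 0.063   # 30-day death rates
arr = death_sk - death_tpa           # absolute risk reduction = 0.9%
nnt = 1 / arr                        # ~111 patients per death prevented
print(round(arr * 100, 1), round(nnt))
```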
  105. 105. Estimation • The effect size observed in a particular study is called the 'point estimate' • The true effect is unlikely to be exactly that observed in the study, because of random variation • Confidence interval (CI): usually 95% (Last) – a computed interval with a given probability, e.g. 95%, that the true value (such as a mean, proportion, or rate) is contained within the interval
  106. 106. Confidence intervals (Fletcher) If the study is unbiased, there is a 95% chance that the interval includes the true effect size. The true value is likely to be close to the point estimate, less likely to be near the outer limits of the interval, and could (5 times out of 100) fall outside these limits altogether. The CI allows the reader to see the range of plausible values and so to decide whether an effect size they regard as clinically meaningful is consistent with, or ruled out by, the data
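As a sketch, a 95% CI for a proportion can be computed with the usual normal approximation, p ± 1.96·√(p(1−p)/n). Applying it (illustratively) to the 58% success rate (60/103) from the earlier follow-up example:

```python
# Normal-approximation 95% CI for a proportion (a sketch; small samples
# or extreme proportions need an exact or Wilson interval instead).
from math import sqrt

def ci_proportion(successes, n, z=1.96):
    p = successes / n
    se = sqrt(p * (1 - p) / n)      # standard error of the proportion
    return p - z * se, p + z * se

lo, hi = ci_proportion(60, 103)
print(f"point estimate 58%, 95% CI {lo:.0%} to {hi:.0%}")
```

The interval (roughly 49% to 68%) is the "range of plausible values" the slide describes.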
  107. 107. Multiple comparison problem • If a no. of comparisons are made, (e.g. in a large study, the effect of treatment assessed separately for each subgroup, and for each outcome), 1 in 20 of these comparisons is likely to be statistically significant at the 0.05 level
  108. 108. “If you dredge the data sufficiently deeply, and sufficiently often, you will find something odd. Many of these bizarre findings will be due to chance…….discoveries that were not initially postulated among the major objectives of the trial should be treated with extreme caution.”
  109. 109. Dealing with random error • Increasing the sample size: sample size depends upon - level of statistical significance (α error) - Acceptable chance of missing a real effect (β error) - Magnitude of effect under investigation - Amount of disease in population - Relative sizes of groups being compared • Sample size is usually a compromise b/w ideal and logistic and financial considerations
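The determinants of sample size listed above (α, β, magnitude of effect, event rates) combine in the standard formula for comparing two proportions, n per group ≈ (z_α + z_β)²·[p1(1−p1) + p2(1−p2)] / (p1 − p2)². A sketch (function name is hypothetical; z values correspond to two-sided α = 0.05 and 80% power):

```python
# Sample size per group for comparing two proportions
# (normal approximation; z_alpha = 1.96 for two-sided 5% level,
# z_beta = 0.84 for 80% power).
from math import ceil

def n_per_group(p1, p2, z_alpha=1.96, z_beta=0.84):
    num = (z_alpha + z_beta) ** 2 * (p1 * (1 - p1) + p2 * (1 - p2))
    return ceil(num / (p1 - p2) ** 2)

# A small absolute difference (7.2% vs 6.3%) needs many thousands per
# group; a large one (20% vs 10%) needs only a couple of hundred.
print(n_per_group(0.072, 0.063))
print(n_per_group(0.20, 0.10))
```

This shows why sample size is "a compromise b/w ideal and logistic and financial considerations": halving the detectable difference roughly quadruples the required n.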
  110. 110. References 1. Fletcher RH et al. Clinical Epidemiology: The Essentials – 3rd ed. 2. Beaglehole R et al. Basic Epidemiology, WHO 3. Last JM. A Dictionary of Epidemiology – 3rd ed. 4. Maxcy-Rosenau-Last. Public Health & Preventive Medicine – 14th ed. 5. Norell SE. Workbook of Epidemiology 6. Park K. Park's Textbook of Preventive and Social Medicine – 16th ed.