SlideShare a Scribd company logo
1 of 36
Download to read offline
Causal inference is not statistical inference
Jon Williamson
Philosophy Department and Centre for Reasoning
University of Kent
8 December 2022
The statistics wars and their casualties
1 Introduction
Causal enquiry. Establishing that A is a cause of B, or assessing whether A is a cause of B.
Key techniques for causal enquiry are couched as statistical. E.g.,
Analysis of RCTs
Analyses of cohort studies and other observational studies
Meta-analysis
Learning a structural equation model
Learning a graphical causal model.
Two theses:
Weak. Statistical techniques are useful for causal enquiry.
Strong. Causal enquiry is a purely statistical problem.
There seems to be a trend from the weak thesis to the strong.
R.A. Fisher championed the use of both randomised (Fisher, 1935) and observational stud-
ies for causal enquiry.
In Statistical methods for research workers:
from the modern point of view, the study of the causes of variation of any variable
phenomenon, from the yield of wheat to the intellect of man, should be begun by
the examination and measurement of the variation which presents itself. (Fisher,
1925, p. 3.)
This is compatible with the weak thesis.
Austin Bradford Hill can be viewed as another advocate of the weak thesis.
Over a fairly wide expanse of clinical medicine in Great Britain the statistical ap-
proach has been accepted as useful (Hill, 1952, p. 113).
The statistically guided therapeutic trial is not the only means of investigation and
experiment, nor indeed is it invariably the best way of advancing knowledge of
therapeutics. I commend it to you as one way, and, I believe, a useful way (Hill,
1952, p. 119).
Hill (1965) stressed the gap between association and causation.
His 9 ‘viewpoints’ or indicators of causation help to bridge the gap.
Several of these indicators are not statistical indicators.
These days, however, the strong thesis predominates
EBM and EBP focus almost exclusively on the analysis of RCTs.
Quantitative statistical methods now dominate causal enquiry in the social sciences.
The graphical causal modelling approach conceives of causal relationships in terms of sta-
tistical models with Markovian properties (Lauritzen, 1996).
Recent developments in statistical methods are invoked to motivate the strong thesis:
When attempting to make sense of data, statisticians are invariably motivated by
causal questions. . . . The peculiar nature of these questions is that they cannot
be answered, or even articulated, in the traditional language of statistics. In fact,
only recently has science acquired a mathematical language we can use to express
such questions, with accompanying tools to allow us to answer them from data.
The development of these tools has spawned a revolution in the way causality is
treated in statistics and in many of its satellite disciplines. (Pearl et al., 2016, p. xi.)
There’s also a trend towards conflating causation with correlation conditional on potential
confounders.
EG RCTs are touted as providing unbiased estimates of ‘causal effects’.
I’ll suggest:
▶ The replication crisis results from the move from the weak to the strong thesis.
▶ Evidential Pluralism can help to mitigate the crisis.
Contents
1 Introduction
2 Evidential Pluralism
3 Evidential Pluralism in medicine
4 The replication crisis
5 Examples
Bibliography
2 Evidential Pluralism
Correlation is not Causation. A correlation between A and B might be due to:
Causation A is a cause of B.
Reverse causation B is a cause of A.
Confounding (selection bias) There is some confounder C that has not been adequately
controlled for by the study.
Performance bias Those in the A-group are identified and treated differently to
those in the ¬A-group.
Detection bias B is measured differently in the A-group in comparison to the
¬A-group.
Chance Sheer coincidence, attributable to too small a sample.
Fishing Measuring so many outcomes that there is likely to be a
chance correlation between A and some such B.
Temporal trends A and B both increase over time for independent reasons.
E.g., prevalence of coeliac disease & spread of HIV.
Semantic relationships Overlapping meaning. E.g., phthiasis, consumption, scrofula
(all of which are TB).
Constitutive relationships One variable is a component of the other.
Mereological relationships One variable is a part of the other.
Logical relationships Measurable variables A and B are logically complex and log-
ically overlapping. E.g., A is C ∧ D and B is D ∨ E.
Nomological relationships E.g., conservation of total energy can induce a correlation
between two energy measurements.
Mathematical relationships E.g., mean and variance variables from the same distribution
can be correlated.
If A is a cause of B, then there is some complex of mechanisms that:
Explains instances of B by invoking instances of A, and
Can account for the magnitude of the observed correlation.
∴ In order to establish causation one needs to establish both:
The existence of an appropriate correlation.
The existence of an appropriate mechanism that can explain that correlation.
Russo, F. and Williamson, J. (2007). Interpreting causality in the health sciences. International
Studies in the Philosophy of Science, 21(2):157–170.
This observation motivates Evidential Pluralism, a theory of causal enquiry:
A is a cause of B
There is a mechanism
linking A to B
A is correlated with B
Specific
mechanism hypotheses
Mechanistic studies
(evidence of features
of the mechanism)
Association studies
(measure A and B
together)
μ2
μ1
α2
μ3
α1
Evidential Pluralism = object pluralism + study pluralism
Association and Mechanistic studies reinforce one another
Association and mechanistic studies have complementary strengths:
Association studies can be unreliable indicators of causality because of biases etc.
Mechanistic studies help to isolate the correlations that are genuinely causal.
Mechanistic studies can suffer from the problems of masking and complexity.
Association studies help to isolate the mechanisms that make a net difference.
∴ Association studies and mechanistic studies reinforce each other.
Their combined evidential value is more than the sum of the parts.
Berthold Lubetkin’s 1934 London Zoo Penguin Pool, an early example of reinforced concrete.
3 Evidential Pluralism in medicine
Present-day EBM focusses on the α-channels and takes mechanistic studies to be strictly
inferior to association (clinical/epidemiological) studies.
Evidence-based medicine de-emphasizes intuition, unsystematic clinical experi-
ence, and pathophysiologic rationale as sufficient grounds for clinical decision
making and stresses the examination of evidence from clinical research. (Guy-
att et al., 1992, p. 2420)
Evidential Pluralism takes the μ-channels to be important too.
A is a cause of B
There is a mechanism
linking A to B
A is correlated with B
Specific
mechanism hypotheses
Mechanistic studies
(evidence of features
of the mechanism)
Association studies
(measure A and B
together)
μ2
μ1
α2
μ3
α1
In medicine, this leads to an approach called EBM+.
EBM+ systematically scrutinises mechanistic studies alongside association studies.
EBM+ handbook (open access):
EP also provides an account of external validity:
A is a cause of B
There is a mechanism
linking A to B
A is correlated with B
Specific
mechanism hypotheses
Mechanistic
studies
Association
studies
Source
studies
Similarity of
mechanisms
Williamson, J. (2019a). Establishing causal claims in medicine. International Studies in the
Philosophy of Science, 32(2):33–61.
Mechanistic evidence is routinely assessed in this way in exposure assessment:
EG IARC integrates human, animal and mechanistic studies.
The agent is
carcinogenic in humans
There is a mech. from the
agent to cancer in humans
The agent is correlated
with cancer in humans
Key characteristics of cancer
& other mech. hypotheses
Mechanistic
studies
Human
studies
Animal
studies
Similarity of
mechanisms
Exposure
characterisation
Williamson, J. (2019b). Evidential Proximity, Independence, and the evaluation of carcino-
genicity. Journal of Evaluation in Clinical Practice, 25(6):955–961.
Perhaps EBM+ would have served us better than EBM during the pandemic.
Evidential Pluralism accords with the weak thesis, not the strong thesis.
A is a cause of B
There is a mechanism
linking A to B
A is correlated with B
Specific
mechanism hypotheses
Mechanistic studies
(evidence of features
of the mechanism)
Association studies
(measure A and B
together)
μ2
μ1
α2
μ3
α1
Causal inference requires a combination of statistical and mechanistic inference.
4 The replication crisis
When does a lack of replicability lead to a crisis?
That a study fails to replicate an observed association is a common occurrence.
This is normal science.
It is not of any concern in itself.
It is much more serious when established causal relationships are overturned.
An established proposition can be used as evidence for further claims.
▶ A lack of stability in the evidence base.
▶ The need to re-evaluate claims made on the basis of revoked propositions.
A science makes progress to the extent that it establishes key propositions.
Causal propositions are key to most empirical sciences.
▶ Scepticism about the ability of a science to progress.
∴ A crisis arises when established causal relationships are often overturned.
When causal relationships are established on the basis of studies that fail to replicate.
This problem arises more regularly when:
1. We establish causation purely on the basis of association studies;
2. We make use of few association studies;
3. The association studies are of lower quality;
4. Significance levels are too high;
5. There is publication bias and other kinds of bias;
. . .
A lot of the action has centred around (4) and (5).
I would urge paying more attention to (1).
Evidential Pluralism helps to avoid (1):
Causal relationships are established by establishing mechanism as well as correlation.
We should scrutinise mechanistic studies alongside association studies.
A broader evidence base is more robust:
Biases are more easily eliminated.
It’s less likely that an established claim hinges on a study that turns out not to be
replicable.
A broad evidence base is arguably more important than successful replication.
Successful replication of a study may inspire false confidence in a causal claim.
It may be that the biases of the study are replicated alongside the association.
Hence there is a need for triangulation (see, e.g., Munafò and Davey Smith, 2018).
Evidential Pluralism offers a well-motivated route to triangulation.
5 Examples
Evidential Pluralism can indeed lead to caution when establishing causal claims.
5.1 Streptomycin and TB
(Gillies, 2019, Chapter 9.)
Streptomycin was discovered in 1944 by Schatz, Bugie and Waksman.
It inhibited Mycobacterium tuberculosis in vitro.
It successfully treated TB in guinea pigs, and was observed to help human patients.
In 1947-8 Austin Bradford Hill carried out an RCT on streptomycin to treat TB.
Treatment was given for 4 months and patients were observed for a further 2 months.
It found a large correlation: 51% of treated patients considerably improved, as opposed
to 8% of patients in the control group.
However, Hill did not take causation to be established, for mechanistic reasons:
There was evidence of streptomycin-resistant bacteria strains in several patients.
It is plausible that the patients might deteriorate after the 6-month trial end-point.
(A subsequent assessment of the patients confirmed this hypothesis: after 5 years
there was no significant improvement in the treatment group.)
This led Hill to carry out another trial:
A combination of streptomycin and para-amino-salicylic acid.
This largely avoided antibiotic resistance.
An EBM approach would have treated the first trial as much more conclusive.
This was an RCT (the first published medical RCT).
It observed a substantial correlation (’effect size’).
This study was fully replicable.
Hill’s more cautious approach is validated by EBM+ / Evidential Pluralism.
5.2 Cold fusion
(Norton, 2021, Chapter 3.)
In 1989 Martin Fleischmann and B. Stanley Pons announced that they had carried out fusion
on a lab bench at ordinary temperatures.
They electrolysed heavy water (deuterium oxide) with palladium electrodes.
The cathode became saturated with deuterium.
Large quantities of heat were produced.
They argued that the deuterium atoms were brought close enough to ignite a nuclear
fusion reaction.
IE That fusion caused the large quantities of heat.
Steven Jones then announced that he had similar results.
However, the claim that these results were attributable to cold fusion was not established.
The US Energy Research Advisory Board argued in 1989 that:
Mechanistic evidence undermined a link between the heat and a nuclear process:
There were insufficient levels of fusion products (e.g., neutrons, tritium and
isotopes of helium) for fusion to account for the heat.
Confirmed theory implied that electrostatic repulsion prevents free deuterium
nuclei from approaching closer than about 0.1 nanometres, which is too far to
initiate fusion.
The closest approach of deuterium atoms in palladium was found to be 0.17nm.
In contrast, in molecular deuterium, two deuterium atoms are 0.074nm apart.
Further attempts to replicate the experiment led to mixed results.
Replication attempts continued to lead to mixed results.
Similar effects have been observed many times, but the effects weren’t routinely achieved.
A certain amount of luck is required to find these effects.
Edmund Sturms and others claimed that the successful replications demonstrate cold fusion.
The community view remains that the observed effects have not been established to be
caused by cold fusion.
This is validated by Evidential Pluralism.
5.3 The discovery of Helicobacter pylori
(Norton, 2021, Chapter 3.)
In the early 1980s, Barry Marshall and Robin Warren found an association between the pres-
ence of Helicobacter pylori and gastritis and ulcers.
They submitted their paper to the Lancet in 1984 but reviewers didn’t believe their findings.
There was evidence that bacteria could not survive in such an acidic environment.
This negative mechanistic evidence undermined the association study.
∴ Their hypothesis was initially not considered established.
But they contacted Skirrow who managed to replicate their findings.
Only then did the Lancet publish.
By the late 1980s, there were:
Many more association studies that replicated the findings,
Mechanistic studies that found that H. pylori can survive in an acidic environment.
At this stage the hypothesis was generally regarded as established.
In the 1990s H. pylori was found to require an acidic environment (Clyne et al., 1995), and
the mechanisms behind its survival are now better understood (Ansari and Yamaoka, 2017).
In 2005 Marshall and Warren won the Nobel prize for medicine for their work.
5.4 The Miller Experiment
(Norton, 2021, Chapter 3.)
The Michelson-Morley experiment of 1887 failed to detect an ether wind.
NB Ether was the supposed medium that carries light, electric fields and magnetic fields.
It should flow past the earth as it moves, if it exists. This is the ether wind.
In 1925, Dayton C. Miller reported that an experiment found the ether wind.
This represented a failure of replication.
NB Miller was president of the American Physical Society.
His experimental apparatus was one of the most sensitive.
Einstein attributed Miller’s result to experimental error related to temperature.
Miller’s assumption that the velocity of light is dependent on height above sea level is
theoretically implausible.
Another similar replication, the Trouton-Noble experiment, found no evidence of an
ether wind.
In addition, Miller measured a flow of 10km/sec, but one would expect the motion of the
earth would yield an ether wind flow of about 30km/sec.
Evidential Pluralism would condone scepticism here.
In 1955, Shankland et al. reanalysed Miller’s results and found that they were associated
with temperature variations in the apparatus.
This analysis confirmed Einstein’s doubts.
5.5 Intercessionary prayer
(Norton, 2021, Chapter 3.)
There have been several studies of the effectiveness of intercessionary prayer.
EG Leibovici (2001) conducted an RCT that found an association between remote, retroac-
tive intercessionary prayer (in 2000) and length of stay of patients with blood infections
in hospital (in 1990-6).
Byrd (1988) and Harris et al. (1999) also found positive associations between prayer
and recovery of cardiac patients.
In general, however, association studies have yielded mixed results.
On the other hand, a systematic review of 10 studies found ‘Specific complications
(cardiac arrest, major surgery before discharge, need for a monitoring catheter in the
heart) were significantly more likely to occur among those in the group not receiving
prayer’ (Roberts et al., 2009, p. 2).
Again, this is a case where well-confirmed scientific theory provides plenty of evidence
against a mechanism.
Particularly in the case of retroactive prayer.
∴ Evidential Pluralism would not take the effectiveness of prayer to be established.
NB This accords well with Leibovici’s own conclusions.
Summary
It used to be a platitude that causation is not correlation.
This is apparently no longer the case, because:
The correlations and statistical techniques are getting more sophisticated.
There are more accounts that conflate causality with some kind of correlation.
This trend is pernicious:
It leads to a simplistic account of causal enquiry.
It is turning the normal phenomenon of replication failure into a crisis.
Evidential Pluralism provides a more comprehensive account of causal enquiry.
A broader evidence base is more robust in the face of potential replication failures.
Bibliography
Ansari, S. and Yamaoka, Y
. (2017). Survival of Helicobacter pylori in gastric acidic territory.
Helicobacter, 22(e12386).
Clyne, M., Labigne, A., and Drumm, B. (1995). Helicobacter pylori requires an acidic envi-
ronment to survive in the presence of urea. Infection and Immunity, 63(5):1669–1673.
Fisher, R. (1935). The design of experiments. Oliver and Boyd, Edinburgh.
Fisher, R. A. (1925). Statistical methods for research workers. Hafner, New York, thirteenth
(1963) edition.
Gillies, D. (2019). Should we distrust medical interventions? Metascience, 28(2):273–276.
Guyatt, G., Cairns, J., Churchill, D., Cook, D., Haynes, B., Hirsh, J., Irvine, J., Levine, M.,
Levine, M., Nishikawa, J., Sackett, D., Brill-Edwards, P., Gerstein, H., Gibson, J., Jaeschke,
R., Kerigan, A., Neville, A., Panju, A., Detsky, A., Enkin, M., Frid, P., Gerrity, M., Laupacis,
A., Lawrence, V., Menard, J., Moyer, V., Mulrow, C., Links, P., Oxman, A., Sinclair, J., and
Tugwell, P. (1992). Evidence-based medicine: A new approach to teaching the practice of
medicine. JAMA, 268(17):2420–2425.
Hill, A. B. (1952). The clinical trial. New England Journal of Medicine, 247(4):113–119.
Hill, A. B. (1965). The environment and disease: association or causation? Proceedings of
the Royal Society of Medicine, 58:295–300.
Lauritzen, S. L. (1996). Graphical models. Clarendon Press, Oxford.
Leibovici, L. (2001). Effects of remote, retroactive intercessory prayer on outcomes in pa-
tients with bloodstream infection: randomised controlled trial. British Medical Journal,
323:1450–1451.
Munafò, M. R. and Davey Smith, G. (2018). Robust research needs many lines of evidence.
Nature, 553(7689):399–401.
Norton, J. D. (2021). The Material Theory of Induction. University of Calgary Press, Calgary.
Pearl, J., Glymour, M., and Jewell, N. P. (2016). Causal Inference in Statistics: A Primer. Wiley,
Chichester.
Roberts, L., Ahmed, I., Hall, S., and Davison, A. (2009). Intercessory prayer for the alleviation
of ill health. The Cochrane database of systematic reviews, 2009(2):CD000368.
Russo, F. and Williamson, J. (2007). Interpreting causality in the health sciences. International
Studies in the Philosophy of Science, 21(2):157–170.
Williamson, J. (2019a). Establishing causal claims in medicine. International Studies in the
Philosophy of Science, 32(2):33–61.
Williamson, J. (2019b). Evidential Proximity, Independence, and the evaluation of carcino-
genicity. Journal of Evaluation in Clinical Practice, 25(6):955–961.

More Related Content

Similar to Causal inference is not statistical inference

Clinical Study Design and Methods Terminology
Clinical Study Design and Methods TerminologyClinical Study Design and Methods Terminology
Clinical Study Design and Methods TerminologyMadhukarSureshThagna
 
weon preconference 2013 vandenbroucke counterfactual theory causality epidem...
 weon preconference 2013 vandenbroucke counterfactual theory causality epidem... weon preconference 2013 vandenbroucke counterfactual theory causality epidem...
weon preconference 2013 vandenbroucke counterfactual theory causality epidem...Bsie
 

Similar to Causal inference is not statistical inference (20)

Scientific problems and philosophical questions about causality. Why we need ...
Scientific problems and philosophical questions about causality. Why we need ...Scientific problems and philosophical questions about causality. Why we need ...
Scientific problems and philosophical questions about causality. Why we need ...
 
Clinical Study Design and Methods Terminology
Clinical Study Design and Methods TerminologyClinical Study Design and Methods Terminology
Clinical Study Design and Methods Terminology
 
STDEV . I3.pdf
STDEV . I3.pdfSTDEV . I3.pdf
STDEV . I3.pdf
 
Russo Madrid Medicine Oct07
Russo Madrid Medicine Oct07Russo Madrid Medicine Oct07
Russo Madrid Medicine Oct07
 
Meta analisis & criminology
Meta analisis & criminologyMeta analisis & criminology
Meta analisis & criminology
 
Belgian Society May05
Belgian Society May05Belgian Society May05
Belgian Society May05
 
Belgian Society May05
Belgian Society May05Belgian Society May05
Belgian Society May05
 
Russo Variation Epidemiology
Russo Variation EpidemiologyRusso Variation Epidemiology
Russo Variation Epidemiology
 
Evidence in the social sciences - Series of lectures on causal modelling in t...
Evidence in the social sciences - Series of lectures on causal modelling in t...Evidence in the social sciences - Series of lectures on causal modelling in t...
Evidence in the social sciences - Series of lectures on causal modelling in t...
 
Causality Triangle Presentation
Causality Triangle PresentationCausality Triangle Presentation
Causality Triangle Presentation
 
Causal models and evidential pluralism
Causal models and evidential pluralismCausal models and evidential pluralism
Causal models and evidential pluralism
 
Venezia management
Venezia managementVenezia management
Venezia management
 
Statistics basics
Statistics basicsStatistics basics
Statistics basics
 
The mosaic of causal theory
The mosaic of causal theoryThe mosaic of causal theory
The mosaic of causal theory
 
Russo Whatif Presentation
Russo Whatif PresentationRusso Whatif Presentation
Russo Whatif Presentation
 
Causal pluralism and public health
Causal pluralism and public healthCausal pluralism and public health
Causal pluralism and public health
 
weon preconference 2013 vandenbroucke counterfactual theory causality epidem...
 weon preconference 2013 vandenbroucke counterfactual theory causality epidem... weon preconference 2013 vandenbroucke counterfactual theory causality epidem...
weon preconference 2013 vandenbroucke counterfactual theory causality epidem...
 
Causal pluralism and medical diagnosis
Causal pluralism and medical diagnosisCausal pluralism and medical diagnosis
Causal pluralism and medical diagnosis
 
Empirical Generalisations Kent Nov07
Empirical Generalisations Kent Nov07Empirical Generalisations Kent Nov07
Empirical Generalisations Kent Nov07
 
Dr. Obumneke Amadi-Onuoha Scripts- 7_project_rsch
Dr.  Obumneke Amadi-Onuoha Scripts- 7_project_rschDr.  Obumneke Amadi-Onuoha Scripts- 7_project_rsch
Dr. Obumneke Amadi-Onuoha Scripts- 7_project_rsch
 

More from jemille6

“The importance of philosophy of science for statistical science and vice versa”
“The importance of philosophy of science for statistical science and vice versa”“The importance of philosophy of science for statistical science and vice versa”
“The importance of philosophy of science for statistical science and vice versa”jemille6
 
Statistical Inference as Severe Testing: Beyond Performance and Probabilism
Statistical Inference as Severe Testing: Beyond Performance and ProbabilismStatistical Inference as Severe Testing: Beyond Performance and Probabilism
Statistical Inference as Severe Testing: Beyond Performance and Probabilismjemille6
 
D. Mayo JSM slides v2.pdf
D. Mayo JSM slides v2.pdfD. Mayo JSM slides v2.pdf
D. Mayo JSM slides v2.pdfjemille6
 
reid-postJSM-DRC.pdf
reid-postJSM-DRC.pdfreid-postJSM-DRC.pdf
reid-postJSM-DRC.pdfjemille6
 
Errors of the Error Gatekeepers: The case of Statistical Significance 2016-2022
Errors of the Error Gatekeepers: The case of Statistical Significance 2016-2022Errors of the Error Gatekeepers: The case of Statistical Significance 2016-2022
Errors of the Error Gatekeepers: The case of Statistical Significance 2016-2022jemille6
 
What are questionable research practices?
What are questionable research practices?What are questionable research practices?
What are questionable research practices?jemille6
 
What's the question?
What's the question? What's the question?
What's the question? jemille6
 
The neglected importance of complexity in statistics and Metascience
The neglected importance of complexity in statistics and MetascienceThe neglected importance of complexity in statistics and Metascience
The neglected importance of complexity in statistics and Metasciencejemille6
 
Mathematically Elegant Answers to Research Questions No One is Asking (meta-a...
Mathematically Elegant Answers to Research Questions No One is Asking (meta-a...Mathematically Elegant Answers to Research Questions No One is Asking (meta-a...
Mathematically Elegant Answers to Research Questions No One is Asking (meta-a...jemille6
 
On Severity, the Weight of Evidence, and the Relationship Between the Two
On Severity, the Weight of Evidence, and the Relationship Between the TwoOn Severity, the Weight of Evidence, and the Relationship Between the Two
On Severity, the Weight of Evidence, and the Relationship Between the Twojemille6
 
Revisiting the Two Cultures in Statistical Modeling and Inference as they rel...
Revisiting the Two Cultures in Statistical Modeling and Inference as they rel...Revisiting the Two Cultures in Statistical Modeling and Inference as they rel...
Revisiting the Two Cultures in Statistical Modeling and Inference as they rel...jemille6
 
Comparing Frequentists and Bayesian Control of Multiple Testing
Comparing Frequentists and Bayesian Control of Multiple TestingComparing Frequentists and Bayesian Control of Multiple Testing
Comparing Frequentists and Bayesian Control of Multiple Testingjemille6
 
Good Data Dredging
Good Data DredgingGood Data Dredging
Good Data Dredgingjemille6
 
The Duality of Parameters and the Duality of Probability
The Duality of Parameters and the Duality of ProbabilityThe Duality of Parameters and the Duality of Probability
The Duality of Parameters and the Duality of Probabilityjemille6
 
Error Control and Severity
Error Control and SeverityError Control and Severity
Error Control and Severityjemille6
 
The Statistics Wars and Their Causalities (refs)
The Statistics Wars and Their Causalities (refs)The Statistics Wars and Their Causalities (refs)
The Statistics Wars and Their Causalities (refs)jemille6
 
The Statistics Wars and Their Casualties (w/refs)
The Statistics Wars and Their Casualties (w/refs)The Statistics Wars and Their Casualties (w/refs)
The Statistics Wars and Their Casualties (w/refs)jemille6
 
On the interpretation of the mathematical characteristics of statistical test...
On the interpretation of the mathematical characteristics of statistical test...On the interpretation of the mathematical characteristics of statistical test...
On the interpretation of the mathematical characteristics of statistical test...jemille6
 
The role of background assumptions in severity appraisal (
The role of background assumptions in severity appraisal (The role of background assumptions in severity appraisal (
The role of background assumptions in severity appraisal (jemille6
 
The two statistical cornerstones of replicability: addressing selective infer...
The two statistical cornerstones of replicability: addressing selective infer...The two statistical cornerstones of replicability: addressing selective infer...
The two statistical cornerstones of replicability: addressing selective infer...jemille6
 

More from jemille6 (20)

“The importance of philosophy of science for statistical science and vice versa”
“The importance of philosophy of science for statistical science and vice versa”“The importance of philosophy of science for statistical science and vice versa”
“The importance of philosophy of science for statistical science and vice versa”
 
Statistical Inference as Severe Testing: Beyond Performance and Probabilism
Statistical Inference as Severe Testing: Beyond Performance and ProbabilismStatistical Inference as Severe Testing: Beyond Performance and Probabilism
Statistical Inference as Severe Testing: Beyond Performance and Probabilism
 
D. Mayo JSM slides v2.pdf
D. Mayo JSM slides v2.pdfD. Mayo JSM slides v2.pdf
D. Mayo JSM slides v2.pdf
 
reid-postJSM-DRC.pdf
reid-postJSM-DRC.pdfreid-postJSM-DRC.pdf
reid-postJSM-DRC.pdf
 
Errors of the Error Gatekeepers: The case of Statistical Significance 2016-2022
Errors of the Error Gatekeepers: The case of Statistical Significance 2016-2022Errors of the Error Gatekeepers: The case of Statistical Significance 2016-2022
Errors of the Error Gatekeepers: The case of Statistical Significance 2016-2022
 
What are questionable research practices?
What are questionable research practices?What are questionable research practices?
What are questionable research practices?
 
What's the question?
What's the question? What's the question?
What's the question?
 
The neglected importance of complexity in statistics and Metascience
The neglected importance of complexity in statistics and MetascienceThe neglected importance of complexity in statistics and Metascience
The neglected importance of complexity in statistics and Metascience
 
Mathematically Elegant Answers to Research Questions No One is Asking (meta-a...
Mathematically Elegant Answers to Research Questions No One is Asking (meta-a...Mathematically Elegant Answers to Research Questions No One is Asking (meta-a...
Mathematically Elegant Answers to Research Questions No One is Asking (meta-a...
 
On Severity, the Weight of Evidence, and the Relationship Between the Two
On Severity, the Weight of Evidence, and the Relationship Between the TwoOn Severity, the Weight of Evidence, and the Relationship Between the Two
On Severity, the Weight of Evidence, and the Relationship Between the Two
 
Revisiting the Two Cultures in Statistical Modeling and Inference as they rel...
Revisiting the Two Cultures in Statistical Modeling and Inference as they rel...Revisiting the Two Cultures in Statistical Modeling and Inference as they rel...
Revisiting the Two Cultures in Statistical Modeling and Inference as they rel...
 
Comparing Frequentists and Bayesian Control of Multiple Testing
Comparing Frequentists and Bayesian Control of Multiple TestingComparing Frequentists and Bayesian Control of Multiple Testing
Comparing Frequentists and Bayesian Control of Multiple Testing
 
Good Data Dredging
Good Data DredgingGood Data Dredging
Good Data Dredging
 
The Duality of Parameters and the Duality of Probability
The Duality of Parameters and the Duality of ProbabilityThe Duality of Parameters and the Duality of Probability
The Duality of Parameters and the Duality of Probability
 
Error Control and Severity
Error Control and SeverityError Control and Severity
Error Control and Severity
 
The Statistics Wars and Their Causalities (refs)
The Statistics Wars and Their Causalities (refs)The Statistics Wars and Their Causalities (refs)
The Statistics Wars and Their Causalities (refs)
 
The Statistics Wars and Their Casualties (w/refs)
The Statistics Wars and Their Casualties (w/refs)The Statistics Wars and Their Casualties (w/refs)
The Statistics Wars and Their Casualties (w/refs)
 
On the interpretation of the mathematical characteristics of statistical test...
On the interpretation of the mathematical characteristics of statistical test...On the interpretation of the mathematical characteristics of statistical test...
On the interpretation of the mathematical characteristics of statistical test...
 
The role of background assumptions in severity appraisal (
The role of background assumptions in severity appraisal (The role of background assumptions in severity appraisal (
The role of background assumptions in severity appraisal (
 
The two statistical cornerstones of replicability: addressing selective infer...
The two statistical cornerstones of replicability: addressing selective infer...The two statistical cornerstones of replicability: addressing selective infer...
The two statistical cornerstones of replicability: addressing selective infer...
 

Recently uploaded

Hierarchy of management that covers different levels of management
Hierarchy of management that covers different levels of managementHierarchy of management that covers different levels of management
Hierarchy of management that covers different levels of managementmkooblal
 
Employee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptxEmployee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptxNirmalaLoungPoorunde1
 
MARGINALIZATION (Different learners in Marginalized Group
MARGINALIZATION (Different learners in Marginalized GroupMARGINALIZATION (Different learners in Marginalized Group
MARGINALIZATION (Different learners in Marginalized GroupJonathanParaisoCruz
 
Introduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxIntroduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxpboyjonauth
 
CELL CYCLE Division Science 8 quarter IV.pptx
CELL CYCLE Division Science 8 quarter IV.pptxCELL CYCLE Division Science 8 quarter IV.pptx
CELL CYCLE Division Science 8 quarter IV.pptxJiesonDelaCerna
 
Roles & Responsibilities in Pharmacovigilance
Roles & Responsibilities in PharmacovigilanceRoles & Responsibilities in Pharmacovigilance
Roles & Responsibilities in PharmacovigilanceSamikshaHamane
 
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdfssuser54595a
 
How to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptxHow to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptxmanuelaromero2013
 
Introduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher EducationIntroduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher Educationpboyjonauth
 
How to Configure Email Server in Odoo 17
How to Configure Email Server in Odoo 17How to Configure Email Server in Odoo 17
How to Configure Email Server in Odoo 17Celine George
 
Types of Journalistic Writing Grade 8.pptx
Types of Journalistic Writing Grade 8.pptxTypes of Journalistic Writing Grade 8.pptx
Types of Journalistic Writing Grade 8.pptxEyham Joco
 
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptxECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptxiammrhaywood
 
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️9953056974 Low Rate Call Girls In Saket, Delhi NCR
 
Solving Puzzles Benefits Everyone (English).pptx
Solving Puzzles Benefits Everyone (English).pptxSolving Puzzles Benefits Everyone (English).pptx
Solving Puzzles Benefits Everyone (English).pptxOH TEIK BIN
 
DATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginnersDATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginnersSabitha Banu
 
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptxPOINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptxSayali Powar
 
Proudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptxProudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptxthorishapillay1
 
Framing an Appropriate Research Question 6b9b26d93da94caf993c038d9efcdedb.pdf
Framing an Appropriate Research Question 6b9b26d93da94caf993c038d9efcdedb.pdfFraming an Appropriate Research Question 6b9b26d93da94caf993c038d9efcdedb.pdf
Framing an Appropriate Research Question 6b9b26d93da94caf993c038d9efcdedb.pdfUjwalaBharambe
 

Recently uploaded (20)

Hierarchy of management that covers different levels of management
Hierarchy of management that covers different levels of managementHierarchy of management that covers different levels of management
Hierarchy of management that covers different levels of management
 
Employee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptxEmployee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptx
 
MARGINALIZATION (Different learners in Marginalized Group
MARGINALIZATION (Different learners in Marginalized GroupMARGINALIZATION (Different learners in Marginalized Group
MARGINALIZATION (Different learners in Marginalized Group
 
Introduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxIntroduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptx
 
CELL CYCLE Division Science 8 quarter IV.pptx
CELL CYCLE Division Science 8 quarter IV.pptxCELL CYCLE Division Science 8 quarter IV.pptx
CELL CYCLE Division Science 8 quarter IV.pptx
 
Roles & Responsibilities in Pharmacovigilance
Roles & Responsibilities in PharmacovigilanceRoles & Responsibilities in Pharmacovigilance
Roles & Responsibilities in Pharmacovigilance
 
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
 
How to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptxHow to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptx
 
Introduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher EducationIntroduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher Education
 
How to Configure Email Server in Odoo 17
How to Configure Email Server in Odoo 17How to Configure Email Server in Odoo 17
How to Configure Email Server in Odoo 17
 
Types of Journalistic Writing Grade 8.pptx
Types of Journalistic Writing Grade 8.pptxTypes of Journalistic Writing Grade 8.pptx
Types of Journalistic Writing Grade 8.pptx
 
TataKelola dan KamSiber Kecerdasan Buatan v022.pdf
TataKelola dan KamSiber Kecerdasan Buatan v022.pdfTataKelola dan KamSiber Kecerdasan Buatan v022.pdf
TataKelola dan KamSiber Kecerdasan Buatan v022.pdf
 
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptxECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
 
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
 
Solving Puzzles Benefits Everyone (English).pptx
Solving Puzzles Benefits Everyone (English).pptxSolving Puzzles Benefits Everyone (English).pptx
Solving Puzzles Benefits Everyone (English).pptx
 
DATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginnersDATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginners
 
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptxPOINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
 
Proudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptxProudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptx
 
Framing an Appropriate Research Question 6b9b26d93da94caf993c038d9efcdedb.pdf
Framing an Appropriate Research Question 6b9b26d93da94caf993c038d9efcdedb.pdfFraming an Appropriate Research Question 6b9b26d93da94caf993c038d9efcdedb.pdf
Framing an Appropriate Research Question 6b9b26d93da94caf993c038d9efcdedb.pdf
 
9953330565 Low Rate Call Girls In Rohini Delhi NCR
9953330565 Low Rate Call Girls In Rohini  Delhi NCR9953330565 Low Rate Call Girls In Rohini  Delhi NCR
9953330565 Low Rate Call Girls In Rohini Delhi NCR
 

Causal inference is not statistical inference

  • 1. Causal inference is not statistical inference Jon Williamson Philosophy Department and Centre for Reasoning University of Kent 8 December 2022 The statistics wars and their casualties
  • 2. 1 Introduction Causal enquiry. Establishing that A is a cause of B, or assessing whether A is a cause of B. Key techniques for causal enquiry are couched as statistical. E.g., Analysis of RCTs Analyses of cohort studies and other observational studies Meta-analysis Learning a structural equation model Learning a graphical causal model. Two theses: Weak. Statistical techniques are useful for causal enquiry. Strong. Causal enquiry is a purely statistical problem. There seems to be a trend from the weak thesis to the strong.
  • 3. R.A. Fisher championed the use of both randomised (Fisher, 1935) and observational stud- ies for causal enquiry. In Statistical methods for research workers: from the modern point of view, the study of the causes of variation of any variable phenomenon, from the yield of wheat to the intellect of man, should be begun by the examination and measurement of the variation which presents itself. (Fisher, 1925, p. 3.) This is compatible with the weak thesis.
  • 4. Austin Bradford Hill can be viewed as another advocate of the weak thesis. Over a fairly wide expanse of clinical medicine in Great Britain the statistical ap- proach has been accepted as useful (Hill, 1952, p. 113). The statistically guided therapeutic trial is not the only means of investigation and experiment, nor indeed is it invariably the best way of advancing knowledge of therapeutics. I commend it to you as one way, and, I believe, a useful way (Hill, 1952, p. 119). Hill (1965) stressed the gap between association and causation. His 9 ‘viewpoints’ or indicators of causation help to bridge the gap. Several of these indicators are not statistical indicators.
  • 5. These days, however, the strong thesis predominates EBM and EBP focus almost exclusively on the analysis of RCTs. Quantitative statistical methods now dominate causal enquiry in the social sciences. The graphical causal modelling approach conceives of causal relationships in terms of sta- tistical models with Markovian properties (Lauritzen, 1996). Recent developments in statistical methods are invoked to motivate the strong thesis: When attempting to make sense of data, statisticians are invariably motivated by causal questions. . . . The peculiar nature of these questions is that they cannot be answered, or even articulated, in the traditional language of statistics. In fact, only recently has science acquired a mathematical language we can use to express such questions, with accompanying tools to allow us to answer them from data. The development of these tools has spawned a revolution in the way causality is treated in statistics and in many of its satellite disciplines. (Pearl et al., 2016, p. xi.) There’s also a trend towards conflating causation with correlation conditional on potential confounders. EG RCTs are touted as providing unbiased estimates of ‘causal effects’.
  • 6. I’ll suggest: ▶ The replication crisis results from the move from the weak to the strong thesis. ▶ Evidential Pluralism can help to mitigate the crisis. Contents 1 Introduction 2 Evidential Pluralism 3 Evidential Pluralism in medicine 4 The replication crisis 5 Examples Bibliography
  • 8. Correlation is not Causation. A correlation between A and B might be due to: Causation A is a cause of B. Reverse causation B is a cause of A. Confounding (selection bias) There is some confounder C that has not been adequately controlled for by the study. Performance bias Those in the A-group are identified and treated differently to those in the ¬A-group. Detection bias B is measured differently in the A-group in comparison to the ¬A-group. Chance Sheer coincidence, attributable to too small a sample. Fishing Measuring so many outcomes that there is likely to be a chance correlation between A and some such B. Temporal trends A and B both increase over time for independent reasons. E.g., prevalence of coeliac disease & spread of HIV. Semantic relationships Overlapping meaning. E.g., phthiasis, consumption, scrofula (all of which are TB). Constitutive relationships One variable is a component of the other. Mereological relationships One variable is a part of the other. Logical relationships Measurable variables A and B are logically complex and log- ically overlapping. E.g., A is C ∧ D and B is D ∨ E. Nomological relationships E.g., conservation of total energy can induce a correlation between two energy measurements. Mathematical relationships E.g., mean and variance variables from the same distribution can be correlated.
  • 9. If A is a cause of B, then there is some complex of mechanisms that: Explains instances of B by invoking instances of A, and Can account for the magnitude of the observed correlation. ∴ In order to establish causation one needs to establish both: The existence of an appropriate correlation. The existence of an appropriate mechanism that can explain that correlation. Russo, F. and Williamson, J. (2007). Interpreting causality in the health sciences. International Studies in the Philosophy of Science, 21(2):157–170.
  • 10. This observation motivates Evidential Pluralism, a theory of causal enquiry: A is a cause of B There is a mechanism linking A to B A is correlated with B Specific mechanism hypotheses Mechanistic studies (evidence of features of the mechanism) Association studies (measure A and B together) μ2 μ1 α2 μ3 α1 Evidential Pluralism = object pluralism + study pluralism
  • 11. Association and Mechanistic studies reinforce one another Association and mechanistic studies have complementary strengths: Association studies can be unreliable indicators of causality because of biases etc. Mechanistic studies help to isolate the correlations that are genuinely causal. Mechanistic studies can suffer from the problems of masking and complexity. Association studies help to isolate the mechanisms that make a net difference. ∴ Association studies and mechanistic studies reinforce each other. Their combined evidential value is more than the sum of the parts.
  • 12. Berthold Lubetkin’s 1934 London Zoo Penguin Pool, an early example of reinforced concrete.
  • 13. 3 Evidential Pluralism in medicine
  • 14. Present-day EBM focusses on the α-channels and takes mechanistic studies to be strictly inferior to association (clinical/epidemiological) studies. Evidence-based medicine de-emphasizes intuition, unsystematic clinical experi- ence, and pathophysiologic rationale as sufficient grounds for clinical decision making and stresses the examination of evidence from clinical research. (Guy- att et al., 1992, p. 2420)
  • 15. Evidential Pluralism takes the μ-channels to be important too. A is a cause of B There is a mechanism linking A to B A is correlated with B Specific mechanism hypotheses Mechanistic studies (evidence of features of the mechanism) Association studies (measure A and B together) μ2 μ1 α2 μ3 α1 In medicine, this leads to an approach called EBM+. EBM+ systematically scrutinises mechanistic studies alongside association studies.
  • 17. EP also provides an account of external validity: A is a cause of B There is a mechanism linking A to B A is correlated with B Specific mechanism hypotheses Mechanistic studies Association studies Source studies Similarity of mechanisms Williamson, J. (2019a). Establishing causal claims in medicine. International Studies in the Philosophy of Science, 32(2):33–61.
  • 18. Mechanistic evidence is routinely assessed in this way in exposure assessment: EG IARC integrates human, animal and mechanistic studies. The agent is carcinogenic in humans There is a mech. from the agent to cancer in humans The agent is correlated with cancer in humans Key characteristics of cancer & other mech. hypotheses Mechanistic studies Human studies Animal studies Similarity of mechanisms Exposure characterisation Williamson, J. (2019b). Evidential Proximity, Independence, and the evaluation of carcino- genicity. Journal of Evaluation in Clinical Practice, 25(6):955–961.
  • 19. Perhaps EBM+ would have served us better than EBM during the pandemic.
  • 20. Evidential Pluralism accords with the weak thesis, not the strong thesis. A is a cause of B There is a mechanism linking A to B A is correlated with B Specific mechanism hypotheses Mechanistic studies (evidence of features of the mechanism) Association studies (measure A and B together) μ2 μ1 α2 μ3 α1 Causal inference requires a combination of statistical and mechanistic inference.
  • 22. When does a lack of replicability lead to a crisis? That a study fails to replicate an observed association is a common occurrence. This is normal science. It is not of any concern in itself. It is much more serious when established causal relationships are overturned. An established proposition can be used as evidence for further claims. ▶ A lack of stability in the evidence base. ▶ The need to re-evaluate claims made on the basis of revoked propositions. A science makes progress to the extent that it establishes key propositions. Causal propositions are key to most empirical sciences. ▶ Scepticism about the ability of a science to progress. ∴ A crisis arises when established causal relationships are often overturned. When causal relationships are established on the basis of studies that fail to replicate.
  • 23. This problem arises more regularly when: 1. We establish causation purely on the basis of association studies; 2. We make use of few association studies; 3. The association studies are of lower quality; 4. Significance levels are too high; 5. There is publication bias and other kinds of bias; . . . A lot of the action has centred around (4) and (5). I would urge paying more attention to (1). Evidential Pluralism helps to avoid (1): Causal relationships are established by establishing mechanism as well as correlation. We should scrutinise mechanistic studies alongside association studies.
  • 24. A broader evidence base is more robust: Biases are more easily eliminated. It’s less likely that an established claim hinges on a study that turns out not to be replicable. A broad evidence base is arguably more important than successful replication. Successful replication of a study may inspire false confidence in a causal claim. It may be that the biases of the study are replicated alongside the association. Hence there is a need for triangulation (see, e.g., Munafò and Davey Smith, 2018). Evidential Pluralism offers a well-motivated route to triangulation.
  • 25. 5 Examples Evidential Pluralism can indeed lead to caution when establishing causal claims.
  • 26. 5.1 Streptomycin and TB (Gillies, 2019, Chapter 9.) Streptomycin was discovered in 1944 by Schatz, Bugie and Waksman. It inhibited Mycobacterium tuberculosis in vitro. It successfully treated TB in guinea pigs, and was observed to help human patients. In 1947-8 Austin Bradford Hill carried out an RCT on streptomycin to treat TB. Treatment was given for 4 months and patients were observed for a further 2 months. It found a large correlation: 51% of treated patients considerably improved, as opposed to 8% of patients in the control group. However, Hill did not take causation to be established, for mechanistic reasons: There was evidence of streptomycin-resistant bacteria strains in several patients. It is plausible that the patients might deteriorate after the 6-month trial end-point. (A subsequent assessment of the patients confirmed this hypothesis: after 5 years there was no significant improvement in the treatment group.)
  • 27. This led Hill to carry out another trial: A combination of streptomycin and para-amino-salicylic acid. This largely avoided antibiotic resistance. An EBM approach would have treated the first trial as much more conclusive. This was an RCT (the first published medical RCT). It observed a substantial correlation (’effect size’). This study was fully replicable. Hill’s more cautious approach is validated by EBM+ / Evidential Pluralism.
  • 28. 5.2 Cold fusion (Norton, 2021, Chapter 3.) In 1989 Martin Fleischmann and B. Stanley Pons announced that they had carried out fusion on a lab bench at ordinary temperatures. They electrolysed heavy water (deuterium oxide) with palladium electrodes. The cathode became saturated with deuterium. Large quantities of heat were produced. They argued that the deuterium atoms were brought close enough to ignite a nuclear fusion reaction. IE That fusion caused the large quantities of heat. Steven Jones then announced that he had similar results.
  • 29. However, the claim that these results were attributable to cold fusion was not established. The US Energy Research Advisory Board argued in 1989 that: Mechanistic evidence undermined a link between the heat and a nuclear process: There were insufficient levels of fusion products (e.g., neutrons, tritium and isotopes of helium) for fusion to account for the heat. Confirmed theory implied that electrostatic repulsion prevents free deuterium nuclei from approaching closer than about 0.1 nanometres, which is too far to initiate fusion. The closest approach of deuterium atoms in palladium was found to be 0.17nm. In contrast, in molecular deuterium, two deuterium atoms are 0.074nm apart. Further attempts to replicate the experiment led to mixed results. Replication attempts continued to lead to mixed results. Similar effects have been observed many times, but the effects weren’t routinely achieved. A certain amount of luck is required to find these effects. Edmund Sturms and others claimed that the successful replications demonstrate cold fusion. The community view remains that the observed effects have not been established to be caused by cold fusion. This is validated by Evidential Pluralism.
  • 30. 5.3 The discovery of Helicobacter pylori (Norton, 2021, Chapter 3.) In the early 1980s, Barry Marshall and Robin Warren found an association between the pres- ence of Helicobacter pylori and gastritis and ulcers. They submitted their paper to the Lancet in 1984 but reviewers didn’t believe their findings. There was evidence that bacteria could not survive in such an acidic environment. This negative mechanistic evidence undermined the association study. ∴ Their hypothesis was initially not considered established. But they contacted Skirrow who managed to replicate their findings. Only then did the Lancet publish. By the late 1980s, there were: Many more association studies that replicated the findings, Mechanistic studies that found that H. pylori can survive in an acidic environment. At this stage the hypothesis was generally regarded as established. In the 1990s H. pylori was found to require an acidic environment (Clyne et al., 1995), and the mechanisms behind its survival are now better understood (Ansari and Yamaoka, 2017). In 2005 Marshall and Warren won the Nobel prize for medicine for their work.
  • 31. 5.4 The Miller Experiment (Norton, 2021, Chapter 3.) The Michelson-Morley experiment of 1887 failed to detect an ether wind. NB Ether was the supposed medium that carries light, electric fields and magnetic fields. It should flow past the earth as it moves, if it exists. This is the ether wind. In 1925, Dayton C. Miller reported that an experiment found the ether wind. This represented a failure of replication. NB Miller was president of the American Physical Society. His experimental apparatus was one of the most sensitive.
  • 32. Einstein attributed Miller’s result to experimental error related to temperature. Miller’s assumption that the velocity of light is dependent on height above sea level is theoretically implausible. Another similar replication, the Trouton-Noble experiment, found no evidence of an ether wind. In addition, Miller measured a flow of 10km/sec, but one would expect the motion of the earth would yield an ether wind flow of about 30km/sec. Evidential Pluralism would condone scepticism here. In 1955, Shankland et al. reanalysed Miller’s results and found that they were associated with temperature variations in the apparatus. This analysis confirmed Einstein’s doubts.
  • 33. 5.5 Intercessionary prayer (Norton, 2021, Chapter 3.) There have been several studies of the effectiveness of intercessionary prayer. EG Leibovici (2001) conducted an RCT that found an association between remote, retroac- tive intercessionary prayer (in 2000) and length of stay of patients with blood infections in hospital (in 1990-6). Byrd (1988) and Harris et al. (1999) also found positive associations between prayer and recovery of cardiac patients. In general, however, association studies have yielded mixed results. On the other hand, a systematic review of 10 studies found ‘Specific complications (cardiac arrest, major surgery before discharge, need for a monitoring catheter in the heart) were significantly more likely to occur among those in the group not receiving prayer’ (Roberts et al., 2009, p. 2). Again, this is a case where well-confirmed scientific theory provides plenty of evidence against a mechanism. Particularly in the case of retroactive prayer. ∴ Evidential Pluralism would not take the effectiveness of prayer to be established. NB This accords well with Leibovici’s own conclusions.
  • 34. Summary It used to be a platitude that causation is not correlation. This is apparently no longer the case, because: The correlations and statistical techniques are getting more sophisticated. There are more accounts that conflate causality with some kind of correlation. This trend is pernicious: It leads to a simplistic account of causal enquiry. It is turning the normal phenomenon of replication failure into a crisis. Evidential Pluralism provides a more comprehensive account of causal enquiry. A broader evidence base is more robust in the face of potential replication failures.
  • 35. Bibliography Ansari, S. and Yamaoka, Y . (2017). Survival of Helicobacter pylori in gastric acidic territory. Helicobacter, 22(e12386). Clyne, M., Labigne, A., and Drumm, B. (1995). Helicobacter pylori requires an acidic envi- ronment to survive in the presence of urea. Infection and Immunity, 63(5):1669–1673. Fisher, R. (1935). The design of experiments. Oliver and Boyd, Edinburgh. Fisher, R. A. (1925). Statistical methods for research workers. Hafner, New York, thirteenth (1963) edition. Gillies, D. (2019). Should we distrust medical interventions? Metascience, 28(2):273–276. Guyatt, G., Cairns, J., Churchill, D., Cook, D., Haynes, B., Hirsh, J., Irvine, J., Levine, M., Levine, M., Nishikawa, J., Sackett, D., Brill-Edwards, P., Gerstein, H., Gibson, J., Jaeschke, R., Kerigan, A., Neville, A., Panju, A., Detsky, A., Enkin, M., Frid, P., Gerrity, M., Laupacis, A., Lawrence, V., Menard, J., Moyer, V., Mulrow, C., Links, P., Oxman, A., Sinclair, J., and Tugwell, P. (1992). Evidence-based medicine: A new approach to teaching the practice of medicine. JAMA, 268(17):2420–2425. Hill, A. B. (1952). The clinical trial. New England Journal of Medicine, 247(4):113–119. Hill, A. B. (1965). The environment and disease: association or causation? Proceedings of
  • 36. the Royal Society of Medicine, 58:295–300. Lauritzen, S. L. (1996). Graphical models. Clarendon Press, Oxford. Leibovici, L. (2001). Effects of remote, retroactive intercessory prayer on outcomes in pa- tients with bloodstream infection: randomised controlled trial. British Medical Journal, 323:1450–1451. Munafò, M. R. and Davey Smith, G. (2018). Robust research needs many lines of evidence. Nature, 553(7689):399–401. Norton, J. D. (2021). The Material Theory of Induction. University of Calgary Press, Calgary. Pearl, J., Glymour, M., and Jewell, N. P. (2016). Causal Inference in Statistics: A Primer. Wiley, Chichester. Roberts, L., Ahmed, I., Hall, S., and Davison, A. (2009). Intercessory prayer for the alleviation of ill health. The Cochrane database of systematic reviews, 2009(2):CD000368. Russo, F. and Williamson, J. (2007). Interpreting causality in the health sciences. International Studies in the Philosophy of Science, 21(2):157–170. Williamson, J. (2019a). Establishing causal claims in medicine. International Studies in the Philosophy of Science, 32(2):33–61. Williamson, J. (2019b). Evidential Proximity, Independence, and the evaluation of carcino- genicity. Journal of Evaluation in Clinical Practice, 25(6):955–961.