11/5 
Statistical Flukes, the Higgs Discovery, 
and 5 Sigma 
Deborah G. Mayo 
Virginia Tech 
(I) “5 sigma observed effect”. 
One of the biggest science events of 2012-13 was the 
announcement on July 4, 2012 of evidence for the discovery of 
a Higgs particle based on a “5 sigma observed effect”. 
With the March 2013 data analysis, the 5 sigma difference 
grew to 7 sigmas.
• Because the 5 sigma report refers to frequentist statistical
tests, the discovery was immediately imbued with
controversies from philosophy of statistics.
• I’m an outsider to high energy physics (HEP), but (aside from
finding it fascinating) any philosopher of statistics worth her
salt should be able to illuminate some of the more public
controversies, e.g., over P-values.
Not difficult to do, fortunately.
(II) Bad Science? (O’Hagan, prompted by Lindley) 
To the ISBA: “Dear Bayesians: We’ve heard a lot about the 
Higgs boson. ...Specifically, the news referred to a 
confidence interval with 5-sigma limits.… Five standard 
deviations, assuming normality, means a p-value of around 
0.0000005… 
Why such an extreme evidence requirement? We know from 
a Bayesian perspective that this only makes sense if (a) the 
existence of the Higgs boson has extremely small prior 
probability and/or (b) the consequences of erroneously 
announcing its discovery are dire in the extreme. … 
…. Are the particle physics community completely wedded 
to frequentist analysis? If so, has anyone tried to explain 
what bad science that is?” 
Not bad science at all! 
• HEP physicists are sophisticated with their statistical 
methodology: they’d seen too many bumps disappear. 
• Before announcing the hypothesis H*: “a new particle has
been discovered”, they want to ensure that
H* has been given a severe run for its money.
Significance tests and cognate methods (confidence
intervals) are methods of choice here, for good reason.
(III) Simple statistical significance test: ingredients 
(i) Null or test hypothesis: in terms of an unknown parameter 
μ in a statistical model, an idealized representation of 
underlying data generation: a model of the detector 
μ is the “global signal strength” parameter 
H0: μ = 0, i.e., zero signal (background-only hypothesis)
H0: μ = 0 vs. H1: μ > 0
μ = 1: Standard Model (SM) Higgs boson signal in addition to 
the background
Empirical data are modeled as observed values of a sample X 
(random variable); here numbers of events of a type. 
(ii) Test statistic or distance statistic d(X): the larger its
value, the more inconsistent the data are with H0 in the direction
of alternatives or discrepancies of interest.
d(X): how many excess events of a given type are observed
(from trillions of collisions) in comparison to what would be
expected from background alone (in the form of bumps).
d(X) has a known probability distribution under H0 (and under
various alternatives).
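A rough illustration of a distance statistic in sigma units. The sketch assumes a simple counting approximation, which is an assumption of this example and not the collaborations' likelihood-based analysis: with b background events expected and n observed, the excess n - b is divided by the background's Poisson standard deviation, the square root of b. The function name and the counts are hypothetical.

```python
import math

def excess_in_sigma(n_observed: int, expected_background: float) -> float:
    """Excess over background in units of the background's Poisson std dev.

    Gaussian approximation to a counting experiment (an assumption made
    for illustration; reasonable only when expected_background is large).
    """
    if expected_background <= 0:
        raise ValueError("expected_background must be positive")
    return (n_observed - expected_background) / math.sqrt(expected_background)

# Hypothetical numbers: 10,000 background events expected, 10,500 observed.
d_x0 = excess_in_sigma(10_500, 10_000.0)
print(f"d(x0) = {d_x0:.1f} sigma")
```

The larger the value returned, the further the observed count sits from what background alone would be expected to produce.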
(iii) The P-value (or significance level) associated with d(x0)
is the probability of a difference as large as or larger than d(x0),
under the assumption that H0 is true:
P-value = Pr(d(X) ≥ d(x0); H0)
If the P-value is sufficiently small (e.g., .05, .01, .001),
d(x0) is said to be statistically significant (or significant at the
level reached).
d(X) can be given in terms of standard deviation units, or
sigma units.
The distribution of the statistic d(X) is its sampling distribution.
With d(X) in sigma units (standard Normal under H0):
Pr(d(X) > 1; H0) = .16
Pr(d(X) > 2; H0) = .02
Pr(d(X) > 3; H0) = .001
Pr(d(X) > 4; H0) = .00003
Pr(d(X) > 5; H0) = .0000003
The probability of observing results as extreme as or more extreme
than 5 sigma, under H0, is approximately 1 in 3,500,000.
[Figure: Normal distribution]
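The tail areas quoted for 1 to 5 sigma can be checked with a few lines of code. The sketch assumes d(X) is standard Normal under H0 and uses one-sided tails; the function names are hypothetical.

```python
import math
from statistics import NormalDist

def upper_tail(z: float) -> float:
    """Pr(Z > z) for a standard Normal Z, computed via the complementary
    error function to keep precision far out in the tail."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def sigma_for_p(p: float) -> float:
    """Inverse direction: the z with Pr(Z > z) = p (one-sided)."""
    return -NormalDist().inv_cdf(p)

for z in range(1, 6):
    p = upper_tail(z)
    print(f"Pr(d(X) > {z}; H0) = {p:.2e}  (about 1 in {1 / p:,.0f})")
```

`sigma_for_p` goes the other way, turning a reported one-sided P-value back into sigma units.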
The actual computations are based on simulating what it would
be like were H0: μ = 0 (signal strength = 0) true, fortified with much
cross-checking of results.
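A toy version of "simulating what it would be like were H0 true". This is a minimal sketch under strong assumptions, not the experiments' procedure: one counting bin, a known Poisson background mean, and hypothetical numbers (background mean 4, observed count 9). It estimates the P-value as the fraction of background-only pseudo-experiments that do at least as well as the observed count, and cross-checks it against the exact Poisson tail.

```python
import math
import random

def poisson_sample(rng: random.Random, mean: float) -> int:
    """Draw one Poisson count (Knuth's multiplication method; fine for small means)."""
    threshold = math.exp(-mean)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count

def simulated_p_value(n_observed, background_mean, n_trials=100_000, seed=0):
    """Fraction of background-only pseudo-experiments with a count >= n_observed."""
    rng = random.Random(seed)
    hits = sum(poisson_sample(rng, background_mean) >= n_observed
               for _ in range(n_trials))
    return hits / n_trials

def exact_p_value(n_observed, background_mean):
    """Exact Poisson tail Pr(N >= n_observed; H0), used as a cross-check."""
    below = sum(math.exp(-background_mean) * background_mean**k / math.factorial(k)
                for k in range(n_observed))
    return 1.0 - below

print(simulated_p_value(9, 4.0), exact_p_value(9, 4.0))
```

Estimating a 5 sigma tail this way would take many millions of pseudo-experiments, which is one reason the toy uses a much less extreme excess.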
So the significance test has:
1) Data x0 and hypotheses H0: μ = 0 vs. H1: μ > 0
2) A (distance) test statistic d(X)
3) Probability distribution of d(X) under the null and various
alternatives
There’s generally a rule of interpretation: 
• if d(X) > 5 sigma, infer discovery 
• if d(X) > 2 sigma, get more data 
We want methods with high capability to detect discrepancies
without mistaking spurious bumps for real ones.
• First stage: test for a real effect 
(Cox’s taxonomy: searching for structure) 
Not a point against point test! 
Cousins: H0 is Standard Model (SM) missing a piece 
• Second stage: determine its properties, test SM vs “Beyond 
SM” (BSM) 
(Cox: embedded)
(IV) The P-Value Police 
When the July 2012 report came out, a number of people set 
out to grade the different interpretations of the P-value report: 
Larry Wasserman (“Normal Deviate” on his blog) called them 
the “P-Value Police”. 
• Job: to examine if reports by journalists and scientists could 
by any stretch of the imagination be seen to have 
misinterpreted the sigma levels as posterior probability 
assignments to the various models and claims. 
David Spiegelhalter: a well-known (Bayesian) statistician who
works on risk communication.
Thumbs up or down 
Thumbs up, to the ATLAS group report: 
“A statistical combination of these channels and 
others puts the significance of the signal at 5 
sigma, meaning that only one experiment in 
three million would see an apparent signal this 
strong in a universe without a Higgs.” 
Thumbs down to reports such as: 
“There is less than a one in 3.5 million chance that their 
results are a statistical fluke.” 
Critics (Spiegelhalter) allege they are misinterpreting the P-value 
as a posterior probability on H0.
Not so. 
H0 does not say the observed results are due to background
alone, or are flukes. It says only:
H0: μ = 0
although, if H0 were true, it follows that various results would
occur with specified probabilities.
(In particular, it entails that large bumps are improbable.)
In fact, the probability in such reports is an ordinary error probability.
Since it’s not just a single result but a dynamic test procedure,
we can write it:
(1) Pr(Test T produces d(X) > 5; H0) ≤ .0000003
Note: (1) is not a conditional probability, which would involve a prior:
Pr(Test T produces d(X) > 5 and H0) / Pr(H0)
(V) Detaching inference(s) from the evidence 
True, the inference actually detached goes beyond a P-value 
report. Infer: 
(2) There is strong evidence for
(first) a genuine discrepancy from H0 
(later) H*: a Higgs (or a Higgs-like) particle. 
Gradations: indication, evidence, discovery (up to July 4, 2012) 
Inferring (2) relies on an implicit principle of evidence.
Test Principle #1: (statistical significance) Data provide 
evidence for a genuine discrepancy from H0 (just) to the 
extent that H0 would (very probably) have survived, were 
H0 a reasonably adequate description of the process 
generating the data. 
(1)’ Pr(Test T produces d(X) < 5; H0) > .9999997 
• With probability .9999997, the bumps would be smaller, 
would behave like flukes, disappear with more data, not be 
produced at both CMS and ATLAS, in a world given by H0. 
• They didn’t disappear; they grew.
(2) So, infer H*: a Higgs (or a Higgs-like) particle.
Following the rule “interpret 5 sigma bumps as a real effect (a
discrepancy from 0)”, you’d erroneously interpret data with
probability less than .0000003:
an error probability.
The warrant isn’t low long-run error (in a case like this) but
detaching an inference based on a “strong argument from
coincidence”.
Qualifying claims by how well they have been probed
(precision, accuracy).
Second Stage 
Once the null is rejected, the job shifts to testing if various 
parameters agree with the SM predictions. 
Now the corresponding null hypothesis is the SM Higgs boson.
The null hypothesis at the second stage:
H0^[2]: SM Higgs boson: μ = 1
and discrepancies from it are probed, estimated with 
confidence intervals 
(Cousins)
This takes us to the most important role served by statistical
significance tests (requiring a 5 sigma excess for discovery):
It affords a standard for: 
• (a) denying sufficient evidence of a new particle, inferring 
“not a genuine effect”, and 
• (b) ruling out values of various parameters, e.g., mass 
ranges.
(VI) Positive and Negative test results of the analysis 
Positive (very low P-value): infer genuine effects 
Negative (moderate P-value): deny real effects (infer flukes),
deny that excesses indicate BSM.
• At both stages, they were engaged in exploration for BSM
physics (beyond the standard model).
• It combined testing, estimating, exploring.
NYT: “Chasing the Higgs” [Dennis Overbye interviews 
spokespeople Gianotti (ATLAS) and Tonelli (CMS).] 
• Once a month they got bumps that were random flukes 
“So ‘we crosscheck everything’ and ‘try to kill’ any 
anomaly that might be merely random.” 
They were convinced they had found evidence of extra
dimensions of space time “and then the signal faded like an
old tired balloon.”
• “We’ve made many discoveries,” Dr. Tonelli said, 
“most of them false.” 
• “Ninety-nine percent of the time, that is just 
what happens.” 
What’s the difference between HEP physics and social 
psychology (and other big data screening) where “most 
results in most fields are false”, or so we keep hearing? 
HEP physicists don’t publish on the basis of a single “nominal” 
(or “local”) P-value.
Look Elsewhere Effect (LEE) 
A nominal (or local) P-value: the P-value at a particular, data-determined, 
mass. 
But the probability of so impressive a difference anywhere in a 
mass range would be greater than the local one. 
I take it that requiring a smaller P-value (i.e., bigger
difference), at least 5 sigma, is akin to adjusting for multiple
trials or the look elsewhere effect (LEE).
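A back-of-the-envelope sketch of why the probability of an impressive difference somewhere in a mass range exceeds the local one. It assumes N independent places to look, which is a simplification introduced here (real searches estimate an effective trials factor for overlapping, correlated mass windows); the function name and the values of N are hypothetical.

```python
import math

def global_p_value(local_p: float, n_independent_places: int) -> float:
    """Pr(at least one of N independent looks reaches the local threshold; H0).

    Uses 1 - (1 - p)^N, computed with log1p/expm1 so tiny p-values
    are not lost to rounding.
    """
    return -math.expm1(n_independent_places * math.log1p(-local_p))

local_p = 2.87e-7               # roughly the one-sided 5 sigma tail area
for n_places in (1, 10, 100):   # hypothetical trials factors
    print(n_places, global_p_value(local_p, n_places))
```

For small local P-values the global value is close to N times the local one, so a stringent local threshold leaves room for a sizeable trials factor.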
“Game of Bump-Hunting” (Overbye) 
“One bump on physicists’ charts…was disappearing. But 
another was blooming like the shy girl at a dance. …. nobody 
could remember exactly when she had come in. But she was 
the one who would marry the prince.” 
“It continued to grow over the fall until it had reached the 3- 
sigma level — the chances of being a fluke [spurious 
significance] were less than 1 in 740, enough for physicists to 
admit it to the realm of “evidence” of something, but not yet a 
discovery.”
Background knowledge of how flukes behave: 
• “If they were flukes, more data would make them fade into 
the statistical background, 
• If not, the bumps would grow in slow motion into a bona 
fide discovery.” 
• They give the bump a hard time, look at multiple decay 
channels, and don’t tell the details of where they found her 
to the other team. 
• When two independent experiments find the same particle 
signal at the same mass, it overcomes the multiple testing 
and gives a strong argument. 
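The contrast between flukes that fade and real bumps that grow can be sketched with the same counting approximation (excess divided by the square root of the background), which is an assumption of this example. All rates are hypothetical and chosen so that both cases start at 3 sigma.

```python
import math

def expected_z_real_signal(signal_rate, background_rate, luminosity):
    """Expected excess in sigma units if a real signal is present.

    Signal and background counts both grow in proportion to the amount
    of data, so z = s / sqrt(b) grows like sqrt(luminosity).
    """
    s = signal_rate * luminosity
    b = background_rate * luminosity
    return s / math.sqrt(b)

def expected_z_fluke(initial_excess, background_rate, luminosity):
    """Expected sigma level of an early chance excess after more data.

    Under H0 no further excess is expected, so the fixed early excess is
    diluted by a growing background: z shrinks like 1 / sqrt(luminosity).
    """
    return initial_excess / math.sqrt(background_rate * luminosity)

# Hypothetical rates per unit of data; both cases start at 3 sigma.
for lum in (1, 2, 4, 8):
    real = expected_z_real_signal(30.0, 100.0, lum)
    fluke = expected_z_fluke(30.0, 100.0, lum)
    print(f"data x{lum}: real signal {real:.1f} sigma, fluke {fluke:.1f} sigma")
```

These are expected values only; any single data set fluctuates around them, which is why the bump is given a hard time across channels and experiments.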
(VII) Possible Anomalies for SM 
They also follow up bumps indicating discrepancies with
H0^[2]: SM Higgs boson: μ = 1
Hints of anomalies with the “plain vanilla” particle of the 
Standard Model 
(viewed as tests or corresponding interval estimates) 
Even a year later they examined these anomalies with more 
data.
Curb your enthusiasm 
Matt Strassler: “The excess (in favor of BSM properties) 
became a bit smaller each time…. That’s an unfortunate sign, 
if one is hoping the excess isn’t just a statistical fluke.” 
Or they’d see the bump at ATLAS… and not CMS 
“Taking all of the data, and not cherry picking…there’s 
nothing here that you can call “evidence” for the much sought 
BSM.” (Strassler) 
Considering the frequent flukes, and the hot competition
between ATLAS and CMS to be first, a tool for when to
“curb their enthusiasm” seems exactly what was wanted.
So, this “negative” portion involves:
(a) denying BSM anomalies are real
(b) setting upper bounds for these discrepancies with the SM
Higgs
Each with its own test statistic and evidence g(x0)
H0^[2]: SM Higgs boson: μ = 1
Failing to reject the null isn’t evidence for it, but they could set
upper bounds.
Test Principle #2 (for non-significance): Data provide 
evidence to rule out a discrepancy δ∗ to the extent that a 
larger g(x0) would very probably have resulted if δ were as 
great as δ∗ 
Detach δ < δ∗ 
(could equivalently be viewed as inferring a confidence 
interval estimate δ = g(x0) + ε) 
So these tools seem just the thing for this research
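Test Principle #2 can be made concrete under a simple model. The sketch assumes g(X) is Normal with mean δ and known standard deviation σ, which is an assumption made here for illustration; the observed value, σ, and the function names are hypothetical. The first function returns the probability that a larger g would have resulted were δ as great as δ∗; the second returns an upper bound of the form g(x0) + ε.

```python
from statistics import NormalDist

def severity_upper_bound(g_x0: float, delta_star: float, sigma: float) -> float:
    """Pr(g(X) > g(x0); delta = delta_star) for g(X) ~ Normal(delta, sigma^2).

    A high value means a larger g would very probably have been observed
    were the discrepancy as great as delta_star, so delta < delta_star
    is well warranted.
    """
    return 1.0 - NormalDist(mu=delta_star, sigma=sigma).cdf(g_x0)

def upper_confidence_bound(g_x0: float, sigma: float, level: float = 0.95) -> float:
    """One-sided upper bound g(x0) + epsilon, with epsilon = z_level * sigma."""
    return g_x0 + NormalDist().inv_cdf(level) * sigma

# Hypothetical non-significant result: observed g(x0) = 0.3 with sigma = 0.5.
for delta_star in (0.5, 1.0, 1.5):
    print(delta_star, round(severity_upper_bound(0.3, delta_star, 0.5), 3))
print("95% upper bound:", round(upper_confidence_bound(0.3, 0.5), 3))
```

Evaluated at the 95% upper bound, the first function returns 0.95, which is the sense in which the two readings coincide in this model.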
(VIII) Conclusion
O’Hagan published a digest of responses a few days later:
• “They surely would be willing to announce SM Higgs 
discovery if they were 99.99% certain of the existence of 
the SM Higgs” (and avoid the ad hoc 5 sigma) 
Pr(SM Higgs) = .9999 
• It would require assigning a prior probability to the “SM Higgs”
claim, and a prior distribution on the numerous “nuisance”
parameters of the background and the signal.
• Multivariate priors, correlations between parameters, joint
priors, and the catchall: P(data|not H*)
• Even if all that were done and agreed upon, it would not 
have given the kind of tools needed to find things out 
Worse: spiked priors Pr(No SM Higgs) = Pr(SM Higgs) = .5
(not uninformative)
• Physicists believed in the SM Higgs before building the big
collider, given the perfect predictive success of the SM and its
simplicity: very different from having evidence for a
discovery.
• Others may believe (and fervently wish) that it will break 
down somewhere.
P-value police: Those who think we want a posterior 
probability in H* might be sliding from what may be inferred 
from this legitimate high probability: 
Pr(test T would not reach 5 sigma; H0) > .9999997 
With probability .9999997, our methods would show 
that the bumps disappear, under the assumption data 
are due to background H0. 
They don’t disappear but grow. 
Infer H* 
Qualified by the test properties
What’s passed with high severity? 
H*: a Higgs boson consistent with the SM (at the levels 
of precision and accuracy of these experiments) 
An adequate account should also always report alternatives 
that have not been well ruled out 
• measurements not precise enough to rule out discrepancies 
from a SM Higgs as large as 10%, 20%, 50%. 
• There are rivals to the SM that would not have been 
distinguishable with the given data (which went through a 
lot of filtering, and triggering rules). 
They will get more data in 2015; there’s talk of a more
precise detector being built.
REFERENCES (Online links)
• ATLAS report: http://cds.cern.ch/record/1494183/files/ATLAS-CONF-2012-162.pdf
• ATLAS Higgs experiment, public results: https://twiki.cern.ch/twiki/bin/view/AtlasPublic/HiggsPublicResults
• CMS Higgs experiment, public results: https://twiki.cern.ch/twiki/bin/view/CMSPublic/PhysicsResultsHIG
• Mayo, D. G. and Cox, D. R. (2010). "Frequentist Statistics as a Theory of Inductive Inference" in Error and Inference: Recent Exchanges on Experimental Reasoning, Reliability and the Objectivity and Rationality of Science (D. Mayo and A. Spanos eds.), Cambridge: Cambridge University Press: 1-27. This paper appeared in The Second Erich L. Lehmann Symposium: Optimality, 2006, Lecture Notes-Monograph Series, Volume 49, Institute of Mathematical Statistics, pp. 247-275.
• Cousins, R. (2014). “The Jeffreys-Lindley Paradox and Discovery Criteria in High Energy Physics” http://arxiv.org/abs/1310.3791
• O’Hagan letter:
§ Original letter with responses: http://bayesian.org/forums/news/3648
§ 1st link in a group of discussions of the letter: http://errorstatistics.com/2012/07/11/is-particle-physics-bad-science/
• Overbye, D. (March 15, 2013) “Chasing the Higgs,” New York Times: http://www.nytimes.com/2013/03/05/science/chasing-the-higgs-boson-how-2-teams-of-rivals-at-CERN-searched-for-physics-most-elusive-particle.html?pagewanted=all&_r=0
• Spiegelhalter, D. (August 7, 2012) blog, Understanding Uncertainty, “Explaining 5 sigma for the Higgs: how well did they do?” http://understandinguncertainty.org/explaining-5-sigma-higgs-how-well-did-they-do
• Strassler, M. (July 2, 2013) blog, Of Particular Significance, “A Second Higgs Particle”: http://profmattstrassler.com/2013/07/02/a-second-higgs-particle/
• Wasserman, L. (July 11, 2012) blog, Normal Deviate, “The Higgs Boson and the P-Value Police”: http://normaldeviate.wordpress.com/2012/07/11/the-higgs-boson-and-the-p-value-police/

 
The neglected importance of complexity in statistics and Metascience
The neglected importance of complexity in statistics and MetascienceThe neglected importance of complexity in statistics and Metascience
The neglected importance of complexity in statistics and Metasciencejemille6
 
Mathematically Elegant Answers to Research Questions No One is Asking (meta-a...
Mathematically Elegant Answers to Research Questions No One is Asking (meta-a...Mathematically Elegant Answers to Research Questions No One is Asking (meta-a...
Mathematically Elegant Answers to Research Questions No One is Asking (meta-a...jemille6
 
On Severity, the Weight of Evidence, and the Relationship Between the Two
On Severity, the Weight of Evidence, and the Relationship Between the TwoOn Severity, the Weight of Evidence, and the Relationship Between the Two
On Severity, the Weight of Evidence, and the Relationship Between the Twojemille6
 
Revisiting the Two Cultures in Statistical Modeling and Inference as they rel...
Revisiting the Two Cultures in Statistical Modeling and Inference as they rel...Revisiting the Two Cultures in Statistical Modeling and Inference as they rel...
Revisiting the Two Cultures in Statistical Modeling and Inference as they rel...jemille6
 
Comparing Frequentists and Bayesian Control of Multiple Testing
Comparing Frequentists and Bayesian Control of Multiple TestingComparing Frequentists and Bayesian Control of Multiple Testing
Comparing Frequentists and Bayesian Control of Multiple Testingjemille6
 
Good Data Dredging
Good Data DredgingGood Data Dredging
Good Data Dredgingjemille6
 
The Duality of Parameters and the Duality of Probability
The Duality of Parameters and the Duality of ProbabilityThe Duality of Parameters and the Duality of Probability
The Duality of Parameters and the Duality of Probabilityjemille6
 
The Statistics Wars and Their Causalities (refs)
The Statistics Wars and Their Causalities (refs)The Statistics Wars and Their Causalities (refs)
The Statistics Wars and Their Causalities (refs)jemille6
 
The Statistics Wars and Their Casualties (w/refs)
The Statistics Wars and Their Casualties (w/refs)The Statistics Wars and Their Casualties (w/refs)
The Statistics Wars and Their Casualties (w/refs)jemille6
 
On the interpretation of the mathematical characteristics of statistical test...
On the interpretation of the mathematical characteristics of statistical test...On the interpretation of the mathematical characteristics of statistical test...
On the interpretation of the mathematical characteristics of statistical test...jemille6
 
The role of background assumptions in severity appraisal (
The role of background assumptions in severity appraisal (The role of background assumptions in severity appraisal (
The role of background assumptions in severity appraisal (jemille6
 
The two statistical cornerstones of replicability: addressing selective infer...
The two statistical cornerstones of replicability: addressing selective infer...The two statistical cornerstones of replicability: addressing selective infer...
The two statistical cornerstones of replicability: addressing selective infer...jemille6
 
The replication crisis: are P-values the problem and are Bayes factors the so...
The replication crisis: are P-values the problem and are Bayes factors the so...The replication crisis: are P-values the problem and are Bayes factors the so...
The replication crisis: are P-values the problem and are Bayes factors the so...jemille6
 
The Statistics Wars and Their Casualties
The Statistics Wars and Their CasualtiesThe Statistics Wars and Their Casualties
The Statistics Wars and Their Casualtiesjemille6
 

More from jemille6 (20)

D. Mayo JSM slides v2.pdf
D. Mayo JSM slides v2.pdfD. Mayo JSM slides v2.pdf
D. Mayo JSM slides v2.pdf
 
reid-postJSM-DRC.pdf
reid-postJSM-DRC.pdfreid-postJSM-DRC.pdf
reid-postJSM-DRC.pdf
 
Errors of the Error Gatekeepers: The case of Statistical Significance 2016-2022
Errors of the Error Gatekeepers: The case of Statistical Significance 2016-2022Errors of the Error Gatekeepers: The case of Statistical Significance 2016-2022
Errors of the Error Gatekeepers: The case of Statistical Significance 2016-2022
 
Causal inference is not statistical inference
Causal inference is not statistical inferenceCausal inference is not statistical inference
Causal inference is not statistical inference
 
What are questionable research practices?
What are questionable research practices?What are questionable research practices?
What are questionable research practices?
 
What's the question?
What's the question? What's the question?
What's the question?
 
The neglected importance of complexity in statistics and Metascience
The neglected importance of complexity in statistics and MetascienceThe neglected importance of complexity in statistics and Metascience
The neglected importance of complexity in statistics and Metascience
 
Mathematically Elegant Answers to Research Questions No One is Asking (meta-a...
Mathematically Elegant Answers to Research Questions No One is Asking (meta-a...Mathematically Elegant Answers to Research Questions No One is Asking (meta-a...
Mathematically Elegant Answers to Research Questions No One is Asking (meta-a...
 
On Severity, the Weight of Evidence, and the Relationship Between the Two
On Severity, the Weight of Evidence, and the Relationship Between the TwoOn Severity, the Weight of Evidence, and the Relationship Between the Two
On Severity, the Weight of Evidence, and the Relationship Between the Two
 
Revisiting the Two Cultures in Statistical Modeling and Inference as they rel...
Revisiting the Two Cultures in Statistical Modeling and Inference as they rel...Revisiting the Two Cultures in Statistical Modeling and Inference as they rel...
Revisiting the Two Cultures in Statistical Modeling and Inference as they rel...
 
Comparing Frequentists and Bayesian Control of Multiple Testing
Comparing Frequentists and Bayesian Control of Multiple TestingComparing Frequentists and Bayesian Control of Multiple Testing
Comparing Frequentists and Bayesian Control of Multiple Testing
 
Good Data Dredging
Good Data DredgingGood Data Dredging
Good Data Dredging
 
The Duality of Parameters and the Duality of Probability
The Duality of Parameters and the Duality of ProbabilityThe Duality of Parameters and the Duality of Probability
The Duality of Parameters and the Duality of Probability
 
The Statistics Wars and Their Causalities (refs)
The Statistics Wars and Their Causalities (refs)The Statistics Wars and Their Causalities (refs)
The Statistics Wars and Their Causalities (refs)
 
The Statistics Wars and Their Casualties (w/refs)
The Statistics Wars and Their Casualties (w/refs)The Statistics Wars and Their Casualties (w/refs)
The Statistics Wars and Their Casualties (w/refs)
 
On the interpretation of the mathematical characteristics of statistical test...
On the interpretation of the mathematical characteristics of statistical test...On the interpretation of the mathematical characteristics of statistical test...
On the interpretation of the mathematical characteristics of statistical test...
 
The role of background assumptions in severity appraisal (
The role of background assumptions in severity appraisal (The role of background assumptions in severity appraisal (
The role of background assumptions in severity appraisal (
 
The two statistical cornerstones of replicability: addressing selective infer...
The two statistical cornerstones of replicability: addressing selective infer...The two statistical cornerstones of replicability: addressing selective infer...
The two statistical cornerstones of replicability: addressing selective infer...
 
The replication crisis: are P-values the problem and are Bayes factors the so...
The replication crisis: are P-values the problem and are Bayes factors the so...The replication crisis: are P-values the problem and are Bayes factors the so...
The replication crisis: are P-values the problem and are Bayes factors the so...
 
The Statistics Wars and Their Casualties
The Statistics Wars and Their CasualtiesThe Statistics Wars and Their Casualties
The Statistics Wars and Their Casualties
 

Recently uploaded

Types of Journalistic Writing Grade 8.pptx
Types of Journalistic Writing Grade 8.pptxTypes of Journalistic Writing Grade 8.pptx
Types of Journalistic Writing Grade 8.pptxEyham Joco
 
How to Configure Email Server in Odoo 17
How to Configure Email Server in Odoo 17How to Configure Email Server in Odoo 17
How to Configure Email Server in Odoo 17Celine George
 
DATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginnersDATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginnersSabitha Banu
 
Pharmacognosy Flower 3. Compositae 2023.pdf
Pharmacognosy Flower 3. Compositae 2023.pdfPharmacognosy Flower 3. Compositae 2023.pdf
Pharmacognosy Flower 3. Compositae 2023.pdfMahmoud M. Sallam
 
Historical philosophical, theoretical, and legal foundations of special and i...
Historical philosophical, theoretical, and legal foundations of special and i...Historical philosophical, theoretical, and legal foundations of special and i...
Historical philosophical, theoretical, and legal foundations of special and i...jaredbarbolino94
 
Biting mechanism of poisonous snakes.pdf
Biting mechanism of poisonous snakes.pdfBiting mechanism of poisonous snakes.pdf
Biting mechanism of poisonous snakes.pdfadityarao40181
 
CARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptxCARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptxGaneshChakor2
 
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17Incoming and Outgoing Shipments in 1 STEP Using Odoo 17
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17Celine George
 
Final demo Grade 9 for demo Plan dessert.pptx
Final demo Grade 9 for demo Plan dessert.pptxFinal demo Grade 9 for demo Plan dessert.pptx
Final demo Grade 9 for demo Plan dessert.pptxAvyJaneVismanos
 
Alper Gobel In Media Res Media Component
Alper Gobel In Media Res Media ComponentAlper Gobel In Media Res Media Component
Alper Gobel In Media Res Media ComponentInMediaRes1
 
Earth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatEarth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatYousafMalik24
 
EPANDING THE CONTENT OF AN OUTLINE using notes.pptx
EPANDING THE CONTENT OF AN OUTLINE using notes.pptxEPANDING THE CONTENT OF AN OUTLINE using notes.pptx
EPANDING THE CONTENT OF AN OUTLINE using notes.pptxRaymartEstabillo3
 
Introduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher EducationIntroduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher Educationpboyjonauth
 
KSHARA STURA .pptx---KSHARA KARMA THERAPY (CAUSTIC THERAPY)————IMP.OF KSHARA ...
KSHARA STURA .pptx---KSHARA KARMA THERAPY (CAUSTIC THERAPY)————IMP.OF KSHARA ...KSHARA STURA .pptx---KSHARA KARMA THERAPY (CAUSTIC THERAPY)————IMP.OF KSHARA ...
KSHARA STURA .pptx---KSHARA KARMA THERAPY (CAUSTIC THERAPY)————IMP.OF KSHARA ...M56BOOKSTORE PRODUCT/SERVICE
 
Hierarchy of management that covers different levels of management
Hierarchy of management that covers different levels of managementHierarchy of management that covers different levels of management
Hierarchy of management that covers different levels of managementmkooblal
 
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdfssuser54595a
 
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️9953056974 Low Rate Call Girls In Saket, Delhi NCR
 

Recently uploaded (20)

Types of Journalistic Writing Grade 8.pptx
Types of Journalistic Writing Grade 8.pptxTypes of Journalistic Writing Grade 8.pptx
Types of Journalistic Writing Grade 8.pptx
 
How to Configure Email Server in Odoo 17
How to Configure Email Server in Odoo 17How to Configure Email Server in Odoo 17
How to Configure Email Server in Odoo 17
 
DATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginnersDATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginners
 
9953330565 Low Rate Call Girls In Rohini Delhi NCR
9953330565 Low Rate Call Girls In Rohini  Delhi NCR9953330565 Low Rate Call Girls In Rohini  Delhi NCR
9953330565 Low Rate Call Girls In Rohini Delhi NCR
 
Pharmacognosy Flower 3. Compositae 2023.pdf
Pharmacognosy Flower 3. Compositae 2023.pdfPharmacognosy Flower 3. Compositae 2023.pdf
Pharmacognosy Flower 3. Compositae 2023.pdf
 
Historical philosophical, theoretical, and legal foundations of special and i...
Historical philosophical, theoretical, and legal foundations of special and i...Historical philosophical, theoretical, and legal foundations of special and i...
Historical philosophical, theoretical, and legal foundations of special and i...
 
Biting mechanism of poisonous snakes.pdf
Biting mechanism of poisonous snakes.pdfBiting mechanism of poisonous snakes.pdf
Biting mechanism of poisonous snakes.pdf
 
CARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptxCARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptx
 
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17Incoming and Outgoing Shipments in 1 STEP Using Odoo 17
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17
 
Final demo Grade 9 for demo Plan dessert.pptx
Final demo Grade 9 for demo Plan dessert.pptxFinal demo Grade 9 for demo Plan dessert.pptx
Final demo Grade 9 for demo Plan dessert.pptx
 
Alper Gobel In Media Res Media Component
Alper Gobel In Media Res Media ComponentAlper Gobel In Media Res Media Component
Alper Gobel In Media Res Media Component
 
Earth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatEarth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice great
 
TataKelola dan KamSiber Kecerdasan Buatan v022.pdf
TataKelola dan KamSiber Kecerdasan Buatan v022.pdfTataKelola dan KamSiber Kecerdasan Buatan v022.pdf
TataKelola dan KamSiber Kecerdasan Buatan v022.pdf
 
EPANDING THE CONTENT OF AN OUTLINE using notes.pptx
EPANDING THE CONTENT OF AN OUTLINE using notes.pptxEPANDING THE CONTENT OF AN OUTLINE using notes.pptx
EPANDING THE CONTENT OF AN OUTLINE using notes.pptx
 
Model Call Girl in Bikash Puri Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Bikash Puri  Delhi reach out to us at 🔝9953056974🔝Model Call Girl in Bikash Puri  Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Bikash Puri Delhi reach out to us at 🔝9953056974🔝
 
Introduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher EducationIntroduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher Education
 
KSHARA STURA .pptx---KSHARA KARMA THERAPY (CAUSTIC THERAPY)————IMP.OF KSHARA ...
KSHARA STURA .pptx---KSHARA KARMA THERAPY (CAUSTIC THERAPY)————IMP.OF KSHARA ...KSHARA STURA .pptx---KSHARA KARMA THERAPY (CAUSTIC THERAPY)————IMP.OF KSHARA ...
KSHARA STURA .pptx---KSHARA KARMA THERAPY (CAUSTIC THERAPY)————IMP.OF KSHARA ...
 
Hierarchy of management that covers different levels of management
Hierarchy of management that covers different levels of managementHierarchy of management that covers different levels of management
Hierarchy of management that covers different levels of management
 
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
 
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
 

Statistical Flukes, the Higgs Discovery, and 5 Sigma

  • 1. 11/5 1 Statistical Flukes, the Higgs Discovery, and 5 Sigma Deborah G. Mayo Virginia Tech (I) “5 sigma observed effect”. One of the biggest science events of 2012-13 was the announcement on July 4, 2012 of evidence for the discovery of a Higgs particle based on a “5 sigma observed effect”. With the March 2013 data analysis, the 5 sigma difference grew to 7 sigmas.
  • 2. • Because the 5 sigma report refers to frequentist statistical tests, the discovery was immediately imbued with controversies from philosophy of statistics. • I’m an outsider to high energy physics (HEP), but (aside from finding it fascinating) any philosopher of statistics worth her salt should be able to illuminate some of the more public controversies, e.g., P-values. Not difficult to do, fortunately.
  • 3. (II) Bad Science? (O’Hagan, prompted by Lindley) To the ISBA: “Dear Bayesians: We’ve heard a lot about the Higgs boson. ...Specifically, the news referred to a confidence interval with 5-sigma limits.… Five standard deviations, assuming normality, means a p-value of around 0.0000005… Why such an extreme evidence requirement? We know from a Bayesian perspective that this only makes sense if (a) the existence of the Higgs boson has extremely small prior probability and/or (b) the consequences of erroneously announcing its discovery are dire in the extreme. … …. Are the particle physics community completely wedded to frequentist analysis? If so, has anyone tried to explain what bad science that is?”
  • 4. Not bad science at all! • HEP physicists are sophisticated with their statistical methodology: they’d seen too many bumps disappear. • They want to ensure that, before announcing the hypothesis H*: “a new particle has been discovered”, H* has been given a severe run for its money. Significance tests and cognate methods (confidence intervals) are methods of choice here for good reason.
  • 5. (III) Simple statistical significance test: ingredients. (i) Null or test hypothesis: stated in terms of an unknown parameter μ in a statistical model, an idealized representation of the underlying data generation (a model of the detector). μ is the “global signal strength” parameter. H0: μ = 0, i.e., zero signal (background-only hypothesis). H0: μ = 0 vs. H1: μ > 0. μ = 1: Standard Model (SM) Higgs boson signal in addition to the background.
  • 6. Empirical data are modeled as observed values of a sample X (random variable); here, numbers of events of a type. (ii) Test statistic or distance statistic d(X): the larger its value, the more inconsistent the data are with H0 in the direction of alternatives or discrepancies of interest. d(X): how many excess events of a given type are observed (from trillions of collisions) in comparison to what would be expected from background alone (in the form of bumps). d(X) has a known probability distribution under H0 (and under various alternatives).
  • 7. (iii) The P-value (or significance level) associated with d(x0) is the probability of a difference as large as or larger than d(x0), under the assumption that H0 is true: P-value = Pr(d(X) > d(x0); H0). If the P-value is sufficiently small (e.g., .05, .01, .001), d(x0) is said to be statistically significant (or significant at the level reached). d(X) can be given in terms of standard deviation units, or sigma units.
  • 8. The distribution of the statistic d(X) is the sampling distribution: Pr(d(X) > 1; H0) = .16, Pr(d(X) > 2; H0) = .02, Pr(d(X) > 3; H0) = .001, Pr(d(X) > 4; H0) = .00003, Pr(d(X) > 5; H0) = .0000003. The probability of observing results as extreme as or more extreme than 5 sigma, under H0, is approximately 1 in 3,500,000.
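The tail areas on this slide can be checked with a few lines of Python. This is only a sketch, assuming d(X) is standard normal under H0 (the idealization the slide itself uses); the function name `upper_tail` is an illustrative choice, not something from the talk.

```python
import math


def upper_tail(z: float) -> float:
    """One-sided tail area Pr(Z > z) for a standard normal Z."""
    # erfc is the complementary error function; this identity is exact.
    return 0.5 * math.erfc(z / math.sqrt(2.0))


if __name__ == "__main__":
    for k in range(1, 6):
        p = upper_tail(k)
        # Report the tail area and the matching "1 in N" odds.
        print(f"Pr(d(X) > {k}; H0) = {p:.2e}  (about 1 in {1.0 / p:,.0f})")
```

Under this assumption the 5 sigma tail comes out close to the slide's rounded figure of 1 in 3,500,000, and the 3 sigma tail is a little under 1 in 740.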
  • 9. [Figure: normal distribution]
  • 10. The actual computations are based on simulating what it would be like were H0: μ = 0 (signal strength = 0) true, fortified with much cross-checking of results. So the significance test has: 1) Data x0 and hypotheses H0: μ = 0 vs. H1: μ > 0 2) A (distance) test statistic d(X) 3) Probability distribution of d(X) under the null and various alternatives
  • 11. There’s generally a rule of interpretation: • if d(X) > 5 sigma, infer discovery • if d(X) > 2 sigma, get more data. We want methods with high capability to detect discrepancies while avoiding mistaking spurious bumps for real ones.
  • 12. • First stage: test for a real effect (Cox’s taxonomy: searching for structure). Not a point against point test! Cousins: H0 is Standard Model (SM) missing a piece • Second stage: determine its properties, test SM vs “Beyond SM” (BSM) (Cox: embedded)
  • 13. (IV) The P-Value Police. When the July 2012 report came out, a number of people set out to grade the different interpretations of the P-value report: Larry Wasserman (“Normal Deviate” on his blog) called them the “P-Value Police”. • Job: to examine whether reports by journalists and scientists could, by any stretch of the imagination, be seen to have misinterpreted the sigma levels as posterior probability assignments to the various models and claims. David Spiegelhalter: a well-known (Bayesian) statistician; risk communication.
  • 14. Thumbs up or down. Thumbs up, to the ATLAS group report: “A statistical combination of these channels and others puts the significance of the signal at 5 sigma, meaning that only one experiment in three million would see an apparent signal this strong in a universe without a Higgs.” Thumbs down to reports such as: “There is less than a one in 3.5 million chance that their results are a statistical fluke.” Critics (Spiegelhalter) allege they are misinterpreting the P-value as a posterior probability on H0.
  • 15. Not so. H0 does not say the observed results are due to background alone, or are flukes. H0: μ = 0. Although if H0 were true, it follows that various results would occur with specified probabilities. (In particular, it entails that large bumps are improbable.)
  • 16. In fact it is an ordinary error probability. Since it’s not just a single result, but a dynamic test procedure, we can write it: (1) Pr(Test T produces d(X) > 5; H0) ≤ .0000003. Note: (1) is not a conditional probability (that involves a prior): Pr(Test T produces d(X) > 5 and H0) / Pr(H0)
  • 17. (V) Detaching inference(s) from the evidence. True, the inference actually detached goes beyond a P-value report. Infer: (2) There is strong evidence for (first) a genuine discrepancy from H0, (later) H*: a Higgs (or a Higgs-like) particle. Gradations: indication, evidence, discovery (up to July 4, 2012). Inferring (2) relies on an implicit principle of evidence.
  • 18. Test Principle #1 (statistical significance): Data provide evidence for a genuine discrepancy from H0 (just) to the extent that H0 would (very probably) have survived, were H0 a reasonably adequate description of the process generating the data. (1)’ Pr(Test T produces d(X) < 5; H0) > .9999997 • With probability .9999997, the bumps would be smaller, would behave like flukes, disappear with more data, not be produced at both CMS and ATLAS, in a world given by H0. • They didn’t disappear, they grew. (2) So, H*: a Higgs (or a Higgs-like) particle.
  • 19. Following the rule “Interpret 5 sigma bumps as a real effect (a discrepancy from 0)”, you’d erroneously interpret data with probability less than .0000003. An error probability. The warrant isn’t low long-run error (in a case like this) but detaching an inference based on a “strong argument from coincidence”. Qualifying claims by how well they have been probed (precision, accuracy).
  • 20. Second Stage. Once the null is rejected, the job shifts to testing if various parameters agree with the SM predictions. Now the corresponding null hypothesis is the SM Higgs boson. The null hypothesis at the second stage: $H_0^{2}$: SM Higgs boson: μ = 1, and discrepancies from it are probed, estimated with confidence intervals (Cousins).
  • 21. This takes us to the most important role served by statistical significance tests (requiring a 5 sigma excess for discovery). It affords a standard for: • (a) denying sufficient evidence of a new particle, inferring “not a genuine effect”, and • (b) ruling out values of various parameters, e.g., mass ranges.
  • 22. (VI) Positive and negative test results of the analysis. Positive (very low P-value): infer genuine effects. Negative (moderate P-value): deny real effects (infer flukes), deny that excesses indicate BSM. • At both stages, they were engaged in exploration for BSM physics (beyond the Standard Model). • It combined testing, estimating, exploring.
  • 23. NYT: “Chasing the Higgs” [Dennis Overbye interviews spokespeople Gianotti (ATLAS) and Tonelli (CMS).] • Once a month they got bumps that were random flukes. “So ‘we crosscheck everything’ and ‘try to kill’ any anomaly that might be merely random.” They were convinced they had found evidence of extra dimensions of space time “and then the signal faded like an old tied balloon.”
  • 24. • “We’ve made many discoveries,” Dr. Tonelli said, “most of them false.” • “Ninety-nine percent of the time, that is just what happens.” What’s the difference between HEP physics and social psychology (and other big data screening), where “most results in most fields are false”, or so we keep hearing? HEP physicists don’t publish on the basis of a single “nominal” (or “local”) P-value.
  • 25. Look Elsewhere Effect (LEE). A nominal (or local) P-value: the P-value at a particular, data-determined, mass. But the probability of so impressive a difference anywhere in a mass range would be greater than the local one. I take it that requiring a smaller P-value (i.e., bigger difference), at least 5 sigma, is akin to adjusting for multiple trials, or the look elsewhere effect (LEE).
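A toy calculation shows why the probability of an impressive bump "anywhere in a mass range" exceeds the local P-value. The sketch below assumes the mass range splits into `n_trials` independent windows, each tested at the same local level; that independence, the window count of 100, and the function names are all illustrative assumptions, not the procedure ATLAS or CMS actually used.

```python
import math


def upper_tail(z: float) -> float:
    """One-sided tail area Pr(Z > z) for a standard normal Z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def global_p(local_p: float, n_trials: int) -> float:
    """Chance that at least one of n_trials independent windows
    shows a fluctuation at least as extreme as the local one."""
    # Equals 1 - (1 - local_p) ** n_trials, computed stably for tiny p.
    return -math.expm1(n_trials * math.log1p(-local_p))


if __name__ == "__main__":
    n_windows = 100  # hypothetical number of independent mass windows
    for sigma in (3, 5):
        local = upper_tail(sigma)
        print(f"{sigma} sigma: local p = {local:.2e}, "
              f"global p over {n_windows} windows = {global_p(local, n_windows):.2e}")
```

With these made-up numbers a local 3 sigma bump is unremarkable once the whole range is scanned, while a local 5 sigma bump stays rare, which is the sense in which a stricter local threshold acts like a multiplicity adjustment.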
  • 26. “Game of Bump-Hunting” (Overbye) “One bump on physicists’ charts…was disappearing. But another was blooming like the shy girl at a dance. …. nobody could remember exactly when she had come in. But she was the one who would marry the prince.” “It continued to grow over the fall until it had reached the 3-sigma level — the chances of being a fluke [spurious significance] were less than 1 in 740, enough for physicists to admit it to the realm of “evidence” of something, but not yet a discovery.”
  • 27. Background knowledge of how flukes behave: • “If they were flukes, more data would make them fade into the statistical background, • If not, the bumps would grow in slow motion into a bona fide discovery.” • They give the bump a hard time, look at multiple decay channels, and don’t tell the details of where they found her to the other team. • When two independent experiments find the same particle signal at the same mass, it overcomes the multiple testing and gives a strong argument.
  • 28. (VII) Possible Anomalies for SM. They also follow up bumps indicating discrepancies with $H_0^{2}$: SM Higgs boson: μ = 1. Hints of anomalies with the “plain vanilla” particle of the Standard Model (viewed as tests or corresponding interval estimates). Even a year later they examined these anomalies with more data.
  • 29. Curb your enthusiasm. Matt Strassler: “The excess (in favor of BSM properties) became a bit smaller each time…. That’s an unfortunate sign, if one is hoping the excess isn’t just a statistical fluke.” Or they’d see the bump at ATLAS… and not CMS. “Taking all of the data, and not cherry picking…there’s nothing here that you can call “evidence” for the much sought BSM.” (Strassler) Considering the frequent flukes, and the hot competition between ATLAS and CMS to be first, a tool for when to “curb their enthusiasm” seems exactly what was wanted.
  • 30. So, this “negative” portion involves: (a) denying BSM anomalies are real (b) setting upper bounds for these discrepancies with the SM Higgs. Each with its own test statistic and evidence g(x0). $H_0^{2}$: SM Higgs boson: μ = 1. Failing to reject the null isn’t evidence for it, but they could set upper bounds.
  • 31. Test Principle #2 (for non-significance): Data provide evidence to rule out a discrepancy δ∗ to the extent that a larger g(x0) would very probably have resulted if δ were as great as δ∗. Detach δ < δ∗ (could equivalently be viewed as inferring a confidence interval estimate δ = g(x0) + ε). So these tools seem just the thing for this research.
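Test Principle #2 can be given a number in a deliberately simplified setting. The sketch below assumes the statistic g(X) is normally distributed around the true discrepancy δ with a known standard deviation `sigma`; that model, the function name `severity_upper_bound`, and the example inputs are illustrative assumptions rather than anything taken from the Higgs analyses.

```python
import math


def upper_tail(z: float) -> float:
    """One-sided tail area Pr(Z > z) for a standard normal Z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def severity_upper_bound(g_obs: float, delta_star: float, sigma: float = 1.0) -> float:
    """Pr(g(X) > g_obs; delta = delta_star) under the assumed normal model.

    A value near 1 means a larger g would very probably have been seen
    were the discrepancy as great as delta_star, so 'delta < delta_star'
    is well warranted by the observed g_obs.
    """
    return upper_tail((g_obs - delta_star) / sigma)


if __name__ == "__main__":
    g_obs = 1.0  # hypothetical observed statistic, in sigma units
    for delta_star in (1.0, 2.0, 3.0, 4.0):
        sev = severity_upper_bound(g_obs, delta_star)
        print(f"claim delta < {delta_star}: severity = {sev:.4f}")
```

Reading the output from top to bottom gives the gradation the slide describes: bounds only slightly above the observed value are poorly warranted, while bounds several sigma above it pass with high probability.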
  • 32. 11/5 32 (VIII) Conclusion O’Hagan published a digest of responses a few days later • “They surely would be willing to announce SM Higgs discovery if they were 99.99% certain of the existence of the SM Higgs” (and avoid the ad hoc 5 sigma) Pr(SM Higgs) = .9999 • It would require prior probabilities to “SM Higgs” claim, and prior distribution on the numerous “nuisance” parameters of the background and the signal. • Multivariate priors, correlations between parameters, joint priors, and the catchall: P(data|not H*)
  • 33. 11/5 33 • Even if all that were done and agreed upon, it would not have given the kind of tools needed to find things out Worse: spiked priors Pr(No SM Higgs)= Pr(SM Higgs)=.5 (not uninformative) • Physicists believed in SM Higgs before building the big collider, given the perfect predictive success of SM, its simplicity–very different than having evidence for a discovery. • Others may believe (and fervently wish) that it will break down somewhere.
  • 34. P-value police: Those who think we want a posterior probability in H* might be sliding from what may be inferred from this legitimate high probability: Pr(test T would not reach 5 sigma; H0) > .9999997. With probability .9999997, our methods would show that the bumps disappear, under the assumption that the data are due to background H0. They don’t disappear but grow. Infer H*, qualified by the test properties.
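The probability quoted on this slide can be checked with a short calculation. The sketch assumes, as O’Hagan’s letter does, that the test statistic is standard normal under H0; the function name is illustrative.

```python
import math

def one_sided_p_value(n_sigma: float) -> float:
    """Pr(Z >= n_sigma) for a standard normal Z (assumed distribution under H0)."""
    return 0.5 * math.erfc(n_sigma / math.sqrt(2.0))

p5 = one_sided_p_value(5.0)   # chance that background alone reaches 5 sigma
not_reached = 1.0 - p5        # Pr(test T would not reach 5 sigma; H0)

print(f"one-sided p-value at 5 sigma: {p5:.3e}")
print(f"Pr(would not reach 5 sigma; H0): {not_reached:.9f}")
print(f"two-sided p-value at 5 sigma: {2.0 * p5:.3e}")
```

The one-sided tail is a little under 3 in 10 million, which is where the bound > .9999997 comes from. Doubling it gives the two-sided figure of “around 0.0000005” quoted in O’Hagan’s letter.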
  • 35. What’s passed with high severity? H*: a Higgs boson consistent with the SM (at the levels of precision and accuracy of these experiments). An adequate account should also always report alternatives that have not been well ruled out: • The measurements are not precise enough to rule out discrepancies from a SM Higgs as large as 10%, 20%, 50%. • There are rivals to the SM that would not have been distinguishable with the given data (which went through a lot of filtering and triggering rules). They will get more data in 2015, and there’s talk of a more precise detector being built.
  • 36. REFERENCES (Online links) • ATLAS report: http://cds.cern.ch/record/1494183/files/ATLAS-CONF-2012-162.pdf • ATLAS Higgs experiment, public results: https://twiki.cern.ch/twiki/bin/view/AtlasPublic/HiggsPublicResults • CMS Higgs experiment, public results: https://twiki.cern.ch/twiki/bin/view/CMSPublic/PhysicsResultsHIG • Mayo, D. G. and Cox, D. R. (2010). “Frequentist Statistics as a Theory of Inductive Inference” in Error and Inference: Recent Exchanges on Experimental Reasoning, Reliability and the Objectivity and Rationality of Science (D. Mayo and A. Spanos eds.), Cambridge: Cambridge University Press: 1-27. This paper appeared in The Second Erich L. Lehmann Symposium: Optimality, 2006, Lecture Notes-Monograph Series, Volume 49, Institute of Mathematical Statistics, pp. 247-275.
  • 37. • Cousins, R. (2014). “The Jeffreys-Lindley Paradox and Discovery Criteria in High Energy Physics”: http://arxiv.org/abs/1310.3791 • O’Hagan letter: § Original letter with responses: http://bayesian.org/forums/news/3648 § 1st link in a group of discussions of the letter: http://errorstatistics.com/2012/07/11/is-particle-physics-bad-science/ • Overbye, D. (March 15, 2013) “Chasing the Higgs,” New York Times: http://www.nytimes.com/2013/03/05/science/chasing-the-higgs-boson-how-2-teams-of-rivals-at-CERN-searched-for-physics-most-elusive-particle.html?pagewanted=all&_r=0
  • 38. • Spiegelhalter, D. (August 7, 2012) blog, Understanding Uncertainty, “Explaining 5 sigma for the Higgs: how well did they do?”: http://understandinguncertainty.org/explaining-5-sigma-higgs-how-well-did-they-do • Strassler, M. (July 2, 2013) blog, Of Particular Significance, “A Second Higgs Particle”: http://profmattstrassler.com/2013/07/02/a-second-higgs-particle/ • Wasserman, L. (July 11, 2012) blog, Normal Deviate, “The Higgs Boson and the P-Value Police”: http://normaldeviate.wordpress.com/2012/07/11/the-higgs-boson-and-the-p-value-police/