2nd half of “Frequentist Statistics as a Theory of Inductive
Inference” (Selection Effects)
The idealized formulation in the initial definition of a significance
test starts with a hypothesis and a test statistic, obtains data, then
applies the test and looks at the outcome:
• The hypothetical procedure involved in the definition of the
test then matches reasonably closely what was done;
• The possible outcomes are the different possible values of the
specified test statistic.
• This permits features of the distribution of the test statistic to
be relevant for learning about aspects of the mechanism
generating the data.

It often happens that either the null hypothesis or the test statistic
is influenced by preliminary inspection of the data, so that the
actual procedure generating the final test result is altered:*
• This may alter the capabilities of the test to detect
discrepancies from the null hypotheses reliably, calling for
adjustments in its error probabilities.
• This is required to ensure that the p-values serve their intended
purpose for frequentist inference, whether in behavioral or
evidential contexts.
* The objective of the test is to enable us to learn something about
the underlying data generating mechanism, and this learning is
made possible by correctly assessing the actual error probabilities.

Ad hoc Hypotheses, Non-novel Data, Double-Counting, etc.
The general point involved has been discussed extensively in both
philosophical and statistical literatures.
• In the former, under such headings as requiring novelty or
avoiding ad hoc hypotheses (use-constructions, etc.).
• In the latter, as rules against peeking at the data, shopping
for significance, data mining, etc., that is, rules for taking selection effects into
account.
(This will come up again throughout the semester. Optional
stopping is an example of a data-dependent strategy, as with “look
elsewhere” effects in the Higgs research.)
These problems remain unresolved in general.

Error statistical considerations, coupled with a sound principle of
inductive evidence, may allow going further by providing criteria
for when various data-dependent selections matter and how to take
account of their influence on error probabilities.
(Some items by Mayo and Spanos: “How to discount double
counting when it counts,” “Some surprising facts about surprising
facts,” chapters 7, 8, 9 of EGEK, especially 8; “hunting without a
license,” Spanos.)

In particular, if the null hypothesis is chosen for testing just because
the test statistic is large, the probability of finding some such
discordance or other may be high even under the null.
Thus, following FEV, we would not have genuine evidence of
inconsistency with the null, and unless the p-value is modified
accordingly, the inference would be misleading.

Example 1: Hunting for Statistical Significance
Investigators have 20 independent sets of data, each reporting on
different but closely related effects.
After doing all 20 tests, with 20 nulls, H0i, i = 1, …, 20,
they report only the smallest p-value, e.g., 0.05, and its
corresponding null hypothesis, say H013.
e.g., there is no difference between some treatment (a childhood
training regimen) and a factor, f13 (some personality characteristic
later in life).
Passages from EGEK (Morrison and Henkel)

This “hunting” procedure should be compared with a case where
H013 was preset as the single hypothesis to test, and the small p-value found.
• In the hunting case, the possible results are the possible
statistically significant factors that might be found to show a
"calculated" statistically significant departure from the null. The
relevant type 1 error probability is the probability of finding at
least one such significant difference out of 20, even though the
global null is true (i.e., all twenty observed differences are due
to chance).
• The probability that this procedure yields erroneous rejection
differs from, and will be much greater than, 0.05 (and is
approximately 0.64).

• There are different, and indeed many more, ways one can err in
this example than when one null is preset, and this is reflected
in the adjusted p-value.
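As a rough illustration of the adjustment just described, the sketch below computes the p-value for the procedure "report the smallest of n p-values", assuming (as Example 1 stipulates) that the tests are independent. The function name `adjusted_p_value` is hypothetical and is not taken from the slides.

```python
def adjusted_p_value(smallest_p: float, n_tests: int) -> float:
    """Chance that the smallest of n_tests independent p-values is at or
    below smallest_p when every null hypothesis is true (assumes independence)."""
    return 1.0 - (1.0 - smallest_p) ** n_tests


if __name__ == "__main__":
    # Hunting case: the smallest of 20 p-values is reported as 0.05.
    print(round(adjusted_p_value(0.05, 20), 2))  # 0.64
    # Preset case: a single null fixed in advance, so no adjustment is needed.
    print(round(adjusted_p_value(0.05, 1), 2))   # 0.05
```

With one preset null the nominal and actual levels agree; with 20 nulls searched, the same nominal 0.05 corresponds to an actual level of about 0.64.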
My blog (reblogged March 3, 2014)
Hardly a day goes by where I do not come across an article on the
problems for statistical inference based on fallaciously capitalizing
on chance: high-powered computer searches and “big” data trolling
offer rich hunting grounds out of which apparently impressive
results may be “cherry-picked”:
When the hypotheses are tested on the same data that suggested
them and when tests of significance are based on such data, then a
spurious impression of validity may result. The computed level of
significance may have almost no relation to the true level. . . .

Suppose that twenty sets of differences have been examined, that
one difference seems large enough to test and that this difference
turns out to be “significant at the 5 percent level.” Does this mean
that differences as large as the one tested would occur by chance
only 5 percent of the time when the true difference is zero? The
answer is no, because the difference tested has been selected from
the twenty differences that were examined. The actual level of
significance is not 5 percent, but 64 percent! (Selvin 1970, 104)[1]
Critics of the Morrison and Henkel ilk clearly report that to ignore
a variety of “selection effects” results in a fallacious computation
of the actual significance level associated with a given inference;
the “computed” or “nominal” significance level differs from the
actual or warranted significance level.

[1] Selvin calculates this approximately by considering the
probability of finding at least one statistically significant difference
at the .05 level when 20 independent samples are drawn from
populations having true differences of zero, 1 – P(no such
difference): 1 – (.95)^20 ≈ 1 – .36 = .64. This assumes, unrealistically,
independent samples, but without that it may be unclear how to
even approximately compute actual p-values.
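Selvin's figure can also be checked by simulation. The sketch below is only illustrative: it assumes independent tests with continuous test statistics, so that each p-value is uniform on (0, 1) under its null. The function name and the trial count are arbitrary choices, not part of the original.

```python
import random


def simulate_hunting(n_tests: int = 20, alpha: float = 0.05,
                     n_trials: int = 100_000, seed: int = 0) -> float:
    """Estimate how often at least one of n_tests independent tests is
    nominally significant at level alpha when all nulls are true."""
    rng = random.Random(seed)
    hunts_with_a_find = 0
    for _ in range(n_trials):
        # Under a true null, a p-value is a uniform draw on (0, 1).
        if any(rng.random() < alpha for _ in range(n_tests)):
            hunts_with_a_find += 1
    return hunts_with_a_find / n_trials


if __name__ == "__main__":
    print(simulate_hunting())  # close to 1 - 0.95**20, i.e. roughly 0.64
```

If the 20 samples were dependent, the uniform draws would have to be replaced by a model of that dependence, which is the difficulty the footnote points to.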
This influence on long-run error is well known, but should this
influence the interpretation of the result in a context of inductive
inference?
According to frequentist or severity reasoning it should.
It is not so easy to explain why:

The concern is not the avoidance of often announcing genuine
effects erroneously in a series; the concern is that this test performs
poorly as a tool for discriminating genuine from chance effects in
this case.
• Because at least one such impressive departure, we know, is
common even if all are due to chance, the test has scarcely
reassured us that it has done a good job of avoiding such a
mistake in this case.
• Even if there are other grounds for believing the genuineness of
the one effect that is found, we deny that this test alone has
supplied such evidence.

The "hunting procedure" does a very poor job in alerting us to the need,
in effect, to temper our enthusiasm, even where such tempering is
warranted.
• If the p-value is adjusted to reflect the actual error rate, the
test again becomes a tool that serves this purpose.

Example 2. Hunting for a Murderer
(hunting for the source of a known effect by eliminative induction)
To test for a DNA match with a given specimen, known to be that
of the murderer, a data-base of possible matches is searched one
individual at a time.
We are told, in a fairly well-known presentation of this case, that:
P(DNA match; not murderer) = very small
P(DNA match; murderer) ≈ 1
The first individual, if any, from the data-base for which a match is
found is declared to truly match the criminal, i.e., to be the
murderer.

(The null hypothesis, in effect, asserts that the person tested does
NOT “match the criminal”; so the null is rejected iff there is an
observed DNA match.)
Example 2 is superficially similar to Example 1, finding a DNA
match being somewhat akin to finding a statistically significant
departure from a null hypothesis: one searches through data and
concentrates on the one case where a "match" with the criminal's
DNA is found, ignoring the non-matches.
If one adjusts for "hunting" in Example 1, shouldn't one do so
in broadly the same way in Example 2?

No!
(Although some have erroneously supposed frequentists say “yes”)
In Example 1 the concern is inferring a genuine, “reproducible”
effect, when in fact no such effect exists;
In Example 2, there is a known effect or specific event, the
criminal's DNA, and reliable procedures are used to track down the
specific cause or source (as conveyed by the low "erroneous-match" rate).

• The probability is high that we would not obtain a match with
person i, if i were not the criminal; so, by FEV, finding the
match is excellent evidence that i is the criminal. Moreover,
each non-match found, by the stipulations of the example,
virtually excludes that person.
Note: the contrast in hunting for a DNA match is between finding a match
with the first person tested and finding one only after hunting through a
data-base.
• The more negative results found, the more the inferred "match"
is fortified; whereas in Example 1 this is not so.

Data-dependent Specification of distance or “cut-offs”
(case 1)
An analogy: the Texas Sharpshooter. This is testing a
sharpshooter's ability by having him shoot and then drawing a
bull's-eye around his results so as to yield the highest number of
bull's-eyes.
• The skill that one is allegedly testing and making inferences
about is his ability to shoot when the target is given and fixed,
while that is not the skill actually responsible for the resulting
score.
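A toy simulation can make the analogy concrete. Everything in the sketch below is an illustrative assumption rather than part of the original: an unskilled shooter whose shots land uniformly on a unit square, a bull's-eye of fixed radius, and a post hoc rule that tries the preset centre and every shot location as a candidate centre and keeps whichever yields the most hits.

```python
import random


def count_hits(shots, centre, radius):
    """Number of shots that fall inside the bull's-eye at the given centre."""
    cx, cy = centre
    return sum((x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2 for x, y in shots)


def sharpshooter_scores(n_shots: int = 20, radius: float = 0.1, seed: int = 1):
    """Return (preset_score, post_hoc_score) for one round of random shooting."""
    rng = random.Random(seed)
    # No skill at all: shots are uniform over the unit square.
    shots = [(rng.random(), rng.random()) for _ in range(n_shots)]

    # Target fixed in advance, before any shot is fired.
    preset_centre = (0.5, 0.5)
    preset_score = count_hits(shots, preset_centre, radius)

    # Target drawn after the fact, wherever it yields the highest count.
    candidates = [preset_centre] + shots
    post_hoc_score = max(count_hits(shots, c, radius) for c in candidates)
    return preset_score, post_hoc_score


if __name__ == "__main__":
    preset, post_hoc = sharpshooter_scores()
    print("preset target:", preset, "| post hoc target:", post_hoc)
```

The post hoc score can never fall below the preset one and is always at least 1, even though the shooter has no skill, so it says little about the ability to hit a target that is given and fixed.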

Case 2:
By contrast, if the choice of specification is guided not by
considerations of the statistical significance of departure from the
original null hypothesis, but rather by the fact that one specification is an empirically
adequate statistical model while the other violates assumptions, no
adjustment for selection is called for.
• Indeed, using a statistically adequate specification gives
reassurance that the calculated p-value is relevant for
interpreting the evidence reliably.

Need for Adjustments for Data-Dependent Selections
How does our conception of the frequentist theory of induction
help to guide the answers?
1. It must be considered whether the context is one where the key
concern is the control of error rates in a series of applications
(behavioristic goal), or whether it is a context of evaluating
specific evidence (inferential goal).
The relevant error probabilities may be altered for the former
context and not for the latter.
2. To determine the relevant hypothetical series on which to base
error frequencies one must identify the particular obstacles that
need to be avoided for a reliable inference in the particular case,
and the capacity of the test, as a measuring instrument, to have
revealed the presence of the obstacle.

Statistics in the Discovery of the Higgs
• Everyone was excited by the announced evidence for the
discovery of a standard model (SM) Higgs particle based on a
“5 sigma observed effect” (July 2012).
• But because this report links to the use of statistical significance
tests, some Bayesians raised criticisms.

• They want to ensure, before announcing the hypothesis H*:
“a SM Higgs boson has been discovered” (with such and such
properties), that
H* has been given a severe run for its money:
That with extremely high probability we would have observed
a smaller excess of signal-like events, were we in a universe
where:
H0: μ = 0 —background only hypothesis, vs.
So, very probably H0 would have survived a cluster of tests,
fortified with much cross-checking T, were μ = 0.

Note what’s being given a high probability:
Pr(test T would produce less than 5 sigma; H0) > .9999997.
With probability .9999997, our methods would show that the
bumps disappear (as so often occurred), under the assumption
data are due to background H0.
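The figure .9999997 can be recovered from the standard normal distribution. The sketch below assumes the usual one-sided convention, in which "5 sigma" refers to the upper tail area of a standard normal beyond 5; the function name is hypothetical.

```python
import math


def upper_tail_probability(n_sigma: float) -> float:
    """P(Z >= n_sigma) for a standard normal Z (one-sided tail area)."""
    return 0.5 * math.erfc(n_sigma / math.sqrt(2.0))


p = upper_tail_probability(5.0)
print(f"{p:.3g}")      # 2.87e-07: chance of 5 sigma or more under H0
print(f"{1 - p:.7f}")  # 0.9999997: chance of less than 5 sigma under H0
```

Note that both numbers are probabilities of test outcomes computed under H0; neither is a probability assigned to H0 or to H*.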
Assuming we want a posterior probability in H* seems to be a
slide from the value of knowing this probability is high for
assessing the warrant for H*.
Granted, this inference relies on an implicit severity principle
of evidence.

Data provide good evidence for inferring H (just) to the extent
that H passes severely with x0, i.e., to the extent that H would
(very probably) not have survived the test so well were H false.
They then quantify various properties of the particle discovered
(inferring ranges of magnitudes)

The p-value police
• A leading (subjective) Bayesian, Dennis Lindley, had a letter
sent around (to ISBA members):
• Why demand such strong evidence?
• (Could only be warranted if beliefs in the Higgs were extremely
low or costs of error exorbitant.)
• Are they so wedded to frequentist methods? Lindley asks.
“If so, has anyone tried to explain what bad science that is?”

• Other critics rushed in to examine whether reports (by journalists
and scientists) misinterpreted the sigma levels as posterior
probability assignments to the models.
• Many critics have claimed that the .99999 was fallaciously
being assigned to H* itself, as a posterior probability in H*.
• Surely there are misinterpretations, but many were not.
• What critics are doing is interpreting a legitimate error
probability as a posterior in H*: SM Higgs.

• Physicists did not assign a high probability to
H*: SM Higgs exists
(whatever it might mean).
• Besides, many believe in beyond the standard model
physics.
• One may say informally, “so probably we have experimentally
demonstrated an SM-like Higgs”.
• When you hear that what they really want are posterior
probabilities, ask: How are we to interpret prior probabilities?
Posterior probabilities?

This is a great methodological controversy in practice that
philosophers of science and evidence should be in on.
Our job is to clarify terms, is it not?

“The importance of philosophy of science for statistical science and vice versa”jemille6
 
Statistical Inference as Severe Testing: Beyond Performance and Probabilism
Statistical Inference as Severe Testing: Beyond Performance and ProbabilismStatistical Inference as Severe Testing: Beyond Performance and Probabilism
Statistical Inference as Severe Testing: Beyond Performance and Probabilismjemille6
 
D. Mayo JSM slides v2.pdf
D. Mayo JSM slides v2.pdfD. Mayo JSM slides v2.pdf
D. Mayo JSM slides v2.pdfjemille6
 
reid-postJSM-DRC.pdf
reid-postJSM-DRC.pdfreid-postJSM-DRC.pdf
reid-postJSM-DRC.pdfjemille6
 
Errors of the Error Gatekeepers: The case of Statistical Significance 2016-2022
Errors of the Error Gatekeepers: The case of Statistical Significance 2016-2022Errors of the Error Gatekeepers: The case of Statistical Significance 2016-2022
Errors of the Error Gatekeepers: The case of Statistical Significance 2016-2022jemille6
 
Causal inference is not statistical inference
Causal inference is not statistical inferenceCausal inference is not statistical inference
Causal inference is not statistical inferencejemille6
 
What are questionable research practices?
What are questionable research practices?What are questionable research practices?
What are questionable research practices?jemille6
 
What's the question?
What's the question? What's the question?
What's the question? jemille6
 
The neglected importance of complexity in statistics and Metascience
The neglected importance of complexity in statistics and MetascienceThe neglected importance of complexity in statistics and Metascience
The neglected importance of complexity in statistics and Metasciencejemille6
 
Mathematically Elegant Answers to Research Questions No One is Asking (meta-a...
Mathematically Elegant Answers to Research Questions No One is Asking (meta-a...Mathematically Elegant Answers to Research Questions No One is Asking (meta-a...
Mathematically Elegant Answers to Research Questions No One is Asking (meta-a...jemille6
 
On Severity, the Weight of Evidence, and the Relationship Between the Two
On Severity, the Weight of Evidence, and the Relationship Between the TwoOn Severity, the Weight of Evidence, and the Relationship Between the Two
On Severity, the Weight of Evidence, and the Relationship Between the Twojemille6
 
Revisiting the Two Cultures in Statistical Modeling and Inference as they rel...
Revisiting the Two Cultures in Statistical Modeling and Inference as they rel...Revisiting the Two Cultures in Statistical Modeling and Inference as they rel...
Revisiting the Two Cultures in Statistical Modeling and Inference as they rel...jemille6
 
Comparing Frequentists and Bayesian Control of Multiple Testing
Comparing Frequentists and Bayesian Control of Multiple TestingComparing Frequentists and Bayesian Control of Multiple Testing
Comparing Frequentists and Bayesian Control of Multiple Testingjemille6
 
Good Data Dredging
Good Data DredgingGood Data Dredging
Good Data Dredgingjemille6
 
The Duality of Parameters and the Duality of Probability
The Duality of Parameters and the Duality of ProbabilityThe Duality of Parameters and the Duality of Probability
The Duality of Parameters and the Duality of Probabilityjemille6
 
The Statistics Wars and Their Causalities (refs)
The Statistics Wars and Their Causalities (refs)The Statistics Wars and Their Causalities (refs)
The Statistics Wars and Their Causalities (refs)jemille6
 
The Statistics Wars and Their Casualties (w/refs)
The Statistics Wars and Their Casualties (w/refs)The Statistics Wars and Their Casualties (w/refs)
The Statistics Wars and Their Casualties (w/refs)jemille6
 
On the interpretation of the mathematical characteristics of statistical test...
On the interpretation of the mathematical characteristics of statistical test...On the interpretation of the mathematical characteristics of statistical test...
On the interpretation of the mathematical characteristics of statistical test...jemille6
 
The role of background assumptions in severity appraisal (
The role of background assumptions in severity appraisal (The role of background assumptions in severity appraisal (
The role of background assumptions in severity appraisal (jemille6
 
The two statistical cornerstones of replicability: addressing selective infer...
The two statistical cornerstones of replicability: addressing selective infer...The two statistical cornerstones of replicability: addressing selective infer...
The two statistical cornerstones of replicability: addressing selective infer...jemille6
 

More from jemille6 (20)

“The importance of philosophy of science for statistical science and vice versa”
“The importance of philosophy of science for statistical science and vice versa”“The importance of philosophy of science for statistical science and vice versa”
“The importance of philosophy of science for statistical science and vice versa”
 
Statistical Inference as Severe Testing: Beyond Performance and Probabilism
Statistical Inference as Severe Testing: Beyond Performance and ProbabilismStatistical Inference as Severe Testing: Beyond Performance and Probabilism
Statistical Inference as Severe Testing: Beyond Performance and Probabilism
 
D. Mayo JSM slides v2.pdf
D. Mayo JSM slides v2.pdfD. Mayo JSM slides v2.pdf
D. Mayo JSM slides v2.pdf
 
reid-postJSM-DRC.pdf
reid-postJSM-DRC.pdfreid-postJSM-DRC.pdf
reid-postJSM-DRC.pdf
 
Errors of the Error Gatekeepers: The case of Statistical Significance 2016-2022
Errors of the Error Gatekeepers: The case of Statistical Significance 2016-2022Errors of the Error Gatekeepers: The case of Statistical Significance 2016-2022
Errors of the Error Gatekeepers: The case of Statistical Significance 2016-2022
 
Causal inference is not statistical inference
Causal inference is not statistical inferenceCausal inference is not statistical inference
Causal inference is not statistical inference
 
What are questionable research practices?
What are questionable research practices?What are questionable research practices?
What are questionable research practices?
 
What's the question?
What's the question? What's the question?
What's the question?
 
The neglected importance of complexity in statistics and Metascience
The neglected importance of complexity in statistics and MetascienceThe neglected importance of complexity in statistics and Metascience
The neglected importance of complexity in statistics and Metascience
 
Mathematically Elegant Answers to Research Questions No One is Asking (meta-a...
Mathematically Elegant Answers to Research Questions No One is Asking (meta-a...Mathematically Elegant Answers to Research Questions No One is Asking (meta-a...
Mathematically Elegant Answers to Research Questions No One is Asking (meta-a...
 
On Severity, the Weight of Evidence, and the Relationship Between the Two
On Severity, the Weight of Evidence, and the Relationship Between the TwoOn Severity, the Weight of Evidence, and the Relationship Between the Two
On Severity, the Weight of Evidence, and the Relationship Between the Two
 
Revisiting the Two Cultures in Statistical Modeling and Inference as they rel...
Revisiting the Two Cultures in Statistical Modeling and Inference as they rel...Revisiting the Two Cultures in Statistical Modeling and Inference as they rel...
Revisiting the Two Cultures in Statistical Modeling and Inference as they rel...
 
Comparing Frequentists and Bayesian Control of Multiple Testing
Comparing Frequentists and Bayesian Control of Multiple TestingComparing Frequentists and Bayesian Control of Multiple Testing
Comparing Frequentists and Bayesian Control of Multiple Testing
 
Good Data Dredging
Good Data DredgingGood Data Dredging
Good Data Dredging
 
The Duality of Parameters and the Duality of Probability
The Duality of Parameters and the Duality of ProbabilityThe Duality of Parameters and the Duality of Probability
The Duality of Parameters and the Duality of Probability
 
The Statistics Wars and Their Causalities (refs)
The Statistics Wars and Their Causalities (refs)The Statistics Wars and Their Causalities (refs)
The Statistics Wars and Their Causalities (refs)
 
The Statistics Wars and Their Casualties (w/refs)
The Statistics Wars and Their Casualties (w/refs)The Statistics Wars and Their Casualties (w/refs)
The Statistics Wars and Their Casualties (w/refs)
 
On the interpretation of the mathematical characteristics of statistical test...
On the interpretation of the mathematical characteristics of statistical test...On the interpretation of the mathematical characteristics of statistical test...
On the interpretation of the mathematical characteristics of statistical test...
 
The role of background assumptions in severity appraisal (
The role of background assumptions in severity appraisal (The role of background assumptions in severity appraisal (
The role of background assumptions in severity appraisal (
 
The two statistical cornerstones of replicability: addressing selective infer...
The two statistical cornerstones of replicability: addressing selective infer...The two statistical cornerstones of replicability: addressing selective infer...
The two statistical cornerstones of replicability: addressing selective infer...
 

Recently uploaded

Types of Journalistic Writing Grade 8.pptx
Types of Journalistic Writing Grade 8.pptxTypes of Journalistic Writing Grade 8.pptx
Types of Journalistic Writing Grade 8.pptxEyham Joco
 
Presiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha electionsPresiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha electionsanshu789521
 
Solving Puzzles Benefits Everyone (English).pptx
Solving Puzzles Benefits Everyone (English).pptxSolving Puzzles Benefits Everyone (English).pptx
Solving Puzzles Benefits Everyone (English).pptxOH TEIK BIN
 
How to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptxHow to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptxmanuelaromero2013
 
CELL CYCLE Division Science 8 quarter IV.pptx
CELL CYCLE Division Science 8 quarter IV.pptxCELL CYCLE Division Science 8 quarter IV.pptx
CELL CYCLE Division Science 8 quarter IV.pptxJiesonDelaCerna
 
Painted Grey Ware.pptx, PGW Culture of India
Painted Grey Ware.pptx, PGW Culture of IndiaPainted Grey Ware.pptx, PGW Culture of India
Painted Grey Ware.pptx, PGW Culture of IndiaVirag Sontakke
 
CARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptxCARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptxGaneshChakor2
 
Meghan Sutherland In Media Res Media Component
Meghan Sutherland In Media Res Media ComponentMeghan Sutherland In Media Res Media Component
Meghan Sutherland In Media Res Media ComponentInMediaRes1
 
Earth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatEarth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatYousafMalik24
 
Capitol Tech U Doctoral Presentation - April 2024.pptx
Capitol Tech U Doctoral Presentation - April 2024.pptxCapitol Tech U Doctoral Presentation - April 2024.pptx
Capitol Tech U Doctoral Presentation - April 2024.pptxCapitolTechU
 
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17Incoming and Outgoing Shipments in 1 STEP Using Odoo 17
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17Celine George
 
Historical philosophical, theoretical, and legal foundations of special and i...
Historical philosophical, theoretical, and legal foundations of special and i...Historical philosophical, theoretical, and legal foundations of special and i...
Historical philosophical, theoretical, and legal foundations of special and i...jaredbarbolino94
 
Hierarchy of management that covers different levels of management
Hierarchy of management that covers different levels of managementHierarchy of management that covers different levels of management
Hierarchy of management that covers different levels of managementmkooblal
 
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptxECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptxiammrhaywood
 
Biting mechanism of poisonous snakes.pdf
Biting mechanism of poisonous snakes.pdfBiting mechanism of poisonous snakes.pdf
Biting mechanism of poisonous snakes.pdfadityarao40181
 
KSHARA STURA .pptx---KSHARA KARMA THERAPY (CAUSTIC THERAPY)————IMP.OF KSHARA ...
KSHARA STURA .pptx---KSHARA KARMA THERAPY (CAUSTIC THERAPY)————IMP.OF KSHARA ...KSHARA STURA .pptx---KSHARA KARMA THERAPY (CAUSTIC THERAPY)————IMP.OF KSHARA ...
KSHARA STURA .pptx---KSHARA KARMA THERAPY (CAUSTIC THERAPY)————IMP.OF KSHARA ...M56BOOKSTORE PRODUCT/SERVICE
 
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPTECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPTiammrhaywood
 
Framing an Appropriate Research Question 6b9b26d93da94caf993c038d9efcdedb.pdf
Framing an Appropriate Research Question 6b9b26d93da94caf993c038d9efcdedb.pdfFraming an Appropriate Research Question 6b9b26d93da94caf993c038d9efcdedb.pdf
Framing an Appropriate Research Question 6b9b26d93da94caf993c038d9efcdedb.pdfUjwalaBharambe
 

Recently uploaded (20)

Types of Journalistic Writing Grade 8.pptx
Types of Journalistic Writing Grade 8.pptxTypes of Journalistic Writing Grade 8.pptx
Types of Journalistic Writing Grade 8.pptx
 
Presiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha electionsPresiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha elections
 
TataKelola dan KamSiber Kecerdasan Buatan v022.pdf
TataKelola dan KamSiber Kecerdasan Buatan v022.pdfTataKelola dan KamSiber Kecerdasan Buatan v022.pdf
TataKelola dan KamSiber Kecerdasan Buatan v022.pdf
 
Solving Puzzles Benefits Everyone (English).pptx
Solving Puzzles Benefits Everyone (English).pptxSolving Puzzles Benefits Everyone (English).pptx
Solving Puzzles Benefits Everyone (English).pptx
 
How to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptxHow to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptx
 
CELL CYCLE Division Science 8 quarter IV.pptx
CELL CYCLE Division Science 8 quarter IV.pptxCELL CYCLE Division Science 8 quarter IV.pptx
CELL CYCLE Division Science 8 quarter IV.pptx
 
Painted Grey Ware.pptx, PGW Culture of India
Painted Grey Ware.pptx, PGW Culture of IndiaPainted Grey Ware.pptx, PGW Culture of India
Painted Grey Ware.pptx, PGW Culture of India
 
CARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptxCARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptx
 
Meghan Sutherland In Media Res Media Component
Meghan Sutherland In Media Res Media ComponentMeghan Sutherland In Media Res Media Component
Meghan Sutherland In Media Res Media Component
 
Earth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatEarth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice great
 
Model Call Girl in Bikash Puri Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Bikash Puri  Delhi reach out to us at 🔝9953056974🔝Model Call Girl in Bikash Puri  Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Bikash Puri Delhi reach out to us at 🔝9953056974🔝
 
Capitol Tech U Doctoral Presentation - April 2024.pptx
Capitol Tech U Doctoral Presentation - April 2024.pptxCapitol Tech U Doctoral Presentation - April 2024.pptx
Capitol Tech U Doctoral Presentation - April 2024.pptx
 
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17Incoming and Outgoing Shipments in 1 STEP Using Odoo 17
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17
 
Historical philosophical, theoretical, and legal foundations of special and i...
Historical philosophical, theoretical, and legal foundations of special and i...Historical philosophical, theoretical, and legal foundations of special and i...
Historical philosophical, theoretical, and legal foundations of special and i...
 
Hierarchy of management that covers different levels of management
Hierarchy of management that covers different levels of managementHierarchy of management that covers different levels of management
Hierarchy of management that covers different levels of management
 
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptxECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
 
Biting mechanism of poisonous snakes.pdf
Biting mechanism of poisonous snakes.pdfBiting mechanism of poisonous snakes.pdf
Biting mechanism of poisonous snakes.pdf
 
KSHARA STURA .pptx---KSHARA KARMA THERAPY (CAUSTIC THERAPY)————IMP.OF KSHARA ...
KSHARA STURA .pptx---KSHARA KARMA THERAPY (CAUSTIC THERAPY)————IMP.OF KSHARA ...KSHARA STURA .pptx---KSHARA KARMA THERAPY (CAUSTIC THERAPY)————IMP.OF KSHARA ...
KSHARA STURA .pptx---KSHARA KARMA THERAPY (CAUSTIC THERAPY)————IMP.OF KSHARA ...
 
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPTECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
 
Framing an Appropriate Research Question 6b9b26d93da94caf993c038d9efcdedb.pdf
Framing an Appropriate Research Question 6b9b26d93da94caf993c038d9efcdedb.pdfFraming an Appropriate Research Question 6b9b26d93da94caf993c038d9efcdedb.pdf
Framing an Appropriate Research Question 6b9b26d93da94caf993c038d9efcdedb.pdf
 

Mayo: 2nd half “Frequentist Statistics as a Theory of Inductive Inference” (Selection Effects)

  • 1. 2nd half of “Frequentist Statistics as a Theory of Inductive Inference” (Selection Effects). The idealized formulation in the initial definition of a significance test starts with a hypothesis and a test statistic, obtains data, then applies the test and looks at the outcome:
      • The hypothetical procedure involved in the definition of the test then matches reasonably closely what was done.
      • The possible outcomes are the different possible values of the specified test statistic.
      • This permits features of the distribution of the test statistic to be relevant for learning about aspects of the mechanism generating the data.
  • 2. It often happens that either the null hypothesis or the test statistic is influenced by preliminary inspection of the data, so that the actual procedure generating the final test result is altered:*
      • This may alter the capability of the test to detect discrepancies from the null hypothesis reliably, calling for adjustments in its error probabilities.
      • This is required to ensure that the p-values serve their intended purpose for frequentist inference, whether in behavioral or evidential contexts.
      * The objective of the test is to enable us to learn something about the underlying data generating mechanism, and this learning is made possible by correctly assessing the actual error probabilities.
  • 3. Ad hoc Hypotheses, Non-novel Data, Double-Counting, etc. The general point involved has been discussed extensively in both the philosophical and the statistical literatures.
      • In the former, under such headings as requiring novelty or avoiding ad hoc hypotheses (use-constructions, etc.).
      • In the latter, as rules against peeking at the data, shopping for significance, data mining, etc., and as rules for taking selection effects into account.
      (This will come up again throughout the semester. Optional stopping is an example of a data dependent strategy, as with “look elsewhere” effects in the Higgs research.) These problems remain unresolved in general.
  • 4. Error statistical considerations, coupled with a sound principle of inductive evidence, may allow going further by providing criteria for when various data dependent selections matter and how to take account of their influence on error probabilities. (Some items of Mayo and Spanos: “How to discount double counting when it counts”; “Some surprising facts about surprising facts”; chapters 7, 8, 9 of EGEK, especially 8; “hunting without a license”, Spanos.)
  • 5. In particular, if the null hypothesis is chosen for testing just because the test statistic is large, the probability of finding some such discordance or other may be high even under the null. Thus, following FEV, we would not have genuine evidence of inconsistency with the null, and unless the p-value is modified accordingly, the inference would be misleading.
  • 6. Example 1: Hunting for Statistical Significance. Investigators have 20 independent sets of data, each reporting on different but closely related effects. After doing all 20 tests, with 20 nulls, H0i, i = 1, …, 20, they report only the smallest p-value, e.g., 0.05, and its corresponding null hypothesis, say H013: e.g., that there is no difference between some treatment (a childhood training regimen) and a factor, f13 (some personality characteristic later in life). Passages from EGEK (Morrison and Henkel).
  • 7. This “hunting” procedure should be compared with a case where H013 was preset as the single hypothesis to test, and the small p-value found.
      • In the hunting case, the possible results are the possible statistically significant factors that might be found to show a "calculated" statistically significant departure from the null. The relevant type 1 error probability is the probability of finding at least one such significant difference out of 20, even though the global null is true (i.e., all twenty observed differences are due to chance).
      • The probability that this procedure yields an erroneous rejection differs from, and will be much greater than, 0.05 (it is approximately 0.64).
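The 0.64 figure for the hunting procedure can be checked numerically. The sketch below is illustrative only: it assumes the 20 tests are independent and that each p-value is uniformly distributed on [0, 1] under its null (true for continuous test statistics). The function names are hypothetical, not taken from the slides.

```python
import random


def familywise_error_rate(num_tests: int, alpha: float) -> float:
    """Probability of at least one p-value <= alpha among independent true nulls."""
    return 1.0 - (1.0 - alpha) ** num_tests


def simulate_hunting(num_tests: int, alpha: float, trials: int, seed: int = 0) -> float:
    """Monte Carlo estimate of how often the smallest of num_tests p-values is <= alpha."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Assumption: under a true null, each p-value is uniform on [0, 1].
        smallest = min(rng.random() for _ in range(num_tests))
        if smallest <= alpha:
            hits += 1
    return hits / trials


exact = familywise_error_rate(20, 0.05)
estimate = simulate_hunting(20, 0.05, trials=20_000)
print(f"closed form: {exact:.4f}, simulated: {estimate:.4f}")
```

Setting `num_tests=1` recovers the preset-hypothesis case, where the error probability is just the nominal 0.05.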
  • 8. There are different, and indeed many more, ways one can err in this example than when one null is preset, and this is reflected in the adjusted p-value. My blog (reblogged March 3, 2014): Hardly a day goes by where I do not come across an article on the problems for statistical inference based on fallaciously capitalizing on chance: high-powered computer searches and “big” data trolling offer rich hunting grounds out of which apparently impressive results may be “cherry-picked”: When the hypotheses are tested on the same data that suggested them and when tests of significance are based on such data, then a spurious impression of validity may result. The computed level of significance may have almost no relation to the true level. . . .
  • 9. Suppose that twenty sets of differences have been examined, that one difference seems large enough to test and that this difference turns out to be “significant at the 5 percent level.” Does this mean that differences as large as the one tested would occur by chance only 5 percent of the time when the true difference is zero? The answer is no, because the difference tested has been selected from the twenty differences that were examined. The actual level of significance is not 5 percent, but 64 percent! (Selvin 1970, 104)[1] Critics of the Morrison and Henkel ilk clearly report that to ignore a variety of “selection effects” results in a fallacious computation of the actual significance level associated with a given inference; the “computed” or “nominal” significance level differs from the actual or warranted significance level.
  • 10. [1] Selvin calculates this approximately by considering the probability of finding at least one statistically significant difference at the .05 level when 20 independent samples are drawn from populations having true differences of zero, 1 – P(no such difference): 1 – (.95)^20 = 1 – .36 = .64. This assumes, unrealistically, independent samples, but without that assumption it may be unclear how to even approximately compute actual p-values. This influence on long-run error is well known, but should it influence the interpretation of the result in a context of inductive inference? According to frequentist or severity reasoning, it should. It is not so easy to explain why:
  • 11. The concern is not the avoidance of often announcing genuine effects erroneously in a series; the concern is that this test performs poorly as a tool for discriminating genuine from chance effects in this case.
      • Because at least one such impressive departure, we know, is common even if all are due to chance, the test has scarcely reassured us that it has done a good job of avoiding such a mistake in this case.
      • Even if there are other grounds for believing the genuineness of the one effect that is found, we deny that this test alone has supplied such evidence.
  • 12. The "hunting procedure" does a very poor job in alerting us to, in effect, temper our enthusiasm, even where such tempering is warranted.
      • If the p-value is adjusted to reflect the actual error rate, the test again becomes a tool that serves this purpose.
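One way such an adjustment might look, under the same independence assumption Selvin uses, is sketched below. The adjusted p-value is the probability, under the global null, that the smallest of m p-values is at least as small as the one reported (often called a Šidák-style adjustment; that label and the function names are additions here, not the slides' terminology).

```python
def adjusted_p_value(smallest_p: float, num_tests: int) -> float:
    """P(min of num_tests independent null p-values <= smallest_p)."""
    return 1.0 - (1.0 - smallest_p) ** num_tests


def nominal_level_needed(target_level: float, num_tests: int) -> float:
    """Per-test threshold at which the hunting procedure has the target actual level."""
    return 1.0 - (1.0 - target_level) ** (1.0 / num_tests)


# The reported p-value of 0.05, found by hunting through 20 tests:
print(f"adjusted p-value: {adjusted_p_value(0.05, 20):.2f}")

# How small the hunted p-value must be for an actual level of 0.05:
print(f"required nominal level: {nominal_level_needed(0.05, 20):.5f}")
```

Without independence this formula does not apply, which is the difficulty the Selvin footnote points to.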
  • 13. Example 2: Hunting for a Murderer (hunting for the source of a known effect by eliminative induction). Testing for a DNA match with a given specimen, known to be that of the murderer, a search through a data-base of possible matches is done one at a time. We are told, in a fairly well-known presentation of this case, that:
      P(DNA match; not murderer) = very small
      P(DNA match; murderer) ~ 1
      The first individual, if any, from the data-base for which a match is found is declared to truly match the criminal, i.e., to be the murderer.
  • 14. (The null hypothesis, in effect, asserts that the person tested does NOT “match the criminal”; so the null is rejected iff there is an observed DNA match.) Example 2 is superficially similar to Example 1, finding a DNA match being somewhat akin to finding a statistically significant departure from a null hypothesis: one searches through data and concentrates on the one case where a "match" with the criminal's DNA is found, ignoring the non-matches. If one adjusts for "hunting" in Example 1, shouldn't one do so in broadly the same way in Example 2?
  • 15. No! (Although some have erroneously supposed frequentists say “yes”.) In Example 1 the concern is inferring a genuine, "reproducible" effect, when in fact no such effect exists. In Example 2, there is a known effect or specific event, the criminal's DNA, and reliable procedures are used to track down the specific cause or source (as conveyed by the low "erroneous-match" rate).
  • 16. The probability is high that we would not obtain a match with person i, if i were not the criminal; so, by FEV, finding the match is excellent evidence that i is the criminal. Moreover, each non-match found, by the stipulations of the example, virtually excludes that person.
      Note: the contrast in hunting for a DNA match is between finding a match with the first person tested and hunting through a data-base.
      • The more negative results found, the more the inferred "match" is fortified; whereas in Example 1 this is not so.
  • 18. Data-dependent Specification of distance or “cut-offs” (case 1). An analogy: the Texas Sharpshooter. Testing a sharpshooter's ability by having him shoot and then drawing a bull's-eye around his results so as to yield the highest number of bull's-eyes.
      • The skill that one is allegedly testing and making inferences about is his ability to shoot when the target is given and fixed, while that is not the skill actually responsible for the resulting score.
  • 19. Case 2: By contrast, if the choice of specification is guided not by considerations of the statistical significance of departure from the original null hypothesis, but rather by the fact that one specification is an empirically adequate statistical model while the other violates its assumptions, no adjustment for selection is called for.
      • Indeed, using a statistically adequate specification gives reassurance that the calculated p-value is relevant for interpreting the evidence reliably.
  • 20. Need for Adjustments for Data-Dependent Selections. How does our conception of the frequentist theory of induction help to guide the answers?
      1. It must be considered whether the context is one where the key concern is the control of error rates in a series of applications (behavioristic goal), or whether it is a context of evaluating specific evidence (inferential goal). The relevant error probabilities may be altered for the former context and not for the latter.
      2. To determine the relevant hypothetical series on which to base error frequencies one must identify the particular obstacles that need to be avoided for a reliable inference in the particular case,
  • 21. and the capacity of the test, as a measuring instrument, to have revealed the presence of the obstacle.
  • 22. Statistics in the Discovery of the Higgs
      • Everyone was excited by the announced evidence for the discovery of a standard model (SM) Higgs particle based on a “5 sigma observed effect” (July 2012).
      • But because this report links to the use of statistical significance tests, some Bayesians raised criticisms.
  • 23. They want to ensure that, before announcing the hypothesis H*: “a SM Higgs boson has been discovered” (with such and such properties), H* has been given a severe run for its money: that with extremely high probability we would have observed a smaller excess of signal-like events, were we in a universe where: H0: μ = 0 (background-only hypothesis), vs. So, very probably H0 would have survived a cluster of tests T, fortified with much cross-checking, were μ = 0.
  • 24. Note what’s being given a high probability: Pr(test T would produce less than 5 sigma; H0) > .9999997. With probability .9999997, our methods would show that the bumps disappear (as so often occurred), under the assumption that the data are due to background H0. Assuming we want a posterior probability in H* seems to be a slide from the value of knowing this probability is high for assessing the warrant for H*. Granted, this inference relies on an implicit severity principle of evidence.
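The .9999997 figure can be reproduced as a standard normal tail area. This is a minimal sketch, assuming "5 sigma" is read in the usual one-sided way: the probability, under H0, that a standard normal test statistic exceeds 5. The function name is hypothetical.

```python
import math


def upper_tail_probability(sigma: float) -> float:
    """One-sided P(Z >= sigma) for a standard normal Z, via the complementary error function."""
    return 0.5 * math.erfc(sigma / math.sqrt(2.0))


p_five_sigma = upper_tail_probability(5.0)

# Probability under H0 of an excess at least as large as 5 sigma (an error probability of the test):
print(f"Pr(T >= 5 sigma; H0) = {p_five_sigma:.3e}")

# Probability under H0 that the test produces less than 5 sigma:
print(f"Pr(T < 5 sigma; H0) = {1.0 - p_five_sigma:.7f}")
```

Both numbers are probabilities of test outcomes computed under H0; neither is a probability assigned to H0 or to H*.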
  • 25. Data provide good evidence for inferring H (just) to the extent that H passes severely with x0, i.e., to the extent that H would (very probably) not have survived the test so well were H false. They then quantify various properties of the particle discovered (inferring ranges of magnitudes).
  • 26. The p-value police
      • Leading (subjective) Bayesian Dennis Lindley had a letter sent around (to ISBA members):
      • Why demand such strong evidence?
      • (Could only be warranted if beliefs in the Higgs were extremely low or costs of error exorbitant.)
      • Are they so wedded to frequentist methods? Lindley asks. “If so, has anyone tried to explain what bad science that is?”
  • 27.
      • Other critics rushed in to examine whether reports (by journalists and scientists) misinterpreted the sigma levels as posterior probability assignments to the models.
      • Many critics have claimed that the .99999 was fallaciously being assigned to H* itself, as a posterior probability in H*.
      • Surely there were misinterpretations, but many reports were not misinterpretations.
      • What critics are doing is interpreting a legitimate error probability as a posterior in H*: SM Higgs.
  • 28.
      • Physicists did not assign a high probability to H*: SM Higgs exists (whatever it might mean).
      • Besides, many believe in beyond the standard model physics.
      • One may say informally, “so probably we have experimentally demonstrated an SM-like Higgs”.
      • When you hear: what they really want are posterior probabilities, ask: How are we to interpret prior probabilities? Posterior probabilities?
  • 29. This is a great methodological controversy in practice that philosophers of science and evidence should be in on. Our job is to clarify terms, is it not?