SlideShare a Scribd company logo
1 of 31
Need to Reformulate Tests: P-values
Don’t Give an Effect Size
Severity function: SEV(Test T, data x, claim C)
• Tests are reformulated in terms of a discrepancy γ
from H0
• Instead of a binary cut-off (significant or not) the
particular outcome is used to infer discrepancies
that are or are not warranted
1
An Example of SEV (3.2 SIST)
1-sided normal testing
H0: μ ≤ 150 vs. H1: μ > 150 (Let σ = 10, n = 100)
let significance level α = .025
Reject H0 whenever M ≥ 150 + 2σ/√n: M ≥ 152
M is the sample mean, its value is M0.
1SE = σ/√n = 1
2
Rejection rules:
Reject iff M > 150 + 2SE (N-P)
In terms of the P-value:
Reject iff P-value ≤ .025 (Fisher)
(P-value a distance measure, but inverted)
Let M = 152, so I reject H0.
3
PRACTICE WITH P-VALUES
Let M = 152
Z = (152 – 150)/1 = 2
The P-value is Pr(Z > 2) = .025
4
PRACTICE WITH P-VALUES
Let M = 151
Z = (151 – 150)/1 = 1
The P-value is Pr(Z > 1) = .16
(Note: SEV (μ > 150) = .84 = 1 – P-value for this test)
5
PRACTICE WITH P-VALUES
Let M = 150.5
Z = (150.5 – 150)/1 = .5
The P-value is Pr(Z > .5) = .3
6
PRACTICE WITH P-VALUES
Let M = 150
Z = (150 – 150)/1 = 0
The P-value is Pr(Z > 0) = .5
7
Frequentist Evidential Principle: FEV
FEV (i). x is evidence against H0 (i.e., evidence of
discrepancy from H0), if and only if the P-value Pr(d >
d0; H0) is very low (equivalently, Pr(d < d0; H0)= 1 - P
is very high).
8
Contraposing FEV(i) we get our minimal priniciple
FEV (ia) x are poor evidence against H0 (poor
evidence of discrepancy from H0), if there’s a high
probability the test would yield a more discordant
result, if H0 is correct.
Note the one-directional ‘if’ claim in FEV (1a)
(i) is not the only way x can be BENT.
9
H0: μ ≤ 150 vs. H1: μ > 150 (Let σ = 10, n = 100)
The usual test infers there’s an indication of some
positive discrepancy from 150 because
Pr(M < 152: H0) = .97
Not very informative
Are we warranted in inferring μ > 153 say?
10
• Recall the complaint of the Likelihoodist
(p. 36)
• For them, inferring H1: μ > 150 means
every value in the alternative is more likely
than 150
• Our inferences are not to point values, but
we block inferences to discrepancies
beyond those warranted with severity.
11
Consider: How severely has μ > 153 passed the test?
SEV(μ > 153 ) (p. 143)
M = 152, as before, claim C: μ > 153
The data “accord with C” but there needs to be a
reasonable probability of a worse fit with C, if C is false
Pr(“a worse fit”; C is false)
Pr(M ≤ 152; μ ≤ 153)
Evaluate at μ = 153, as the prob is greater for μ < 153.
12
Consider: How severely has μ > 153 passed the test?
To get Pr(M ≤ 152: μ = 153), standardize:
Z = √100 (152- 153)/10 = -1
Pr(Z < -1) = .16 Terrible evidence
13
Consider: How severely has μ > 150 passed the test?
To get Pr(M ≤ 152: μ = 150), standardize:
Z = √100 (152- 150)/1 = 2
Pr(Z < 2) = .97
Notice it’s 1 – P-value
14
Now consider SEV(μ > 150.5) (still with M = 152)
Pr (A worse fit with C; claim is false) = .93
Pr(M < 152; μ = 150.5)
Z = (152 – 150.5) /1 = 1.5
Pr (Z < 1.5)= .93 Fairly good indication μ > 150.5
15
16
O Table 3.1
O Table 3.1
FOR PRACTICE:
Now consider SEV(μ > 151) (still with M = 152)
Pr (A worse fit with C; claim is false) = __
Pr(M < 152; μ = 151)
Z = (152 – 151) /1 = 1
Pr (Z < 1)= .84
17
MORE PRACTICE:
Now consider SEV(μ > 152) (still with M = 152)
Pr (A worse fit with C; claim is false) = __
Pr(M < 152; μ = 152)
Z = 0
Pr (Z < 0)= .5–important benchmark
Terrible evidence that μ > 152
Table 3.2 has exs with M = 153.
18
Using Severity to Avoid Fallacies:
Fallacy of Rejection: Large n
problem
• Fixing the P-value, increasing sample size n,
the cut-off gets smaller
• Get to a point where x is closer to the null than
various alternatives
• Many would lower the P-value requirement as n
increases-can always avoid inferring a
discrepancy beyond what’s warranted: 19
Severity tells us:
• an α-significant difference indicates less of a
discrepancy from the null if it results from larger (n1)
rather than a smaller (n2) sample size (n1 > n2 )
• What’s more indicative of a large effect (fire), a fire
alarm that goes off with burnt toast or one that
doesn’t go off unless the house is fully ablaze?
• [The larger sample size is like the one that goes off
with burnt toast] 20
(looks ahead) Compare n = 100 with n
= 10,000
H0: μ ≤ 150 vs. H1: μ > 150 (Let σ = 10, n = 10,000)
Reject H0 whenever M ≥ 2SE: M ≥ 150.2
M is the sample mean (significance level = .025)
1SE = σ/√n = 10/√10,000 = .1
Let M = 150.2, so I reject H0.
21
Comparing n = 100 with n = 10,000
Reject H0 whenever M ≥ 2SE: M ≥ 150.2
SEV10,000(μ > 150.5) = 0.001
Z = (150.2 – 150.5) /.1 = -.3/.1 = -3
P(Z < -3) = .001
Corresponding 95% CI: [0, 150.4]
A .025 result is terrible indication μ > 150.5
When reached with n = 10,000
While SEV100(μ > 150.5) = 0.93
22
Non-rejection. Let M = 151, the test does not reject H0.
The standard formulation of N-P (as well as Fisherian)
tests stops there.
We want to be alert to a fallacious interpretation of a
“negative” result: inferring there’s no positive
discrepancy from μ = 150.
Is there evidence of compliance? μ ≤ 150?
The data “accord with” H0, but what if the test had little
capacity to have alerted us to discrepancies from 150?
23
No evidence against H0 is not evidence for it.
Condition (S-2) requires us to consider Pr(X > 151;
150), which is only .16.
24
P-value “moderate”
FEV(ii): A moderate p value is evidence of the absence
of a discrepancy γ from H0, only if there is a high
probability the test would have given a worse fit with H0
(i.e., smaller P- value) were a discrepancy γ to exist.
For a Fisherian like Cox, a test’s power only has
relevance pre-data, they can measure “sensitivity”.
In the Neyman-Pearson theory of tests, the
sensitivity of a test is assessed by the notion of
power, defined as the probability of reaching a preset
level of significance …for various alternative
hypotheses. In the approach adopted here the
assessment is via the distribution of the random
variable P, again considered for various alternatives
(Cox 2006, p. 25)
25
Computation for SEV(T, M = 151, C: μ ≤ 150)
Z = (151 – 150)/1 = 1
Pr(Z > 1) = .16
SEV(C: μ ≤ 150) = low (.16).
• So there’s poor indication of H0
26
Can they say M = 151 is a good indication that μ ≤
150.5?
No, SEV(T, M = 151, C: μ ≤ 150.5) = ~.3.
[Z = 151 – 150.5 = .5]
But M = 151 is a good indication that μ ≤ 152
[Z = 151 – 152 = -1; Pr (Z > -1) = .84 ]
SEV(μ ≤ 152) = .84
It’s an even better indication μ ≤ 153 (Table 3.3, p. 145)
[Z = 151 – 153 = -2; Pr (Z > -2) = .97 ]
27
Π(γ): “sensitivity function”
Computing Π(γ) views the P-value as a statistic.
Π(γ) = Pr(P < pobs; µ0 + γ).
The alternative µ1 = µ0 + γ.
Given that P-value inverts the distance, it is less
confusing to write Π(γ)
Π(γ) = Pr(d > d0; µ0 + γ).
Compare to the power of a test:
POW(γ) = Pr(d > cα; µ0 + γ) the N-P cut-off cα.
28
FEV(ii) in terms of Π(γ)
P-value is modest (not small): Since the data accord
with the null hypothesis, FEV directs us to examine
the probability of observing a result more discordant
from H0 if µ = µ0 +γ:
If Π(γ) = Pr(d > d0; µ0 + γ) is very high, the data
indicate that µ< µ0 +γ.
Here Π(γ) gives the severity with which the test has
probed the discrepancy γ.
29
FEV (ia) in terms of Π(γ)
If Π(γ) = Pr(d > do; µ0 +γ) = moderately high (greater
than .3, .4, .5), then there’s poor grounds for
inferring µ > µ0 + γ.
This is equivalent to saying the SEV(µ > µ0 +γ) is
poor.
30
FEV/SEV (for Excur 3 Tour III)
Test T+: Normal testing: H0: µ < µ0 vs. H1: µ > µ0
σ known
(FEV/SEV): If d(x) is statistically significant (P- value
very small), then test T+ passes µ > M0 – k σ/√n with
severity (1 – ).
(FEV/SEV): If d(x) is not statistically significant (P- value
moderate), then test T+ passes µ < M0 + k σ/√n with
severity ( 1 – ),
where P(d(X) > k) = .
31

More Related Content

What's hot

4 2 continuous probability distributionn
4 2 continuous probability    distributionn4 2 continuous probability    distributionn
4 2 continuous probability distributionnLama K Banna
 
Excursion 3 Tour III, Capability and Severity: Deeper Concepts
Excursion 3 Tour III, Capability and Severity: Deeper ConceptsExcursion 3 Tour III, Capability and Severity: Deeper Concepts
Excursion 3 Tour III, Capability and Severity: Deeper Conceptsjemille6
 
Probability distribution
Probability distributionProbability distribution
Probability distributionPunit Raut
 
Machine learning (9)
Machine learning (9)Machine learning (9)
Machine learning (9)NYversity
 
Data Approximation in Mathematical Modelling Regression Analysis and Curve Fi...
Data Approximation in Mathematical Modelling Regression Analysis and Curve Fi...Data Approximation in Mathematical Modelling Regression Analysis and Curve Fi...
Data Approximation in Mathematical Modelling Regression Analysis and Curve Fi...Dr.Summiya Parveen
 
Statistical inference: Probability and Distribution
Statistical inference: Probability and DistributionStatistical inference: Probability and Distribution
Statistical inference: Probability and DistributionEugene Yan Ziyou
 
A. Spanos Probability/Statistics Lecture Notes 5: Post-data severity evaluation
A. Spanos Probability/Statistics Lecture Notes 5: Post-data severity evaluationA. Spanos Probability/Statistics Lecture Notes 5: Post-data severity evaluation
A. Spanos Probability/Statistics Lecture Notes 5: Post-data severity evaluationjemille6
 
Probability distribution
Probability distributionProbability distribution
Probability distributionRanjan Kumar
 
Lesson 12: Linear Approximations and Differentials (slides)
Lesson 12: Linear Approximations and Differentials (slides)Lesson 12: Linear Approximations and Differentials (slides)
Lesson 12: Linear Approximations and Differentials (slides)Matthew Leingang
 
6334 Day 3 slides: Spanos-lecture-2
6334 Day 3 slides: Spanos-lecture-26334 Day 3 slides: Spanos-lecture-2
6334 Day 3 slides: Spanos-lecture-2jemille6
 
Reading Testing a point-null hypothesis, by Jiahuan Li, Feb. 25, 2013
Reading Testing a point-null hypothesis, by Jiahuan Li, Feb. 25, 2013Reading Testing a point-null hypothesis, by Jiahuan Li, Feb. 25, 2013
Reading Testing a point-null hypothesis, by Jiahuan Li, Feb. 25, 2013Christian Robert
 
Beyond Nash Equilibrium - Correlated Equilibrium and Evolutionary Equilibrium
Beyond Nash Equilibrium - Correlated Equilibrium and Evolutionary Equilibrium Beyond Nash Equilibrium - Correlated Equilibrium and Evolutionary Equilibrium
Beyond Nash Equilibrium - Correlated Equilibrium and Evolutionary Equilibrium Jie Bao
 

What's hot (20)

Statistical Methods
Statistical MethodsStatistical Methods
Statistical Methods
 
4 2 continuous probability distributionn
4 2 continuous probability    distributionn4 2 continuous probability    distributionn
4 2 continuous probability distributionn
 
Excursion 3 Tour III, Capability and Severity: Deeper Concepts
Excursion 3 Tour III, Capability and Severity: Deeper ConceptsExcursion 3 Tour III, Capability and Severity: Deeper Concepts
Excursion 3 Tour III, Capability and Severity: Deeper Concepts
 
Probability distribution
Probability distributionProbability distribution
Probability distribution
 
Probability distributionv1
Probability distributionv1Probability distributionv1
Probability distributionv1
 
Machine learning (9)
Machine learning (9)Machine learning (9)
Machine learning (9)
 
Data Approximation in Mathematical Modelling Regression Analysis and Curve Fi...
Data Approximation in Mathematical Modelling Regression Analysis and Curve Fi...Data Approximation in Mathematical Modelling Regression Analysis and Curve Fi...
Data Approximation in Mathematical Modelling Regression Analysis and Curve Fi...
 
Slides mc gill-v4
Slides mc gill-v4Slides mc gill-v4
Slides mc gill-v4
 
Statistical inference: Probability and Distribution
Statistical inference: Probability and DistributionStatistical inference: Probability and Distribution
Statistical inference: Probability and Distribution
 
A. Spanos Probability/Statistics Lecture Notes 5: Post-data severity evaluation
A. Spanos Probability/Statistics Lecture Notes 5: Post-data severity evaluationA. Spanos Probability/Statistics Lecture Notes 5: Post-data severity evaluation
A. Spanos Probability/Statistics Lecture Notes 5: Post-data severity evaluation
 
Probability distribution
Probability distributionProbability distribution
Probability distribution
 
Lesson 12: Linear Approximations and Differentials (slides)
Lesson 12: Linear Approximations and Differentials (slides)Lesson 12: Linear Approximations and Differentials (slides)
Lesson 12: Linear Approximations and Differentials (slides)
 
6334 Day 3 slides: Spanos-lecture-2
6334 Day 3 slides: Spanos-lecture-26334 Day 3 slides: Spanos-lecture-2
6334 Day 3 slides: Spanos-lecture-2
 
Stats chapter 8
Stats chapter 8Stats chapter 8
Stats chapter 8
 
Discrete probability distributions
Discrete probability distributionsDiscrete probability distributions
Discrete probability distributions
 
Reading Testing a point-null hypothesis, by Jiahuan Li, Feb. 25, 2013
Reading Testing a point-null hypothesis, by Jiahuan Li, Feb. 25, 2013Reading Testing a point-null hypothesis, by Jiahuan Li, Feb. 25, 2013
Reading Testing a point-null hypothesis, by Jiahuan Li, Feb. 25, 2013
 
Slides mc gill-v3
Slides mc gill-v3Slides mc gill-v3
Slides mc gill-v3
 
practive of finance
practive of financepractive of finance
practive of finance
 
Beyond Nash Equilibrium - Correlated Equilibrium and Evolutionary Equilibrium
Beyond Nash Equilibrium - Correlated Equilibrium and Evolutionary Equilibrium Beyond Nash Equilibrium - Correlated Equilibrium and Evolutionary Equilibrium
Beyond Nash Equilibrium - Correlated Equilibrium and Evolutionary Equilibrium
 
Binomia ex 2
Binomia ex 2Binomia ex 2
Binomia ex 2
 

Similar to Mayo Slides Feb. 27: Need to Reformulate Tests

Probability and Statistics Exam Help
Probability and Statistics Exam HelpProbability and Statistics Exam Help
Probability and Statistics Exam HelpStatistics Exam Help
 
Categorical data analysis full lecture note PPT.pptx
Categorical data analysis full lecture note  PPT.pptxCategorical data analysis full lecture note  PPT.pptx
Categorical data analysis full lecture note PPT.pptxMinilikDerseh1
 
Solution to the practice test ch 8 hypothesis testing ch 9 two populations
Solution to the practice test ch 8 hypothesis testing ch 9 two populationsSolution to the practice test ch 8 hypothesis testing ch 9 two populations
Solution to the practice test ch 8 hypothesis testing ch 9 two populationsLong Beach City College
 
Normal Distribution, Binomial Distribution, Poisson Distribution
Normal Distribution, Binomial Distribution, Poisson DistributionNormal Distribution, Binomial Distribution, Poisson Distribution
Normal Distribution, Binomial Distribution, Poisson DistributionQ Dauh Q Alam
 
Tutorial on testing at O'Bayes 2015, Valencià, June 1, 2015
Tutorial on testing at O'Bayes 2015, Valencià, June 1, 2015Tutorial on testing at O'Bayes 2015, Valencià, June 1, 2015
Tutorial on testing at O'Bayes 2015, Valencià, June 1, 2015Christian Robert
 
Statistics-Non parametric test
Statistics-Non parametric testStatistics-Non parametric test
Statistics-Non parametric testRabin BK
 
Theory of uncertainty of measurement.pdf
Theory of uncertainty of measurement.pdfTheory of uncertainty of measurement.pdf
Theory of uncertainty of measurement.pdfHnCngnh1
 
hypothesisTestPPT.pptx
hypothesisTestPPT.pptxhypothesisTestPPT.pptx
hypothesisTestPPT.pptxdangwalakash07
 
Probability and Some Special Discrete Distributions
Probability and Some Special Discrete DistributionsProbability and Some Special Discrete Distributions
Probability and Some Special Discrete DistributionsDoyelGhosh1
 
Econometrics chapter 5-two-variable-regression-interval-estimation-
Econometrics chapter 5-two-variable-regression-interval-estimation-Econometrics chapter 5-two-variable-regression-interval-estimation-
Econometrics chapter 5-two-variable-regression-interval-estimation-Alamin Milton
 
Chapter4
Chapter4Chapter4
Chapter4Vu Vo
 
Econometrics 2.pptx
Econometrics 2.pptxEconometrics 2.pptx
Econometrics 2.pptxfuad80
 

Similar to Mayo Slides Feb. 27: Need to Reformulate Tests (20)

Probability and Statistics Assignment Help
Probability and Statistics Assignment HelpProbability and Statistics Assignment Help
Probability and Statistics Assignment Help
 
Probability and Statistics Exam Help
Probability and Statistics Exam HelpProbability and Statistics Exam Help
Probability and Statistics Exam Help
 
Categorical data analysis full lecture note PPT.pptx
Categorical data analysis full lecture note  PPT.pptxCategorical data analysis full lecture note  PPT.pptx
Categorical data analysis full lecture note PPT.pptx
 
Solution to the practice test ch 8 hypothesis testing ch 9 two populations
Solution to the practice test ch 8 hypothesis testing ch 9 two populationsSolution to the practice test ch 8 hypothesis testing ch 9 two populations
Solution to the practice test ch 8 hypothesis testing ch 9 two populations
 
Propensity albert
Propensity albertPropensity albert
Propensity albert
 
Statistics Assignment Help
Statistics Assignment HelpStatistics Assignment Help
Statistics Assignment Help
 
Normal Distribution, Binomial Distribution, Poisson Distribution
Normal Distribution, Binomial Distribution, Poisson DistributionNormal Distribution, Binomial Distribution, Poisson Distribution
Normal Distribution, Binomial Distribution, Poisson Distribution
 
Tutorial on testing at O'Bayes 2015, Valencià, June 1, 2015
Tutorial on testing at O'Bayes 2015, Valencià, June 1, 2015Tutorial on testing at O'Bayes 2015, Valencià, June 1, 2015
Tutorial on testing at O'Bayes 2015, Valencià, June 1, 2015
 
Bayesian statistics
Bayesian statisticsBayesian statistics
Bayesian statistics
 
Statistics-Non parametric test
Statistics-Non parametric testStatistics-Non parametric test
Statistics-Non parametric test
 
Theory of uncertainty of measurement.pdf
Theory of uncertainty of measurement.pdfTheory of uncertainty of measurement.pdf
Theory of uncertainty of measurement.pdf
 
probability assignment help
probability assignment helpprobability assignment help
probability assignment help
 
hypothesisTestPPT.pptx
hypothesisTestPPT.pptxhypothesisTestPPT.pptx
hypothesisTestPPT.pptx
 
Probability and Some Special Discrete Distributions
Probability and Some Special Discrete DistributionsProbability and Some Special Discrete Distributions
Probability and Some Special Discrete Distributions
 
Econometrics chapter 5-two-variable-regression-interval-estimation-
Econometrics chapter 5-two-variable-regression-interval-estimation-Econometrics chapter 5-two-variable-regression-interval-estimation-
Econometrics chapter 5-two-variable-regression-interval-estimation-
 
Lemh1a1
Lemh1a1Lemh1a1
Lemh1a1
 
Lemh1a1
Lemh1a1Lemh1a1
Lemh1a1
 
Chapter4
Chapter4Chapter4
Chapter4
 
Econometrics 2.pptx
Econometrics 2.pptxEconometrics 2.pptx
Econometrics 2.pptx
 
Hypothesis testing 47
Hypothesis testing 47Hypothesis testing 47
Hypothesis testing 47
 

More from jemille6

“The importance of philosophy of science for statistical science and vice versa”
“The importance of philosophy of science for statistical science and vice versa”“The importance of philosophy of science for statistical science and vice versa”
“The importance of philosophy of science for statistical science and vice versa”jemille6
 
Statistical Inference as Severe Testing: Beyond Performance and Probabilism
Statistical Inference as Severe Testing: Beyond Performance and ProbabilismStatistical Inference as Severe Testing: Beyond Performance and Probabilism
Statistical Inference as Severe Testing: Beyond Performance and Probabilismjemille6
 
D. Mayo JSM slides v2.pdf
D. Mayo JSM slides v2.pdfD. Mayo JSM slides v2.pdf
D. Mayo JSM slides v2.pdfjemille6
 
reid-postJSM-DRC.pdf
reid-postJSM-DRC.pdfreid-postJSM-DRC.pdf
reid-postJSM-DRC.pdfjemille6
 
Errors of the Error Gatekeepers: The case of Statistical Significance 2016-2022
Errors of the Error Gatekeepers: The case of Statistical Significance 2016-2022Errors of the Error Gatekeepers: The case of Statistical Significance 2016-2022
Errors of the Error Gatekeepers: The case of Statistical Significance 2016-2022jemille6
 
Causal inference is not statistical inference
Causal inference is not statistical inferenceCausal inference is not statistical inference
Causal inference is not statistical inferencejemille6
 
What are questionable research practices?
What are questionable research practices?What are questionable research practices?
What are questionable research practices?jemille6
 
What's the question?
What's the question? What's the question?
What's the question? jemille6
 
The neglected importance of complexity in statistics and Metascience
The neglected importance of complexity in statistics and MetascienceThe neglected importance of complexity in statistics and Metascience
The neglected importance of complexity in statistics and Metasciencejemille6
 
Mathematically Elegant Answers to Research Questions No One is Asking (meta-a...
Mathematically Elegant Answers to Research Questions No One is Asking (meta-a...Mathematically Elegant Answers to Research Questions No One is Asking (meta-a...
Mathematically Elegant Answers to Research Questions No One is Asking (meta-a...jemille6
 
On Severity, the Weight of Evidence, and the Relationship Between the Two
On Severity, the Weight of Evidence, and the Relationship Between the TwoOn Severity, the Weight of Evidence, and the Relationship Between the Two
On Severity, the Weight of Evidence, and the Relationship Between the Twojemille6
 
Revisiting the Two Cultures in Statistical Modeling and Inference as they rel...
Revisiting the Two Cultures in Statistical Modeling and Inference as they rel...Revisiting the Two Cultures in Statistical Modeling and Inference as they rel...
Revisiting the Two Cultures in Statistical Modeling and Inference as they rel...jemille6
 
Comparing Frequentists and Bayesian Control of Multiple Testing
Comparing Frequentists and Bayesian Control of Multiple TestingComparing Frequentists and Bayesian Control of Multiple Testing
Comparing Frequentists and Bayesian Control of Multiple Testingjemille6
 
Good Data Dredging
Good Data DredgingGood Data Dredging
Good Data Dredgingjemille6
 
The Duality of Parameters and the Duality of Probability
The Duality of Parameters and the Duality of ProbabilityThe Duality of Parameters and the Duality of Probability
The Duality of Parameters and the Duality of Probabilityjemille6
 
Error Control and Severity
Error Control and SeverityError Control and Severity
Error Control and Severityjemille6
 
The Statistics Wars and Their Causalities (refs)
The Statistics Wars and Their Causalities (refs)The Statistics Wars and Their Causalities (refs)
The Statistics Wars and Their Causalities (refs)jemille6
 
The Statistics Wars and Their Casualties (w/refs)
The Statistics Wars and Their Casualties (w/refs)The Statistics Wars and Their Casualties (w/refs)
The Statistics Wars and Their Casualties (w/refs)jemille6
 
On the interpretation of the mathematical characteristics of statistical test...
On the interpretation of the mathematical characteristics of statistical test...On the interpretation of the mathematical characteristics of statistical test...
On the interpretation of the mathematical characteristics of statistical test...jemille6
 
The role of background assumptions in severity appraisal (
The role of background assumptions in severity appraisal (The role of background assumptions in severity appraisal (
The role of background assumptions in severity appraisal (jemille6
 

More from jemille6 (20)

“The importance of philosophy of science for statistical science and vice versa”
“The importance of philosophy of science for statistical science and vice versa”“The importance of philosophy of science for statistical science and vice versa”
“The importance of philosophy of science for statistical science and vice versa”
 
Statistical Inference as Severe Testing: Beyond Performance and Probabilism
Statistical Inference as Severe Testing: Beyond Performance and ProbabilismStatistical Inference as Severe Testing: Beyond Performance and Probabilism
Statistical Inference as Severe Testing: Beyond Performance and Probabilism
 
D. Mayo JSM slides v2.pdf
D. Mayo JSM slides v2.pdfD. Mayo JSM slides v2.pdf
D. Mayo JSM slides v2.pdf
 
reid-postJSM-DRC.pdf
reid-postJSM-DRC.pdfreid-postJSM-DRC.pdf
reid-postJSM-DRC.pdf
 
Errors of the Error Gatekeepers: The case of Statistical Significance 2016-2022
Errors of the Error Gatekeepers: The case of Statistical Significance 2016-2022Errors of the Error Gatekeepers: The case of Statistical Significance 2016-2022
Errors of the Error Gatekeepers: The case of Statistical Significance 2016-2022
 
Causal inference is not statistical inference
Causal inference is not statistical inferenceCausal inference is not statistical inference
Causal inference is not statistical inference
 
What are questionable research practices?
What are questionable research practices?What are questionable research practices?
What are questionable research practices?
 
What's the question?
What's the question? What's the question?
What's the question?
 
The neglected importance of complexity in statistics and Metascience
The neglected importance of complexity in statistics and MetascienceThe neglected importance of complexity in statistics and Metascience
The neglected importance of complexity in statistics and Metascience
 
Mathematically Elegant Answers to Research Questions No One is Asking (meta-a...
Mathematically Elegant Answers to Research Questions No One is Asking (meta-a...Mathematically Elegant Answers to Research Questions No One is Asking (meta-a...
Mathematically Elegant Answers to Research Questions No One is Asking (meta-a...
 
On Severity, the Weight of Evidence, and the Relationship Between the Two
On Severity, the Weight of Evidence, and the Relationship Between the TwoOn Severity, the Weight of Evidence, and the Relationship Between the Two
On Severity, the Weight of Evidence, and the Relationship Between the Two
 
Revisiting the Two Cultures in Statistical Modeling and Inference as they rel...
Revisiting the Two Cultures in Statistical Modeling and Inference as they rel...Revisiting the Two Cultures in Statistical Modeling and Inference as they rel...
Revisiting the Two Cultures in Statistical Modeling and Inference as they rel...
 
Comparing Frequentists and Bayesian Control of Multiple Testing
Comparing Frequentists and Bayesian Control of Multiple TestingComparing Frequentists and Bayesian Control of Multiple Testing
Comparing Frequentists and Bayesian Control of Multiple Testing
 
Good Data Dredging
Good Data DredgingGood Data Dredging
Good Data Dredging
 
The Duality of Parameters and the Duality of Probability
The Duality of Parameters and the Duality of ProbabilityThe Duality of Parameters and the Duality of Probability
The Duality of Parameters and the Duality of Probability
 
Error Control and Severity
Error Control and SeverityError Control and Severity
Error Control and Severity
 
The Statistics Wars and Their Causalities (refs)
The Statistics Wars and Their Causalities (refs)The Statistics Wars and Their Causalities (refs)
The Statistics Wars and Their Causalities (refs)
 
The Statistics Wars and Their Casualties (w/refs)
The Statistics Wars and Their Casualties (w/refs)The Statistics Wars and Their Casualties (w/refs)
The Statistics Wars and Their Casualties (w/refs)
 
On the interpretation of the mathematical characteristics of statistical test...
On the interpretation of the mathematical characteristics of statistical test...On the interpretation of the mathematical characteristics of statistical test...
On the interpretation of the mathematical characteristics of statistical test...
 
The role of background assumptions in severity appraisal (
The role of background assumptions in severity appraisal (The role of background assumptions in severity appraisal (
The role of background assumptions in severity appraisal (
 

Recently uploaded

Transparency, Recognition and the role of eSealing - Ildiko Mazar and Koen No...
Transparency, Recognition and the role of eSealing - Ildiko Mazar and Koen No...Transparency, Recognition and the role of eSealing - Ildiko Mazar and Koen No...
Transparency, Recognition and the role of eSealing - Ildiko Mazar and Koen No...EADTU
 
Personalisation of Education by AI and Big Data - Lourdes Guàrdia
Personalisation of Education by AI and Big Data - Lourdes GuàrdiaPersonalisation of Education by AI and Big Data - Lourdes Guàrdia
Personalisation of Education by AI and Big Data - Lourdes GuàrdiaEADTU
 
male presentation...pdf.................
male presentation...pdf.................male presentation...pdf.................
male presentation...pdf.................MirzaAbrarBaig5
 
TỔNG HỢP HƠN 100 ĐỀ THI THỬ TỐT NGHIỆP THPT TOÁN 2024 - TỪ CÁC TRƯỜNG, TRƯỜNG...
TỔNG HỢP HƠN 100 ĐỀ THI THỬ TỐT NGHIỆP THPT TOÁN 2024 - TỪ CÁC TRƯỜNG, TRƯỜNG...TỔNG HỢP HƠN 100 ĐỀ THI THỬ TỐT NGHIỆP THPT TOÁN 2024 - TỪ CÁC TRƯỜNG, TRƯỜNG...
TỔNG HỢP HƠN 100 ĐỀ THI THỬ TỐT NGHIỆP THPT TOÁN 2024 - TỪ CÁC TRƯỜNG, TRƯỜNG...Nguyen Thanh Tu Collection
 
Analyzing and resolving a communication crisis in Dhaka textiles LTD.pptx
Analyzing and resolving a communication crisis in Dhaka textiles LTD.pptxAnalyzing and resolving a communication crisis in Dhaka textiles LTD.pptx
Analyzing and resolving a communication crisis in Dhaka textiles LTD.pptxLimon Prince
 
8 Tips for Effective Working Capital Management
8 Tips for Effective Working Capital Management8 Tips for Effective Working Capital Management
8 Tips for Effective Working Capital ManagementMBA Assignment Experts
 
Trauma-Informed Leadership - Five Practical Principles
Trauma-Informed Leadership - Five Practical PrinciplesTrauma-Informed Leadership - Five Practical Principles
Trauma-Informed Leadership - Five Practical PrinciplesPooky Knightsmith
 
Sternal Fractures & Dislocations - EMGuidewire Radiology Reading Room
Sternal Fractures & Dislocations - EMGuidewire Radiology Reading RoomSternal Fractures & Dislocations - EMGuidewire Radiology Reading Room
Sternal Fractures & Dislocations - EMGuidewire Radiology Reading RoomSean M. Fox
 
會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽
會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽
會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽中 央社
 
Stl Algorithms in C++ jjjjjjjjjjjjjjjjjj
Stl Algorithms in C++ jjjjjjjjjjjjjjjjjjStl Algorithms in C++ jjjjjjjjjjjjjjjjjj
Stl Algorithms in C++ jjjjjjjjjjjjjjjjjjMohammed Sikander
 
Graduate Outcomes Presentation Slides - English (v3).pptx
Graduate Outcomes Presentation Slides - English (v3).pptxGraduate Outcomes Presentation Slides - English (v3).pptx
Graduate Outcomes Presentation Slides - English (v3).pptxneillewis46
 
Spellings Wk 4 and Wk 5 for Grade 4 at CAPS
Spellings Wk 4 and Wk 5 for Grade 4 at CAPSSpellings Wk 4 and Wk 5 for Grade 4 at CAPS
Spellings Wk 4 and Wk 5 for Grade 4 at CAPSAnaAcapella
 
ĐỀ THAM KHẢO KÌ THI TUYỂN SINH VÀO LỚP 10 MÔN TIẾNG ANH FORM 50 CÂU TRẮC NGHI...
ĐỀ THAM KHẢO KÌ THI TUYỂN SINH VÀO LỚP 10 MÔN TIẾNG ANH FORM 50 CÂU TRẮC NGHI...ĐỀ THAM KHẢO KÌ THI TUYỂN SINH VÀO LỚP 10 MÔN TIẾNG ANH FORM 50 CÂU TRẮC NGHI...
ĐỀ THAM KHẢO KÌ THI TUYỂN SINH VÀO LỚP 10 MÔN TIẾNG ANH FORM 50 CÂU TRẮC NGHI...Nguyen Thanh Tu Collection
 
TỔNG HỢP HƠN 100 ĐỀ THI THỬ TỐT NGHIỆP THPT TOÁN 2024 - TỪ CÁC TRƯỜNG, TRƯỜNG...
TỔNG HỢP HƠN 100 ĐỀ THI THỬ TỐT NGHIỆP THPT TOÁN 2024 - TỪ CÁC TRƯỜNG, TRƯỜNG...TỔNG HỢP HƠN 100 ĐỀ THI THỬ TỐT NGHIỆP THPT TOÁN 2024 - TỪ CÁC TRƯỜNG, TRƯỜNG...
TỔNG HỢP HƠN 100 ĐỀ THI THỬ TỐT NGHIỆP THPT TOÁN 2024 - TỪ CÁC TRƯỜNG, TRƯỜNG...Nguyen Thanh Tu Collection
 
The Story of Village Palampur Class 9 Free Study Material PDF
The Story of Village Palampur Class 9 Free Study Material PDFThe Story of Village Palampur Class 9 Free Study Material PDF
The Story of Village Palampur Class 9 Free Study Material PDFVivekanand Anglo Vedic Academy
 
When Quality Assurance Meets Innovation in Higher Education - Report launch w...
When Quality Assurance Meets Innovation in Higher Education - Report launch w...When Quality Assurance Meets Innovation in Higher Education - Report launch w...
When Quality Assurance Meets Innovation in Higher Education - Report launch w...Gary Wood
 
PSYPACT- Practicing Over State Lines May 2024.pptx
PSYPACT- Practicing Over State Lines May 2024.pptxPSYPACT- Practicing Over State Lines May 2024.pptx
PSYPACT- Practicing Over State Lines May 2024.pptxMarlene Maheu
 
OSCM Unit 2_Operations Processes & Systems
OSCM Unit 2_Operations Processes & SystemsOSCM Unit 2_Operations Processes & Systems
OSCM Unit 2_Operations Processes & SystemsSandeep D Chaudhary
 
diagnosting testing bsc 2nd sem.pptx....
diagnosting testing bsc 2nd sem.pptx....diagnosting testing bsc 2nd sem.pptx....
diagnosting testing bsc 2nd sem.pptx....Ritu480198
 

Recently uploaded (20)

Transparency, Recognition and the role of eSealing - Ildiko Mazar and Koen No...
Transparency, Recognition and the role of eSealing - Ildiko Mazar and Koen No...Transparency, Recognition and the role of eSealing - Ildiko Mazar and Koen No...
Transparency, Recognition and the role of eSealing - Ildiko Mazar and Koen No...
 
Personalisation of Education by AI and Big Data - Lourdes Guàrdia
Personalisation of Education by AI and Big Data - Lourdes GuàrdiaPersonalisation of Education by AI and Big Data - Lourdes Guàrdia
Personalisation of Education by AI and Big Data - Lourdes Guàrdia
 
male presentation...pdf.................
male presentation...pdf.................male presentation...pdf.................
male presentation...pdf.................
 
TỔNG HỢP HƠN 100 ĐỀ THI THỬ TỐT NGHIỆP THPT TOÁN 2024 - TỪ CÁC TRƯỜNG, TRƯỜNG...
TỔNG HỢP HƠN 100 ĐỀ THI THỬ TỐT NGHIỆP THPT TOÁN 2024 - TỪ CÁC TRƯỜNG, TRƯỜNG...TỔNG HỢP HƠN 100 ĐỀ THI THỬ TỐT NGHIỆP THPT TOÁN 2024 - TỪ CÁC TRƯỜNG, TRƯỜNG...
TỔNG HỢP HƠN 100 ĐỀ THI THỬ TỐT NGHIỆP THPT TOÁN 2024 - TỪ CÁC TRƯỜNG, TRƯỜNG...
 
Analyzing and resolving a communication crisis in Dhaka textiles LTD.pptx
Analyzing and resolving a communication crisis in Dhaka textiles LTD.pptxAnalyzing and resolving a communication crisis in Dhaka textiles LTD.pptx
Analyzing and resolving a communication crisis in Dhaka textiles LTD.pptx
 
8 Tips for Effective Working Capital Management
8 Tips for Effective Working Capital Management8 Tips for Effective Working Capital Management
8 Tips for Effective Working Capital Management
 
Trauma-Informed Leadership - Five Practical Principles
Trauma-Informed Leadership - Five Practical PrinciplesTrauma-Informed Leadership - Five Practical Principles
Trauma-Informed Leadership - Five Practical Principles
 
Sternal Fractures & Dislocations - EMGuidewire Radiology Reading Room
Sternal Fractures & Dislocations - EMGuidewire Radiology Reading RoomSternal Fractures & Dislocations - EMGuidewire Radiology Reading Room
Sternal Fractures & Dislocations - EMGuidewire Radiology Reading Room
 
會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽
會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽
會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽
 
Stl Algorithms in C++ jjjjjjjjjjjjjjjjjj
Stl Algorithms in C++ jjjjjjjjjjjjjjjjjjStl Algorithms in C++ jjjjjjjjjjjjjjjjjj
Stl Algorithms in C++ jjjjjjjjjjjjjjjjjj
 
Graduate Outcomes Presentation Slides - English (v3).pptx
Graduate Outcomes Presentation Slides - English (v3).pptxGraduate Outcomes Presentation Slides - English (v3).pptx
Graduate Outcomes Presentation Slides - English (v3).pptx
 
Spellings Wk 4 and Wk 5 for Grade 4 at CAPS
Spellings Wk 4 and Wk 5 for Grade 4 at CAPSSpellings Wk 4 and Wk 5 for Grade 4 at CAPS
Spellings Wk 4 and Wk 5 for Grade 4 at CAPS
 
ĐỀ THAM KHẢO KÌ THI TUYỂN SINH VÀO LỚP 10 MÔN TIẾNG ANH FORM 50 CÂU TRẮC NGHI...
ĐỀ THAM KHẢO KÌ THI TUYỂN SINH VÀO LỚP 10 MÔN TIẾNG ANH FORM 50 CÂU TRẮC NGHI...ĐỀ THAM KHẢO KÌ THI TUYỂN SINH VÀO LỚP 10 MÔN TIẾNG ANH FORM 50 CÂU TRẮC NGHI...
ĐỀ THAM KHẢO KÌ THI TUYỂN SINH VÀO LỚP 10 MÔN TIẾNG ANH FORM 50 CÂU TRẮC NGHI...
 
TỔNG HỢP HƠN 100 ĐỀ THI THỬ TỐT NGHIỆP THPT TOÁN 2024 - TỪ CÁC TRƯỜNG, TRƯỜNG...
TỔNG HỢP HƠN 100 ĐỀ THI THỬ TỐT NGHIỆP THPT TOÁN 2024 - TỪ CÁC TRƯỜNG, TRƯỜNG...TỔNG HỢP HƠN 100 ĐỀ THI THỬ TỐT NGHIỆP THPT TOÁN 2024 - TỪ CÁC TRƯỜNG, TRƯỜNG...
TỔNG HỢP HƠN 100 ĐỀ THI THỬ TỐT NGHIỆP THPT TOÁN 2024 - TỪ CÁC TRƯỜNG, TRƯỜNG...
 
VAMOS CUIDAR DO NOSSO PLANETA! .
VAMOS CUIDAR DO NOSSO PLANETA!                    .VAMOS CUIDAR DO NOSSO PLANETA!                    .
VAMOS CUIDAR DO NOSSO PLANETA! .
 
The Story of Village Palampur Class 9 Free Study Material PDF
The Story of Village Palampur Class 9 Free Study Material PDFThe Story of Village Palampur Class 9 Free Study Material PDF
The Story of Village Palampur Class 9 Free Study Material PDF
 
When Quality Assurance Meets Innovation in Higher Education - Report launch w...
When Quality Assurance Meets Innovation in Higher Education - Report launch w...When Quality Assurance Meets Innovation in Higher Education - Report launch w...
When Quality Assurance Meets Innovation in Higher Education - Report launch w...
 
PSYPACT- Practicing Over State Lines May 2024.pptx
PSYPACT- Practicing Over State Lines May 2024.pptxPSYPACT- Practicing Over State Lines May 2024.pptx
PSYPACT- Practicing Over State Lines May 2024.pptx
 
OSCM Unit 2_Operations Processes & Systems
OSCM Unit 2_Operations Processes & SystemsOSCM Unit 2_Operations Processes & Systems
OSCM Unit 2_Operations Processes & Systems
 
diagnosting testing bsc 2nd sem.pptx....
diagnosting testing bsc 2nd sem.pptx....diagnosting testing bsc 2nd sem.pptx....
diagnosting testing bsc 2nd sem.pptx....
 

Mayo Slides Feb. 27: Need to Reformulate Tests

  • 1. Need to Reformulate Tests: P-values Don’t Give an Effect Size Severity function: SEV(Test T, data x, claim C) • Tests are reformulated in terms of a discrepancy γ from H0 • Instead of a binary cut-off (significant or not) the particular outcome is used to infer discrepancies that are or are not warranted 1
  • 2. An Example of SEV (3.2 SIST) 1-sided normal testing H0: μ ≤ 150 vs. H1: μ > 150 (Let σ = 10, n = 100) let significance level α = .025 Reject H0 whenever M ≥ 150 + 2σ/√n: M ≥ 152 M is the sample mean, its value is M0. 1SE = σ/√n = 1 2
  • 3. Rejection rules: Reject iff M > 150 + 2SE (N-P) In terms of the P-value: Reject iff P-value ≤ .025 (Fisher) (P-value a distance measure, but inverted) Let M = 152, so I reject H0. 3
  • 4. PRACTICE WITH P-VALUES Let M = 152 Z = (152 – 150)/1 = 2 The P-value is Pr(Z > 2) = .025 4
  • 5. PRACTICE WITH P-VALUES Let M = 151 Z = (151 – 150)/1 = 1 The P-value is Pr(Z > 1) = .16 (Note: SEV (μ > 150) = .84 = 1 – P-value for this test) 5
  • 6. PRACTICE WITH P-VALUES Let M = 150.5 Z = (150.5 – 150)/1 = .5 The P-value is Pr(Z > .5) = .3 6
  • 7. PRACTICE WITH P-VALUES Let M = 150 Z = (150 – 150)/1 = 0 The P-value is Pr(Z > 0) = .5 7
  • 8. Frequentist Evidential Principle: FEV FEV (i). x is evidence against H0 (i.e., evidence of discrepancy from H0), if and only if the P-value Pr(d > d0; H0) is very low (equivalently, Pr(d < d0; H0)= 1 - P is very high). 8
  • 9. Contraposing FEV(i) we get our minimal priniciple FEV (ia) x are poor evidence against H0 (poor evidence of discrepancy from H0), if there’s a high probability the test would yield a more discordant result, if H0 is correct. Note the one-directional ‘if’ claim in FEV (1a) (i) is not the only way x can be BENT. 9
  • 10. H0: μ ≤ 150 vs. H1: μ > 150 (Let σ = 10, n = 100) The usual test infers there’s an indication of some positive discrepancy from 150 because Pr(M < 152: H0) = .97 Not very informative Are we warranted in inferring μ > 153 say? 10
  • 11. • Recall the complaint of the Likelihoodist (p. 36) • For them, inferring H1: μ > 150 means every value in the alternative is more likely than 150 • Our inferences are not to point values, but we block inferences to discrepancies beyond those warranted with severity. 11
  • 12. Consider: How severely has μ > 153 passed the test? SEV(μ > 153 ) (p. 143) M = 152, as before, claim C: μ > 153 The data “accord with C” but there needs to be a reasonable probability of a worse fit with C, if C is false Pr(“a worse fit”; C is false) Pr(M ≤ 152; μ ≤ 153) Evaluate at μ = 153, as the prob is greater for μ < 153. 12
  • 13. Consider: How severely has μ > 153 passed the test? To get Pr(M ≤ 152: μ = 153), standardize: Z = √100 (152- 153)/10 = -1 Pr(Z < -1) = .16 Terrible evidence 13
  • 14. Consider: How severely has μ > 150 passed the test? To get Pr(M ≤ 152: μ = 150), standardize: Z = √100 (152- 150)/1 = 2 Pr(Z < 2) = .97 Notice it’s 1 – P-value 14
  • 15. Now consider SEV(μ > 150.5) (still with M = 152) Pr (A worse fit with C; claim is false) = .93 Pr(M < 152; μ = 150.5) Z = (152 – 150.5) /1 = 1.5 Pr (Z < 1.5)= .93 Fairly good indication μ > 150.5 15
  • 16. 16 O Table 3.1 O Table 3.1
  • 17. FOR PRACTICE: Now consider SEV(μ > 151) (still with M = 152) Pr (A worse fit with C; claim is false) = __ Pr(M < 152; μ = 151) Z = (152 – 151) /1 = 1 Pr (Z < 1)= .84 17
  • 18. MORE PRACTICE: Now consider SEV(μ > 152) (still with M = 152) Pr (A worse fit with C; claim is false) = __ Pr(M < 152; μ = 152) Z = 0 Pr (Z < 0)= .5–important benchmark Terrible evidence that μ > 152 Table 3.2 has exs with M = 153. 18
  • 19. Using Severity to Avoid Fallacies: Fallacy of Rejection: Large n problem • Fixing the P-value, increasing sample size n, the cut-off gets smaller • Get to a point where x is closer to the null than various alternatives • Many would lower the P-value requirement as n increases-can always avoid inferring a discrepancy beyond what’s warranted: 19
  • 20. Severity tells us: • an α-significant difference indicates less of a discrepancy from the null if it results from larger (n1) rather than a smaller (n2) sample size (n1 > n2 ) • What’s more indicative of a large effect (fire), a fire alarm that goes off with burnt toast or one that doesn’t go off unless the house is fully ablaze? • [The larger sample size is like the one that goes off with burnt toast] 20
  • 21. (looks ahead) Compare n = 100 with n = 10,000 H0: μ ≤ 150 vs. H1: μ > 150 (Let σ = 10, n = 10,000) Reject H0 whenever M ≥ 2SE: M ≥ 150.2 M is the sample mean (significance level = .025) 1SE = σ/√n = 10/√10,000 = .1 Let M = 150.2, so I reject H0. 21
  • 22. Comparing n = 100 with n = 10,000 Reject H0 whenever M ≥ 2SE: M ≥ 150.2 SEV10,000(μ > 150.5) = 0.001 Z = (150.2 – 150.5) /.1 = -.3/.1 = -3 P(Z < -3) = .001 Corresponding 95% CI: [0, 150.4] A .025 result is terrible indication μ > 150.5 When reached with n = 10,000 While SEV100(μ > 150.5) = 0.93 22
  • 23. Non-rejection. Let M = 151, the test does not reject H0. The standard formulation of N-P (as well as Fisherian) tests stops there. We want to be alert to a fallacious interpretation of a “negative” result: inferring there’s no positive discrepancy from μ = 150. Is there evidence of compliance? μ ≤ 150? The data “accord with” H0, but what if the test had little capacity to have alerted us to discrepancies from 150? 23
  • 24. No evidence against H0 is not evidence for it. Condition (S-2) requires us to consider Pr(X > 151; 150), which is only .16. 24
  • 25. P-value “moderate” FEV(ii): A moderate p value is evidence of the absence of a discrepancy γ from H0, only if there is a high probability the test would have given a worse fit with H0 (i.e., smaller P- value) were a discrepancy γ to exist. For a Fisherian like Cox, a test’s power only has relevance pre-data, they can measure “sensitivity”. In the Neyman-Pearson theory of tests, the sensitivity of a test is assessed by the notion of power, defined as the probability of reaching a preset level of significance …for various alternative hypotheses. In the approach adopted here the assessment is via the distribution of the random variable P, again considered for various alternatives (Cox 2006, p. 25) 25
  • 26. Computation for SEV(T, M = 151, C: μ ≤ 150) Z = (151 – 150)/1 = 1 Pr(Z > 1) = .16 SEV(C: μ ≤ 150) = low (.16). • So there’s poor indication of H0 26
  • 27. Can they say M = 151 is a good indication that μ ≤ 150.5? No, SEV(T, M = 151, C: μ ≤ 150.5) = ~.3. [Z = 151 – 150.5 = .5] But M = 151 is a good indication that μ ≤ 152 [Z = 151 – 152 = -1; Pr (Z > -1) = .84 ] SEV(μ ≤ 152) = .84 It’s an even better indication μ ≤ 153 (Table 3.3, p. 145) [Z = 151 – 153 = -2; Pr (Z > -2) = .97 ] 27
  • 28. Π(γ): “sensitivity function” Computing Π(γ) views the P-value as a statistic. Π(γ) = Pr(P < pobs; µ0 + γ). The alternative µ1 = µ0 + γ. Given that P-value inverts the distance, it is less confusing to write Π(γ) Π(γ) = Pr(d > d0; µ0 + γ). Compare to the power of a test: POW(γ) = Pr(d > cα; µ0 + γ) the N-P cut-off cα. 28
  • 29. FEV(ii) in terms of Π(γ) P-value is modest (not small): Since the data accord with the null hypothesis, FEV directs us to examine the probability of observing a result more discordant from H0 if µ = µ0 +γ: If Π(γ) = Pr(d > d0; µ0 + γ) is very high, the data indicate that µ< µ0 +γ. Here Π(γ) gives the severity with which the test has probed the discrepancy γ. 29
  • 30. FEV (ia) in terms of Π(γ) If Π(γ) = Pr(d > do; µ0 +γ) = moderately high (greater than .3, .4, .5), then there’s poor grounds for inferring µ > µ0 + γ. This is equivalent to saying the SEV(µ > µ0 +γ) is poor. 30
  • 31. FEV/SEV (for Excur 3 Tour III) Test T+: Normal testing: H0: µ < µ0 vs. H1: µ > µ0 σ known (FEV/SEV): If d(x) is statistically significant (P- value very small), then test T+ passes µ > M0 – k σ/√n with severity (1 – ). (FEV/SEV): If d(x) is not statistically significant (P- value moderate), then test T+ passes µ < M0 + k σ/√n with severity ( 1 – ), where P(d(X) > k) = . 31

Editor's Notes

  1. Even larger for μ < 150.5
  2. Even larger for μ < 150.5
  3. Even larger for μ < 150.5
  4. Even larger for μ < 150.5
  5. Even larger for μ < 150.5
  6. Even larger for μ < 150.5
  7. Even larger for μ < 150.5
  8. (source of Bayes/Fisher disagreement)
  9. Even larger for μ < 150.5
  10. Even larger for μ < 150.5
  11. Even larger for μ < 150.5
  12. Even larger for μ < 150.5