Reading Neyman's 1933

THE MOST EFFICIENT TESTS OF STATISTICAL HYPOTHESES

THE MOST EFFICIENT TESTS OF
STATISTICAL HYPOTHESES

BY Neyman Pearson

2013-01-28


Table of contents
1 Introductory


Table of contents
1 Introductory
2 Outline of General Theory
Two types errors of hypothesis testing
The likelihood principle


Table of contents
1 Introductory
3 Simple Hypotheses
General problems and solutions
Illustrate Examples


Table of contents
1 Introductory
3 Simple Hypotheses
Illustrate Examples
4 Composite Hypotheses-H0 has One Degree of Freedom
Illustrative Examples


Table of contents
1 Introductory
3 Simple Hypotheses
Illustrate Examples
5 Composite Hypotheses with C Degrees of Freedom


Table of contents
1 Introductory
3 Simple Hypotheses
Illustrate Examples
5 Composite Hypotheses with C Degrees of Freedom
6 Conclusion

Introductory

Introductory

The problem of testing statistical is an old one.We appear to find
disagreement between statisticians of ”Whether there is efficient
tests or not?”
In this article,we will talk about this problem and accept the words
” an efficient test of the hypothesis H0 ”. Which I have to refer is
that, here, ”efficient” means ”without hoping to know whether each
separate hypothesis is true or false, we may search for rules to givern
our behaviours with regard to them, in the long run of experience,
we shall not be too often wrong.”

Outline of General Theory



Deﬁnition (Type I error)
Occurs when the null hypothesis (H0 ) is true, but is rejected.

Deﬁnition (Type II error)
Occurs when the null hypothesis is false, but erroneously fails to be
rejected.

Table: summary of possible results of any hypothesis test

Reject H0 Don t reject H0
H0 T ype I Error Right decision
H1 Right decision T ype II Error


Suppose that the nature of an Event,E, is exactly described by
the values of n variable,

x1 , x2 , ...xn (2.1)

Suppose now that there exists a certain hypothesis, H0 , con-
cerning the origin of the event which is such as to determine
the probability of occurrence of every possible event E. Let

p0 = p0 (x1 , x2 , ...xn ) (2.2)

To obtain the probability that the event will give a sample point
falling into a particular region, say ω, we shall have either to
take the sum of (2.1) over all sample points included in ω, or
to calculate the integral,

P0 (ω) = ... p0 (x1 , x2 , ...xn )dx1 dx2 ...dxn (2.3)
ω


we shall assume that the sample points may fall anywhere within
a continuous sample space (which may be limited or not),which
we shall denote by W. It will follow that

P0 (W ) = 1 (2.4)


the criterion of likelihood

If H0 is a simple hypothesis,denoted Ω the set of all simple hy-
potheses.
Take any sample point,Σ, with co-ordinates (x1 , x2 , ...xn ) and con-
sider the set AΣ , of probabilities pH (x1 , x2 , ...xn ) corresponding to
this sample point and determined by diﬀerent simple hypotheses
belonging to Σ. We shall suppose that whatever be the sample
point the set AΣ is bounded. Denote pΩ (x1 , x2 , ...xn ) the upper
bound of the set AΣ , then if H0 determine the elementary probabil-
ity p0 (x1 , x2 , ...xn ), we have deﬁned its likelihood to be:

p0 (x1 , x2 , ...xn )
λ= (2.5)
pΩ (x1 , x2 , ...xn )


the criterion of likelihood

If H0 is a composite hypothesis, then it will be possible to specify
a part of the set Ω, say ω, such that every simple hypothesis
belonging to the sub-set ω
Analogously, we denote AΣ (ω)the sub-set of AΣ , corresponding to
the set ω of simple hypotheses belonging to H0 , then the likelihood
is
pω (x1 , x2 , ...xn )
λ= (2.6)
pΩ (x1 , x2 , ...xn )


The use of the principle of likelihood in testing hypotheses, consists
in accepting for critical regions those determined by the inequality:

λ ≤ C = const (2.7)


Example of sample space of two dimensions

Fig:2.1


Key Point
As we know,if there are may alternative tests for the same
hypothesis, the diﬀerence between them consists in the diﬀerence
in critical regions.That is to say, our problem would become
picking out the best critical regions for the same P0 (ω) = ε, the
solution will be the maximum of P1 (ω)

For example, assume ω1 and ω2 are two critical regions described in
the Fig:2.1 , if,
P1 (ω2 ) > P1 (ω1 )
Then, ω2 is better than ω1 .

Simple Hypotheses

Simple Hypotheses

Simple Hypotheses

We shall now consider how to ﬁnd the best critical region for H0 ,
with regard to a single alternative H1 , this will be the region ω0 for
which P1 (ω0 ) is a maximum subject to the condition that,

P0 (ω0 ) = (3.1)

Suppose the probability laws for H0 and H1 , namely, p0 (x1 , x2 , ...xn )
and p1 (x1 , x2 , ...xn ),exist, are continuous and not negative through-
out the whole sample space W ; further that,

P0 (W ) = P1 (W ) = 1 (3.2)

The problem will consist in ﬁnding an unconditioned minimum of
the expression

P0 (ω0 ) − k ∗ P1 (ω0 ) = ... {p0 (x1 , x2 , ...xn ) − k ∗ p1 (x1 , x2 , ...xn )}dx1 dx2 ...dxn
ω0
(3.3)
where k being a constant afterwards to be determined by the con-
dition(3.1)

Simple Hypotheses

We shall now show that the necessary and suﬃcient condition for a
region ω0 , being the best critical region for H0 , which is deﬁned by
the inequality,

p0 (x1 , x2 , ...xn ) ≤ k ∗ p1 (x1 , x2 , ...xn ) (3.4)

Simple Hypotheses

Denote by ω0 the region deﬁned by (3.4) and let ω1 be any other
region satisfying the condition P0 (ω1 ) = P0 (ω0 ) = ε. These regions
may have a common part ω01 , which represented in Fig:3.1

Fig:3.1

It follow that,
P0 (ω0 − ω01 ) = ε − P0 (ω01 ) = P0 (ω1 − ω01 ) (3.5)
and consequently,
k ∗ P1 (ω0 − ω01 ) ≥ P0 (ω0 − ω01 ) = P0 (ω1 − ω01 ) ≥ k ∗ P1 (ω1 − ω01 )
(3.6)
If we add k ∗ P1 (ω01 ) to both side,we obtain,
P1 (ω0 ) ≥ P1 (ω1 ) (3.7)

Simple Hypotheses
Illustrate Examples

Case of the Normal Distribution )Example 1
Suppose that it is known that a sample of n
individuals,x1 , x2 , ...xn , has been drawn randomly from some
normally distribution with standard deviation σ = σ0 . Let x and s
be the mean and standard deviation of the sample.
It is desired to test the hypothesis H0 that whether the mean in
the sampled population is a = a0 or not.
Then the probabilities of its occurrence determined by H0 and by
H1 will be,
(x − a0 )2 + s2
−n
1 2σ0 2
p0 (x1 , x2 , ...xn ) = ( √ )n e (3.8)
σ0 2π
(x − a1 )2 + s2
−n
1 2σ0 2
p1 (x1 , x2 , ...xn ) = ( √ )n e (3.9)
σ0 2π

Simple Hypotheses
Illustrate Examples

The criterion in this case becomes,

(x − a0 )2 + (x − a1 )2
−n
p0 2
2σ0
=e =k (3.10)
p1
From this it follows that the best critical region for H0 with regard
to H1 , deﬁned by the inequality (2.7),

1 σ2
(a0 − a1 )x ≤ (a2 − a2 ) + 0 log k = (a0 − a1 )x0 (say) (3.11)
0 1
2 n
The critical value for the best critical value will be,

x = x0 (3.12)

Simple Hypotheses
Illustrate Examples

Now two cases will ariseµ
(1)a1 < a0 , then the region is defined by, x ≤ x0
(2)a1 > a0 , then the region is defined by, x ≥ x0
where x0 the statistic which is easy to decide in the practice.

Conclusion
The test obtained by finding the best critical region is in fact the
ordinary test for the significance of a variation in the mean of a
sample; but the method of approach helps to bring out clearly the
relation of the two critical regions x ≤ x0 and x ≥ x0 . Further, the
test of this hypothesis could not be improved by using any other
form of criterion or critical region.

Simple Hypotheses
Illustrate Examples

Case of the Normal Distribution )Example 2
Suppose that it is known that a sample of n
individuals,x1 , x2 , ...xn , has been drawn randomly from some
normally distribution with mean a = a0 = 0. Let x and s be the
mean and standard deviation of the sample.
It is desired to test the hypothesis H0 that whether the standard
deviation in the sampled population is σ = σ0 or not.

Analogously,it is easy to show that the best critical region with regard
to H1 is deﬁned by the inequality,
n
1
(x2 )(σ0 − σ1 ) = (x2 + s2 )(σ0 − σ1 ) ≤ v 2 (σ0 − σ1 ) (3.13)
i
2 2 2 2 2 2
n
i=1

where v is a constant depending on ε, σ0 , σ1

Simple Hypotheses
Illustrate Examples

Again two cases will arise,
(1)σ1 < σ0 , then the region is defined by, x2 + s2 ≤ v 2
(2)σ1 > σ0 , then the region is defined by, x2 + s2 ≥ v 2

In this case, there are two ways to define the criterion,suppose that
m2 = x2 + s2 and ψ = χ2 .Then,

m2 = σ0 ψ/n, d = n
2
(3.14)

s2 = σ0 ψ/n − 1, d = n − 1
2
(3.15)

Simple Hypotheses
Illustrate Examples

It is of interest to compare the relative eﬃciency of the two criterions
m2 and s2 in reducing errors of the second type. If H0 is false,
suppose the true hypothesis to be H1 relating to a population in
which
σ1 = hσ0 > σ0 (3.16)
In testing H0 with regard to the class of alternatives for which σ >
σ0 , we should determine the critical value of ψ0 , so that
+∞
P0 (ψ ≥ ψ0 ) = p(ψ)dψ = ε (3.17)
ψ0

Simple Hypotheses
Illustrate Examples

When H1 is true, the Type II Error will be P1 (ψ ≤ ψ0 ), and this
value will be diﬀerent according to the two diﬀerent criterions. Now
suppose that ε = 0.01 and n=5, the position is shown in Fig:3.2

Fig:3.1

Simple Hypotheses
Illustrate Examples

(1)Using m2 , with degrees of freedom 5, ﬁnd that ψ0 = 15.086,
thus the Type II Error will be,

if h = 2, σ1 = 2σ0 , P1 (ψ ≤ ψ0 ) = 0.42
(3.18)
if h = 3, σ1 = 3σ0 , P1 (ψ ≤ ψ0 ) = 0.11

(2)Usings2 , with degrees of freedom 4, ﬁnd that ψ0 = 13.277, thus
the Type II Error will be,

if h = 2, σ1 = 2σ0 , P1 (ψ ≤ ψ0 ) = 0.49
(3.19)
if h = 3, σ1 = 3σ0 , P1 (ψ ≤ ψ0 ) = 0.17

Conclusion
The second test is better, and the choice of criterion can
distinguish test.

Composite Hypotheses-H0 has One Degree of Freedom

Composite Hypotheses-H0 has One
Degree of Freedom


Definition
Suppose that W a common sample space for any admissible
hypothesis Ht , p0 (x1 , x2 , ...xn ) is the probability law for this
sample who is dependent upon the value of c+d parameters
α(1) , α(2) , ...α(c) ; α(c+1) ...α(c+d) (d parameters are specified and c
unspecified), ω is a sub-set of simple hypothesis for testing a
composite hypothesis H0 , such that

P0 (ω) = ... p0 (x1 , x2 , ...xn )dx1 dx2 ...dxn = constant = ε
ω
(4.1)
and P0 (ω) is independent of all the values of α(1) , α(2) , ...α(c) .
We
shall speak of ω as a region of ”size” ε, similar to W with regard
to the c parameters.


Similar regions
We shall commence the problem with simple case for which the
degree of freedom of parameters is one.
we suppose that the set Ω of admissible hypotheses deﬁnes the
functional form of the probability law for a given sample, namely

p(x1 , x2 , ...xn ) (4.2)

but that this law is dependent upon the values of 1+d parameters
(2) (d+1)
α(1) ; α0 , ...α0 ; (4.3)

A composite hypothesis, H0 , of one degrees of freedom, its proba-
bility law is denoted by

p0 = p0 (x1 , x2 , ...xn ) (4.4)


Conditions of the problem of similar regions
We have been able to solve the problem of similar regions only under
very limiting conditions concerning p0 . These are as follows :
(a) p0 is indefinitely differentiable with regard to α(1) for all values
of α(1) and in every point of W, except perhaps in points forming a
∂ k p0
set of measure zero. That is to say, we suppose that exists
∂(α(1) )k
for any k = 1, 2, ...and is integrable over the region W.Denote by,
∂ log p0 1 ∂p0 ∂φ
φ= = ;φ = (4.5)
∂α(1) p0 ∂α(1) ∂α(1)
(b)The function p0 satisfies the equation

φ = A + Bφ, (4.6)

where the coefficients A and B are functions of α(1) but are inde-
pendent of x1 , x2 , ...xn .


Similar regions for this case

If the probability law p0 satisﬁes the two conditions (a) and (b),
then it follows that a necessary and suﬃcient condition for ω to be
similar to W with regard to α(1) is that

∂ k P0 (ω) ∂ k p0
= ... dx1 dx2 ...dxn , k = 1, 2, ... (4.7)
∂(α(1) )k ω ∂(α(1) )k

When k=1,
∂p0
= p0 φ (4.8)
∂α(1)
then,
∂P0 (ω)
= ... p0 φdx1 dx2 ...dxn = 0 (4.9)
∂α(1) ω


By the same way, when k=2

∂ 2 p0 ∂ 2
(1) )2
= (1)
(p0 φ) = p0 (φ2 + φ ), (4.10)
∂(α ∂α

∂ 2 P0 (ω)
= ... p0 (φ2 + φ )dx1 dx2 ...dxn = 0 (4.11)
∂(α(1) )2 ω

Using (4.6),we may have

∂ 2 P0 (ω)
= ... p0 (φ2 + A + Bφ)dx1 dx2 ...dxn = 0 (4.12)
∂(α(1) )2 ω

Having regard to (4.6) and (4.10), it follows that

... p0 φ2 dx1 dx2 ...dxn = −Aε = εψ2 (α(1) )(say) (4.13)
ω


When k=3, by diﬀerentiating(4.12)
∂ 3 P0 (ω)
= ... p0 (φ3 + 3Bφ2 + (3A + B 2 + B )φ + A + AB)dx1 dx2 ...dxn = 0
∂(α(1) )3 ω
(4.14)
Again,having regard to (4.6),(4.10) and (4.13), it follows that

... p0 φ3 dx1 dx2 ...dxn = (3AB−A −AB)ε = εψ3 (α(1) )(say)
ω
(4.15)
Using the method of induction, en general, we shall ﬁnd,

... p0 φk dx1 dx2 ...dxn = εψk (α(1) )(say) (4.16)
ω

where ψk is a function of α(1) but independent of the sample.
Since ω is similar to W, it follows that,
1
... p0 φk dx1 dx2 ...dxn = ... p0 φk dx1 dx2 ...dxn , k = 1, 2,
ε ω W
(4.17)


Now we consider the signiﬁcance above in the following way.
Suppose φ = constant = φ1
Then if
P0 (ω(φ)) = ... p0 dω(φ1 ) (4.18)
ω(φ1 )

P0 (W (φ)) = ... p0 dW (φ1 ) (4.19)
W (φ1 )

represent the integral of p0 taken over the common parts, ω(φ) and
W (φ) of φ = φ1 and ω and W respectively, it follows that if ω be
similar to W and of size ε,

P0 (ω(φ)) = εP0 (W (φ)) (4.20)


Choice of the best critical region

Let Ht be an alternative simple hypothesis to H0 . We shall assume
that regions similar to W with regard to α(1) do exist. Then ω0 , the
best critical region for H0 with regard to Ht , must be determined
to maximise

Pt (ω0 ) = ... p0 (x1 , x2 , ...xn )dx1 dx2 ...dxn (4.21)
ω

subject to the condition (4.20) holding for all values of φ.


Then the problem of ﬁnding the best critical region ω0 is reduce to
ﬁnd parts ω0 (φ) of W (φ), which will maximise P (ω(φ)) subject to
the condition (4.20).
This is the same problem that we have treated already when dealing
with the case of a simple hypothesis except that instead of the
regions ω0 and W, what we have are the regions ω0 (φ) and W (φ).
The inequality
pt ≥ k(φ)p0 (4.22)
will therefore determine the region ω0 (φ), where k(φ) is a constant
chosen subject to the condition (4.20)


Example 5
A sample of n has been drawn at random from normal population,
and H0 is the composite hypothesis that the mean in this
population is a = a0 , σ being unspeciﬁed.

(x − a)2 + s2
1 −n
p(x1 , x2 , ...xn ) = ( √ )n e 2σ 2 (4.23)
σ 2π
For H0 ,
α(1) = σ, α(2) = a0 (4.24)
∂p0 n (x − a)2 + s2
φ= =− +n (4.25)
∂σ σ 4σ 3
n (x − a)2 + s2 2n 3
φ = 2
+ 3n 4
=− 2 − φ (4.26)
σ σ σ σ
(1) (2)
H0 is an alternative hypothesis that αt = σ, αt = a0 .


It is seen that the condition pt ≥ kp0 , becomes

(n[x − a)2 + s2 ]) (n[x − a0 )2 + s2 ])
−
1 2σt2 1 −
ne ≥ k ne 2σ 2 (4.27)
σt σ

This can be reduces to,
1 2
x(at −a0 ) ≥ σ log k+0.5(a2 −a2 ) = (at −a0 )k1 (φ)(say) (4.28)
n t t 0

Two cases must be distinguished in determining ω0 (φ)
(a)at > a0 , then x ≥ k1 (φ)
(b)at < a0 , then x ≤ k1 (φ)
where k1 (φ) has to be chosen so that (4.20) is satisﬁed.


In the case n=3, the position of ω0 (φ) is indicated in Figµ4.1

Fig:4.1

Then there will be two families of these cones containing the best
critical regions:
(a)For the class of hypotheses at > a0 ; the cones will be in the
quadrant of positive x’s.
(b)For the class of hypotheses at < a0 ; the cones will be in the
quadrant of negative x’s.


conclusion
In the case n > 3,we may either appeal to the geometry of
multiple space,the calculation of this situation may be a little
complicated, but we can get similar result.Further, it has now been
shown that starting with information in the form supposed, there
can be no better test for the hypothesis under consideration.

Composite Hypotheses with C Degrees of Freedom

Composite Hypotheses with C
Degrees of Freedom


Similar regions
We shall now consider a probability function depending upon c pa-
rameters α,this will correspond to a composite hypothesis H0 , with
c degrees of freedom.
Deﬁnition
Let ω be any region in the sample space W,and denote by
P ({α(1) , α(2) , ...α(c) }ω) the integral of p(x1 , x2 , ...xn ) over the
(1) (2) (c)
region ω. Fix any system of values of the α’s,say αA , αA , ...αA ,
If the region ω has the property, that
(1) (2) (c)
P ({αA , αA , ...αA }ω) = ε = constant (5.1)

whatever be the system A, we shall say that it is similar to the
sample space W with regard to the set of parameters
α(1) , α(2) , ...α(c) and of size ε.



Definition
Let
fi (α, x1 , x2 , ...xn ) = Ci (i = 1, 2, ...k < n), (5.2)
be the equations of certain hypersurfaces, α and Ci being
parameters. Denote by S(α, C1 , C2 , ...Ck ) the intersection of these
hypersurfaces, and F (α) denote the family of S(α, C1 , C2 , ...Ck )
corresponding to a fixed values α’ and to different systems of
values of the C’s.Take any S(α(1) , C1 , C2 , ...Ck ) from any family
F (α1 ). If whatever be α2 it is possible to find suitable values of
the C’s, for example C1 , C2 , ...Ck , such that S(α(1) , C1 , C2 , ...Ck )
is identical with S(α(1) , C1 , C2 , ...Ck ), then we shall say that the
family F (α) is independent of α.


It is now possible to solve the problem of ﬁnding regions similar to
W with regard to a set of parameters α(1) , α(2) , ...α(c) , we are at
still only able to do so under rather limiting conditions.
∂ k p0
(a) We shall assume the existence of in every point of the
∂(α(i) )k
sample space.

(b)Writing

∂ log p0 1 ∂p0 ∂φi
φi = = ;φ = ; (5.3)
∂α(i) p0 ∂α(i) i ∂α(i)
it will be assumed that for every i = 1, 2, ... c.
φ i = Ai + B i φ i , (5.4)
Ai and Bi being independent of the x’s.


(c)Further there will need to be conditions concerning the φi =
const.Denote by S(α(i) , C1 , C2 , ...Ci−1 ) the intersection that φj =
Cj for any j=1,2,...i-1, corresponding to fixed values of the α’s and
C’s. F (α(i) ) will denote the family of S(α(i) , C1 , C2 , ...Ci−1 ) corre-
sponding to fixed values of the α’s and to different systems of values
of the C’s.
We shall assume that any family F (α(i) ) is independent of α(i) for
i =2, 3, ...c.


Similar regions

Property
If the above conditions are satisfied, then the necessary and
sufficient condition for ω being similar to W with regard to the set
of parameters α(1) , α(2) , ...α(c) , and of any given size ε is that

... pdω(φ1 , φ2 , ...φc ) = ε ... pdW (φ1 , φ2 , ...φc )
ω(φ1 ,φ2 ,...φc ) W (φ1 ,φ2 ,...φc )
(5.5)
where W (φ1 , φ2 , ...φc ) means the intersection in W of φi = Ci for
any j=1,2,...i-1, corresponding to fixed values of the α’s and C’s.
The symbol ω(φ1 , φ2 , ...φc ) means the part of W (φ1 , φ2 , ...φc )
included in ω.


The Determination of the best critical region

The choice of the best critical regions in this case is similar with we
already discuss before, it is simple to identify that best critical region
ω0 of size ε with regard to a simple alternative Ht must satisfy the
following conditions:
(1) ω0 must be similar to W with regard to the set of parameters
α(1) , α(2) , ...α(c) ,that is to say,

P0 (ω0 ) = ε (5.6)

must be independent of the values of the α’s.
(2)If ν be any other region of size ε similar to W with regard to the
same parameters,
Pt (ω0 ) ≥ Pt (ν) (5.7)


In this way the problem of ﬁnding the best critical region for testing
H0 is reduced to that of maximising

Pt (ω(φ1 , φ2 , ...φc )) (5.8)

under the condition (5.5) for every set of values

φ1 = C1 , φ2 = C2 , ...φc = Cc , (5.9)


Example 6 - Test for the significance of the difference between two variances

We suppose that two samples,
(1) Σ1 of size n1 ,mean = x1 ,standard deviation = s1 .
(2) Σ2 of size n2 ,mean = x2 ,standard deviation = s2 . have been drawn at random
from normal distribution.If this is so, the most general probability law for the observed
event may be written

(x1 − a1 )2 + s2 2 2
1 −n (x2 − a2 ) + s2
−n1 2
1 N 1 2σ 2 2σ 2
p(x1 , x2 , ...xn ; xn +1 , xn +2 , ...xN ) = ( √ ) n n e 1 2
1 1 1 2π σ 1σ 2
1 2
(5.10)

where n1 + n1 = N , and a1 , σ1 are the mean and standard deviation of the first, a2 ,
σ2 are the mean and standard deviation of the second sampled distribution. H0 is the
composite hypothesis that σ1 = σ2 . For an alternative Ht ,the parameters may be
defined as
σ2
α1 = a1 ; α2 = a2 − a1 = bt ; α3 = σ1 ; α4 = = θt . (5.11)
σ1


For the hypothesis to be tested, H0 :

α1 = a; α2 = b; α3 = σ; α4 = 1. (5.12)

H0 it will be seen, is a composite hypothesis with 3 degrees of
freedom, a, b and σ being unspeciﬁed.


We shall now consider whether the conditions (A), (B) and (C) of
the above theory are satisfied.
(A) The first condition is obviously satisfied.
(B) Making use of (5.10),we find that
√ 1
log p0 = −N log 2π − N log σ − {n1 (x1 − a)2 + n2 (x2 − a − b)2 + n1 s2 + n2 s2 }
1 2
2σ 2
(5.13)
∂ log p0 1
φ1 = = 2 {n1 (x1 − a) + n2 (x2 − a − b)} (5.14)
∂a σ
∂ log p0 1
φ2 = = 2 n2 (x2 − a − b) (5.15)
∂b σ
∂ log p0 N 1
φ3 = = − + N 3 {n1 (x1 − a)2 + n2 (x2 − a − b)2 + n1 s2 + n2 s2 } (5.16)
1 2
∂σ σ σ


We see that

φ1 = A1 , φ2 = A2 , φ3 = A3 + B3 φ3 , (5.17)

where the A’s and B’s are independent of the sample variates, so
that the condition (B) is satisﬁed.
(C) The equation S(α(2) , C1 ) namely,φ1 = C1 ,where C1 is an ar-
bitrary constant, is obviously equivalent to n1 x1 + n2 x2 = C1 de-
pending only upon one arbitrary parameter C1 . Hence F (α(2) ) is in-
dependent of α(2) . Similarly, F (α(3) ) is independent of α(3) . Hence
also the condition (C) is fulﬁlled.


We may now attempt to construct the best critical region ω0 .Its
elements ω0 (φ1 , φ2 , φ3 ) are parts of the W(φ1 , φ2 , φ3 ) satisfying the
system of three equations φi = Ci (i = 1, 2, 3), within which
pt ≥ k(x1 , x2 , sa )p0 (5.18)
1
where sa = n1 (s2 + n2 s2 ) and the value of k being determined
1 2
N
for each system of values of x1 , x2 , sa , so that
P0 (ω0 (φ1 , φ2 , φ3 )) = εP0 (W0 (φ1 , φ2 , φ3 )) (5.19)
The condition (5.19) becomes
−n1 [(x1 − a)2 + s2 ] + n2 [(x2 − a − b)2 + s2 ]
1 2
1
e 2σ 2 ≥
σN
−n1 [(x1 − a1 )2 + s2 ] + n2 θ2 [(x2 − a1 − b1 )2 + s2 ]
1 2
k 2σ12
N
e
σ1 θn2
(5.20)


Since the region determined by (5.19) will be similar to W with
regard to a, b and σ, we may put a = a1 , b = b1 , σ = σ1 , and the
condition (5.20) will be found on taking logarithms to reduce to

n2 {(x2 − a1 − b1 )2 + s2 }(1 − θ2 ) ≤ 2σ1 θ2 (log k − n2 log θ) (5.21)
2
2

This inequality contains only one variable s2 . Solving with regard
2
to s2 we find that the solution will depend upon the sign of the
2
difference 1 − θ2 . Accordingly we shall have to consider separately
the two classes of alternatives,
σ2
θ= > 1; the B.C.R will be defined by s2 ≥ k1 (x1 , x2 , sa )
2
σ1
σ2
θ= < 1; the B.C.R will be defined by s2 ≥ k2 (x1 , x2 , sa )
2
σ1


By some technical calculation, we can arrive at the criterion µ that,

σ2 n2 s 2
2
θ= > 1; µ = ≥ µ0 (5.22)
σ1 n1 s2 + n2 s2
1 2

σ2 n2 s 2
2
θ= < 1; µ = ≤ µ0 (5.23)
σ1 n1 s2 + n2 s2
1 2
where The constant µ0 depends only upon n1 , n2 value of ε chosen.

Conclusion
(1) We see that the best critical regions are common for all the
alternatives.
(2) the criterion µ we have reached,is equivalent to that suggested
on intuitive grounds by FISHER.

Conclusion

Conclusion
1.The conception of the best critical region we talk about in this
article,is the region with regard to the alternative hypothesis Ht ,
for a ﬁxed value of ε, assures the minimum chance of accepting
H0 , when the true hypothesis is Ht .
2.To solve the same problem in the case where the hypothesis
tested is composite, the solution of a further problem is
required;this consists in determining what has been called a region
similar to the sample space with regard to a parameter.
We have been able to solve this problem only under certain
limiting conditions; at present, therefore, these conditions also
restrict the generality of the solution given to the problem of the
best critical region for testing composite hypotheses.

Conclusion

Reference

[1] NEYMAN.”Contribution to the Theory of Certain Test Criteri-
a”,Bull.Inst.int.Statist 1928.
[2] FISHER.”Statistical methods for Research Workers”,London 1932.
[3] PEARSON and NEYMAN.”On the Problem of Two Samples”,Bull.
Acad. Polon. Sci. Lettres 1930.
[4] Lehmann EL.”The Fisher, Neyman and Pearson theories of test-
ing hypotheses: one theory or two? ” J Am Stat Assoc 1993.
[5] Berkson J.”Tests of signiﬁcance considered as evidence”, J Am
Stat Assoc 1942.

Conclusion

THANK YOU!

Reading Neyman's 1933

Recommended

Recommended

More Related Content

What's hot

What's hot (20)

Viewers also liked

Viewers also liked (20)

Similar to Reading Neyman's 1933

Similar to Reading Neyman's 1933 (20)

More from Christian Robert

More from Christian Robert (20)

Recently uploaded

Recently uploaded (20)

Reading Neyman's 1933