Theory
1 A statistical hypothesis is
T F  a random variable
T F  a pivotal quantity
T F  an assertion about the unknown distribution of one or more random variables
2 Given an (absolutely) continuous random variable X ∼ F(x), with F(x) unknown, let X1,...,Xn be a sample from X and Fn(x) the empirical distribution function. The Kolmogorov-Smirnov test is used to verify
T F  the hypothesis H0 : X ∼ F0 vs Ha : H0 is false, where F0(x) is the hypothesized distribution for X
T F  the hypothesis H0 : X has the Kolmogorov distribution vs Ha : X has the Smirnov distribution
T F  the hypothesis H0 : X ∼ F0 vs Ha : H0 is false, using the test statistic sup_{x ∈ R} |Fn(x) − F0(x)|
3 In a hypothesis test the p-value is
T F  the significance 1−γ
T F  the value of the test statistic
T F  the probability of rejecting the true hypothesis
T F  the smallest significance level with which you could reject H0 when it is true
4 In a hypothesis test the significance level α is
T F  the value of the test statistic
T F  the probability with which you could reject H0 when H0 is true
T F  the probability with which you could accept H0 when H0 is false
5 For a given experiment, s mutually exclusive results Si (i=1,...,s and s≥2) are the only possible ones, each with hypothesized probability πi=P[Si]; π1+...+πs=1. Repeating the experiment n times, the result Si will be observed Ni times. Obviously N1+...+Ns=n. Let ni be the observed value of Ni and let [...] be the chi-squared statistic with s−1 degrees of freedom. Answer correctly.
T F  the values [...] represent the observed frequencies of the result Si
T F  the statistic [...] is approximately normal when n is large
T F  [...]
T F  the condition [...] is sufficient to ensure that [...] (p. 8)
6 Given two simple hypothesis tests, T1 and T2, to decide between the null hypothesis H0 and the alternative Ha, with type I and type II error probabilities respectively equal to α1, β1 for T1 and α2, β2 for T2, then T1 is preferable to T2 if:
T F  α1 < α2 and β1 < β2
T F  α1 = α2 and β1 > β2
T F  α1 = α2 and β1 < β2
T F  α1 > α2 and β1 < β2
Part B
1. Let U1 and U2 be two independent uniform random variables on [0, 1].
1.1 Find the joint density fU(u1,u2) of the vector U=(U1,U2)^T and draw its graph.
1.2 Let T=U1+U2. We denote by FT(t) the cumulative distribution function of T. Show that, if 0≤t≤1, then FT(t) is the volume of the blue solid drawn in the picture. Calculate this volume.
Hint: Calculate P[U1+U2≤t], where the favorable points (u1,u2) are such that
{(u1,u2): 0≤u1≤1; 0≤u2≤1; 0≤u1+u2≤t}.
1.3 Now, it will be easy to show that: [...]
Calculate the probability that [...]
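As a check on items 1.2 and 1.3, the distribution function of the sum of two independent uniform variables on [0, 1] has a standard closed form (the triangular distribution). The Python sketch below writes it out and compares it with a simulation. The function names are illustrative and not part of the assignment, and the formula in the source text of item 1.3 is not available, so this is the textbook result rather than a transcription.

```python
import random


def cdf_sum_two_uniforms(t: float) -> float:
    """CDF of T = U1 + U2 for independent U1, U2 ~ Uniform[0, 1]."""
    if t <= 0.0:
        return 0.0
    if t <= 1.0:
        # Area of the triangle {u1 + u2 <= t} inside the unit square,
        # times the constant joint density 1.
        return t * t / 2.0
    if t <= 2.0:
        # Complement of the triangle {u1 + u2 > t}.
        return 1.0 - (2.0 - t) ** 2 / 2.0
    return 1.0


def monte_carlo_cdf(t: float, trials: int = 200_000, seed: int = 0) -> float:
    """Estimate P[U1 + U2 <= t] by simulation (illustrative check only)."""
    rng = random.Random(seed)
    hits = sum(rng.random() + rng.random() <= t for _ in range(trials))
    return hits / trials


if __name__ == "__main__":
    for t in (0.5, 1.0, 1.5):
        print(t, cdf_sum_two_uniforms(t), monte_carlo_cdf(t))
```

The simulated values should agree with the closed form to about two decimal places.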
Now let
0.523 0.556 0.994 1.000 1.052 1.087 1.154 1.322 1.485 1.549 (1)
be 10 observations from an unknown population X.
1.4 Write the sample distribution function Fn(x) for the observations (1).
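The sample distribution function is the step function Fn(x) = (number of observations ≤ x) / n. A minimal Python sketch for the observations (1) follows. The names OBSERVATIONS and empirical_cdf are illustrative.

```python
from bisect import bisect_right

# The 10 observations labelled (1) in the text.
OBSERVATIONS = [0.523, 0.556, 0.994, 1.000, 1.052,
                1.087, 1.154, 1.322, 1.485, 1.549]


def empirical_cdf(sample, x):
    """Return Fn(x): the fraction of sample values that are <= x."""
    ordered = sorted(sample)
    # bisect_right counts how many ordered values are <= x.
    return bisect_right(ordered, x) / len(ordered)


if __name__ == "__main__":
    for x in (0.5, 0.523, 1.0, 1.2, 2.0):
        print(x, empirical_cdf(OBSERVATIONS, x))
```

Fn jumps by 1/10 at each observation, so it equals 0 below 0.523 and 1 from 1.549 onwards.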
We want to decide about the (nonparametric) test
H0: FX(x) = FT(x) vs Ha: FX(x) ≠ FT(x) (2)
using the Kolmogorov-Smirnov technique.
1.5 To this end, calling xk, with k=1,...,10, the observations, we first need to complete the following table.
[Table to complete; surviving entries: 0.194, |FT(x6) − Fn(x5)|, 0.002, |FT(x10) − 1|]
1.6 Which of the values in the table is used to continue the test?
1.7 Using the table of the Kolmogorov random variable (see the end of this assignment), decide about the test (2) at a significance level α=0.05.
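The computation behind items 1.5 to 1.7 can be sketched in Python as follows. The sketch assumes the standard closed form for the distribution function of T = U1 + U2. It also assumes a tabulated critical value of 0.409 for n = 10 and α = 0.05, which should be checked against the Kolmogorov table of this assignment. All names are illustrative.

```python
OBSERVATIONS = [0.523, 0.556, 0.994, 1.000, 1.052,
                1.087, 1.154, 1.322, 1.485, 1.549]


def cdf_sum_two_uniforms(t):
    """Assumed F_T: CDF of the sum of two independent Uniform[0, 1]."""
    if t <= 0.0:
        return 0.0
    if t <= 1.0:
        return t * t / 2.0
    if t <= 2.0:
        return 1.0 - (2.0 - t) ** 2 / 2.0
    return 1.0


def ks_statistic(sample, cdf):
    """Return sup_x |Fn(x) - F0(x)| for a continuous hypothesized cdf."""
    ordered = sorted(sample)
    n = len(ordered)
    d = 0.0
    for k, x in enumerate(ordered, start=1):
        f0 = cdf(x)
        # Fn jumps from (k-1)/n to k/n at x, so both sides are compared.
        d = max(d, abs(f0 - k / n), abs(f0 - (k - 1) / n))
    return d


D_OBS = ks_statistic(OBSERVATIONS, cdf_sum_two_uniforms)
CRITICAL_VALUE = 0.409  # assumption: table entry for n = 10, alpha = 0.05
REJECT_H0 = D_OBS > CRITICAL_VALUE

if __name__ == "__main__":
    print(f"D = {D_OBS:.3f}, reject H0: {REJECT_H0}")
```

Under these assumptions the largest deviation is about 0.294, attained at x3 = 0.994 against Fn(x2) = 0.2. That is below the assumed critical value, so H0 would not be rejected at α = 0.05.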
2 The company Startrip declares that the distance its new Enterprise99 spaceship covers with 1 cubic meter (m³) of liquid hydrogen is normally distributed with mean μ0 greater than or equal to 20 parsecs (1 parsec = 3.261563777 light years), and does not disclose the variance. The New NASA, to verify Startrip's declaration, obtains a preview of 5 different units of this rare ship model and tests them just outside the Milky Way, noting the following average distances traveled (in parsecs) with 1 m³ of liquid hydrogen:
15.7 18.6 20.1 21.5 19.1.
2.1 Using these results, verify Startrip's declaration H0: μ ≥ μ0 = 20 vs the alternative Ha: μ < μ0, at a significance level α=1−γ=0.05.
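With the variance unknown and a normal population, item 2.1 is a one-sided, one-sample Student t test with n − 1 = 4 degrees of freedom. The Python sketch below computes the statistic. The critical value 2.132 for t(.95) with 4 degrees of freedom is taken from a standard table, not from the partial table reproduced at the end of this assignment, and the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

# Distances (parsecs per m^3 of hydrogen) observed by the New NASA.
distances = [15.7, 18.6, 20.1, 21.5, 19.1]
mu0 = 20.0

n = len(distances)
x_bar = mean(distances)
s = stdev(distances)            # sample standard deviation (n - 1 divisor)
t_obs = (x_bar - mu0) / (s / sqrt(n))

T_CRIT = 2.132                  # assumption: t(.95) with 4 df, standard table
reject_h0 = t_obs < -T_CRIT     # left-tailed test: reject for small t

if __name__ == "__main__":
    print(f"mean = {x_bar:.2f}, s = {s:.3f}, t = {t_obs:.3f}, reject: {reject_h0}")
```

Here the sample mean is 19.0 and the statistic is about −1.04, which is not below −2.132, so under these assumptions the data do not contradict Startrip's declaration at the 5% level.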
At the official presentation of the Enterprise99, Startrip announced that the mean μ0 of parsecs traveled with 1 m³ of hydrogen is exactly 20, with variance σ² = 4 parsec². Captain Kirk, who is interested in purchasing an Enterprise99, has heard that, on average, these ships travel 15 parsecs per m³ of hydrogen. He decides to buy an Enterprise99 (i.e., to accept the hypothesis H0: μ = μ0 = 20 vs the alternative H1: μ = 15) if the mean of parsecs traveled with 1 m³ of hydrogen by the first 50 Enterprise99 purchased by the Intergalactic Organization exceeds 18.
2.2 If x̄50 is the mean of the distances traveled by the 50 units purchased by the Intergalactic Organization, indicate the acceptance and rejection regions (i.e., where the observed sample mean must fall to accept or reject H0) under this strategy.
2.3 Calculate the probability α of a type I error and the probability β of a type II error under this strategy.
2.4 If x̄50 = 18.7, what error can Captain Kirk commit?
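Under Kirk's strategy, H0 is accepted when the sample mean exceeds 18 and rejected otherwise, so both error probabilities are normal tail areas for the mean of 50 units. The Python sketch below assumes that the variance σ² = 4 holds under both H0 and H1 (the text states it only for Startrip's announcement). The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu0, mu1 = 20.0, 15.0   # means under H0 and H1
sigma = 2.0             # sqrt(4 parsec^2); assumed the same under H1
n = 50
threshold = 18.0        # accept H0 when the sample mean exceeds this

se = sigma / sqrt(n)    # standard deviation of the sample mean

# Type I error: reject H0 (mean <= 18) although mu = 20.
alpha = NormalDist(mu0, se).cdf(threshold)
# Type II error: accept H0 (mean > 18) although mu = 15.
beta = 1.0 - NormalDist(mu1, se).cdf(threshold)

# Item 2.4: an observed mean of 18.7 falls in the acceptance region,
# so the only error Kirk can make is a type II error.
accepts_h0 = 18.7 > threshold

if __name__ == "__main__":
    print(f"se = {se:.4f}, alpha = {alpha:.3e}, beta = {beta:.3e}")
```

The threshold lies about 7 standard errors below 20 and more than 10 standard errors above 15, so both α and β come out negligibly small.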
3. William Sealy Gosset, tired of working for Guinness, where he must publish his statistical work secretly under the pseudonym of Student, decides to quit and start selling machinery for filling beer bottles. His first customer is obviously Guinness itself, to which he offers two different machines, M1 and M2, which fill the bottles with an average content of 33 cl, but with an accuracy that may not be the same. Guinness, having to buy one of the two machines, compares the performance of M1 and M2.
Let X be the quantity of beer that M1 releases into a bottle. Using 61 bottles for the test, we obtain the following results:
x̄61 = 33.04 cl, s²X = 0.0862 cl², nX = 61
Let Y be the quantity of beer that M2 releases into a bottle. Using 41 bottles for the test, we obtain the following results:
ȳ41 = 33.07 cl, s²Y = 0.0885 cl², nY = 41
where, as usual, x̄61, ȳ41 and s²X, s²Y denote the observed sample means and sample variances.
Given the results of the observations, Guinness bets that M1 is more precise than M2 (i.e. σ²X < σ²Y).
3.1 Wishing to verify this bet with a test, Guinness sets up a parametric test, taking as alternative hypothesis Ha: σ²X < σ²Y. Accordingly, state precisely the null hypothesis and the test statistic, writing the formulas in terms of the variances of X and Y, its distribution (justifying the choice) and the corresponding degrees of freedom.
3.2 Indicate the critical region of level α=1−γ and the type of test.
3.3 Evaluate the p-value of the data for this test and draw a conclusion (hint: F0.75,40,60 = 1.22).
3.4 Is the conclusion strong or weak, and why?
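One way to set up items 3.1 to 3.3, assuming X and Y are normal and independent, is the variance-ratio test: with equal variances, S²Y / S²X follows a Fisher distribution with (nY − 1, nX − 1) = (40, 60) degrees of freedom, and large values support σ²X < σ²Y. The Python sketch below computes the ratio and uses the hint quantile to bound the p-value. It is a sketch under these assumptions, not the official solution, and the variable names are illustrative.

```python
# Observed sample variances (cl^2) and sample sizes from the text.
s2_x, n_x = 0.0862, 61   # machine M1
s2_y, n_y = 0.0885, 41   # machine M2

# Variance ratio; large values favour the alternative sigma_X^2 < sigma_Y^2.
f_obs = s2_y / s2_x
df_num, df_den = n_y - 1, n_x - 1   # (40, 60)

# Hint from the text: the 0.75-quantile of F(40, 60).
F_075_40_60 = 1.22

# If f_obs is below the 0.75-quantile, then P[F >= f_obs] > 0.25.
p_value_exceeds_025 = f_obs < F_075_40_60

if __name__ == "__main__":
    print(f"F = {f_obs:.4f} on ({df_num}, {df_den}) df; "
          f"p-value > 0.25: {p_value_exceeds_025}")
```

The observed ratio is about 1.027, below the 0.75-quantile 1.22, so the p-value exceeds 0.25 and the data give no real evidence that M1 is more precise than M2.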
Guinness would therefore buy M1. But Gosset, who is a statistical expert, argues that M1 and M2 have the same accuracy. He therefore suggests that Guinness use the same data for another test, taking as null hypothesis the fact that M1 and M2 have the same precision (i.e. σ²X = σ²Y).
3.5 Set up the statistical test in this case, indicating the critical region of level α=1−γ.
3.6 Evaluate the p-value of the data for this test and draw a conclusion (hint: the value F0.5,60,40 = 1.0056 may be useful).
3.7 Is the conclusion strong or weak, and why?
3.8 Considering that M1 costs more than M2, which machine is the better choice for Guinness?
Partial table of the 0.5-quantile of the Fisher distribution.
[Table not recoverable; surviving labels: 0.5, num, den]
Table of the random variable K.
For a fixed value of n < 35 and for a level 1−γ of the test, the cell at the intersection of n and 1−γ contains the value with which the observed value should be compared. The last row refers to large values of n: if n > 35, compare with the value obtained by dividing the numerators of the fractions in the last row by √n.
Partial table of the Student t distribution.
df = n   t(.995)   t(.99)   t(.975)   t(.95)   t(.9)   t(.75)
1        63.657    31.821   12.706    6.314    3.078   1
2        9.925