1Presented by: Group-E (The Anonymous)
Gagan Puri
Bikram Bhurtel
Rabin B.K
Bimal Pradhan
What is parametric test?
What is non-parametric test?
Difference between parametric and non-parametric test
Chi-square test (χ2)
Run test
Sign test
Kolmogorov Smirnov test
Cochran Q. test
Friedman F. test
References
2
● Parametric tests involve estimation parameters such as the mean
and assume that distribution of sample means are normally
distributed
● Non parametric tests were developed for these situations where
fewer assumptions have to be made
● Some times called distribution free tests
● There are some assumptions in nonparametric test but are of less
stringent
3
S/n Parametric test S/n Non parametric test
1 It specifies certain condition about
parameter of the population from which
sample is selected.
1 It doesn't specifies certain condition about
parameter of the population from which
sample is selected.
2 It is used in testing of hypothesis and
estimation of parameters.
2 It is used in testing of hypothesis but not in
estimation of parameters.
3 Mostly it is used in data measured in
interval and ratio scale.
3 It is used in data measured in nominal and
ordinal scale.
4 It is most powerful 4 It is less powerful
5 It requires complicated sampling
technique
5 It doesn’t require complicated sampling
techinque.
Difference between parametric and non-parametric test
➢Used to test significant difference between observed frequencies and
expected frequencies
➢Let ‘n’ be the observation of the random variable x and classified k (no)
of classes WRT frequencies
➢ Problem to test:
➢
Chi-square test
5
𝐻1: 𝑂𝑖 ≠ 𝐸𝑖
𝐻1: 𝑂𝑖 = 𝐸𝑖
χ2
=
𝑖=1
𝑘
𝑂𝑖 − 𝐸𝑖
2
𝐸𝑖
∼ χ2 𝑘−1
Test statistics
If be the the level of significance and (k-1) be degree of freedom then
critical value is obtained from table
Decision:
then we reject the null hypothesis otherwise accepted
α
If χ 𝑐𝑎𝑙𝑐𝑢𝑙𝑎𝑡𝑒𝑑
2
> χ 𝑡𝑎𝑏𝑢𝑙𝑡𝑒𝑑
2
 Sequence of letter of one kind bounded by letter of other kind is called run
++++++++----------+++++++-------+++---
here no. of run = 6
 Test used for testing the randomness of sequence of sample event
Problem to test:-
Hₒ: The sequence are in random order
Hˌ: The sequence are not in random order
7
For small sample size (n1,n2 ≤ 20)
Test statistics:
number of runs (r)
Level of significance:
α=0.05 unless we are given
Critical value:
ṟ and ȓ obtained from the table according to α, degree of freedom n1 and n2 and
alternative hypothesis
Decision:
accept Hₒ at a level of significance if rϵ(ṟ,ȓ), reject otherwise
8
Example:
HH TTT H T HHH TT HH T
no. of heads (n1)=8, no of tails (n2)= 7, no. of runs (r)=8
Problem to test:
Hₒ : the sequence are in random order
Hˌ : the sequence are not in random order
Test statistics:
r=8
Critical value:
at α=0.05, n1=8 and n2=7, ṟ=4 and ȓ=13
Decision : r=8 ϵ (ṟ=4, ȓ=13), accept Hₒ
Conclusion : the sequence of H and T are in random order.
9
10
For large sample size (n1, n2 ≥ 20)
r is approximately normally distributed with
mean, և =
2n1n2
(n1+n2)
+1 and variance , σ² =
2n1n2(2n1n2−n1−n2)
(n1+n2)²(n1+n2−1)
Test statistics
z=
(r−և)
σ²
~ N(0,1)
Level of significance:
α=0.05, unless we are given
Critical value :
z is obtained from table according to level of significance and alternative
hypothesis
Decision: accept Hₒ if |z |< ztabulated, reject otherwise
11
Example:
46,58,60,56,70,66,48,54,62,41,39,52,45,62,53,69,65,65,67,76,
52,52,59,59,67,51,46,61,40,43,42,77,67,63,59,63,63,72,57,59,
42,56,47,62,67,70,63,66,69 and 73
Here sample size (n)=50, median=59.5
assign A>median, B<median , delete if number=median
Thus, the sample becomes:
BB A B AA BB A BBBB A B AAAAA BBBB A BB A BBB AAA B AAA BBBBB
AAAAAAA
no. of A (n1)=25, no. of B (n2)=25, r=20
և=
2 n1n2
n1+n2
+1 =26 and σ² =
2 n1n2(2 n1n2−n1−n2)
(n1+n2)²(n1+n2−1)
= 12.2
σ= 12.2 = 3.5 12
Problem to test:
Hₒ : sample are in random order
Hˌ : sample are not in random order.
Test statistics:
z=(r-և)/σ =(20-26)/3.5 = -1.71
Critical value:
At α=0.05, ztabulated = 1.96
Decision:
|z|=1.71<ztabulated=1.96, we accept Hₒ at α=0.05
Conclusion:
The sample are in random order.
13
Based on the direction of differences between two measures
Test statistic
• Under H0 test statistic is, X0 = minimum of n(+) or n(-)
Critical region:
• K =
(𝑛−1)
2
− 0.98 𝑛
Decision:
 Null hypothesis is rejected if X0 ≤ K otherwise accepted
Two types of sign test:
 Small sample case (n ≤ 25)
 Large sample case (n > 25)
Sign test
14
Example: A study was designed to determine the effect if a certain movie on
the moral attitude of young children. The data below represents a rating from 0
to 20 on a moral attitude to high morality. Carry out the test hypothesis that
movie had no effect on moral attitude of children against it had using sign test
at level 0.1. [T.U 2069]
15
Before 14 16 15 18 15 17 19 17 17 16 14 15
After 13 18 16 17 16 19 20 18 19 15 18 16
Solution
16
Before 14 16 15 18 15 17 19 17 17 16 14 15
After 13 18 16 17 16 19 20 18 19 15 18 16
Sign + - - + - - - - - + - -
Null Hypothesis (H0): M 𝑑1
= M 𝑑2
i.e. , movie has effect on moral attitude of children
Alternative hypothesis (H1): M 𝑑1
≠ M 𝑑2
, i.e., movie has no effect on moral attitude of
children
Number of + signs (+) = 3
Number of – signs (-) = 9
Test statistic: P0 = min{n(+),n(-)} = min {3,9} = 3
Critical value:
K =
(n−1)
2
− 0.98 n =
(12−1)
2
− 0.98 × 12 = 2.1051
For 0.1 level of significance, P-value(P0) = P(y≤ 3) = 0.073
Decision:
Since, P0=0.073<0.1= K, we reject null hypothesis, i.e., movie has effect on moral
attitude of children
Here, n>25 so
K =
(𝑛−1)
2
− 0.98 𝑛
Alternate of Chi-square test for goodness of fit when sample size
are small
Test statistic:
• D0 = Max F 𝑒 − F 𝑜
Decision:
• Accept H0, if (calculated) D0 < (tabulated) Dmax
17
Kolmogorov Smirnov test
Example: The number of disease infected tomato plants in 10 different plots of
equal size are given below: Test whether the disease infected plants are uniformly
distributed over the entire area using Kolmogorov Smirnov test.
Plot no: 1 2 3 4 5 6 7 8 9 10
No. of infected
plants
8 10 9 12 13 7 5 12 13 9
Two tailed test
H0 = Md = Infected plants are uniformly distributed
H1 ≠ Md = Infected plants are not uniformly distributed
Solution:
18
19
Plot no No. of
infected
plants (f0)
Cf0 Relative
observed
frequency
F0
Expected
frequency
fe
Cfe Expected
relative
frequency
(Fe)
Fe − F0
1 8 8 8/100 10 10 10/100 0.02
2 10 18 18/100 10 20 20/100 0.02
3 9 27 27/100 10 30 30/100 0.03
4 12 39 39/100 10 40 40/100 0.01
5 15 54 54/100 10 50 50/100 0.04
6 7 61 61/100 10 60 60/100 0.01
7 5 66 66/100 10 70 70/100 0.04
8 12 78 78/100 10 80 80/100 0.02
9 13 91 91/100 10 90 90/100 0.01
10 9 100 100/100 10 100 100/100 1
Test statistic:
Calculated D0 = Max Fe − F0 = 0.04 =
4
100
Tabulated D0 = 0.136
Decision:
Since calculated D0 = 0.04 < tabulated D0 = 0.136, we accept null
hypothesis, i.e., the infected plants are uniformly distributed.
20
Since n>40,
tabulated D0 =
1.36
𝑛
=
1.36
100
=
1.36
10
= 0.136
21
Cochran Q test
• Non-parametric test used or more than two related samples.
• Use to test significance difference in frequencies or properties
of given samples.
• Problem to test:-
Hₒ:- All the treatment are equally effective.
Hi :- All the treatment are not equally effective
Test statistics:
Decision:- Reject Hₒ if Q >
Q =
K−1 {k R 𝑖
2
− R2 }
k C 𝑗 − C 𝑗
2 ~ χ2
(𝑘−1)df
Friedman F test
• Used to test significant different between location of three or
more populations.
• Problem to test:-
Hₒ:- Md₁ = Md₂ = Md₃=Mdk
H1:- at least one md is different,
I=1,2,3,……,k
Test statistics:-
Decision:- Accept H0 if p > 𝛼 or reject
F 𝑟 =
12
nk(k+1)
R 𝑖
2
- 3n (k+1)
F 𝑟 =
12
nk(k+1)
R 𝑖
2
− 3n (k+1)
1−
t 𝑖
3 −t 𝑖
𝑛(k3 −𝐾)
References:
Probability and Statistics (Course book)
Teachers note
google.com
wikipedia.com
24
25
Queries

Statistics-Non parametric test

  • 1.
    1Presented by: Group-E(The Anonymous) Gagan Puri Bikram Bhurtel Rabin B.K Bimal Pradhan
  • 2.
    What is parametrictest? What is non-parametric test? Difference between parametric and non-parametric test Chi-square test (χ2) Run test Sign test Kolmogorov Smirnov test Cochran Q. test Friedman F. test References 2
  • 3.
    ● Parametric testsinvolve estimation parameters such as the mean and assume that distribution of sample means are normally distributed ● Non parametric tests were developed for these situations where fewer assumptions have to be made ● Some times called distribution free tests ● There are some assumptions in nonparametric test but are of less stringent 3
  • 4.
    S/n Parametric testS/n Non parametric test 1 It specifies certain condition about parameter of the population from which sample is selected. 1 It doesn't specifies certain condition about parameter of the population from which sample is selected. 2 It is used in testing of hypothesis and estimation of parameters. 2 It is used in testing of hypothesis but not in estimation of parameters. 3 Mostly it is used in data measured in interval and ratio scale. 3 It is used in data measured in nominal and ordinal scale. 4 It is most powerful 4 It is less powerful 5 It requires complicated sampling technique 5 It doesn’t require complicated sampling techinque. Difference between parametric and non-parametric test
  • 5.
    ➢Used to testsignificant difference between observed frequencies and expected frequencies ➢Let ‘n’ be the observation of the random variable x and classified k (no) of classes WRT frequencies ➢ Problem to test: ➢ Chi-square test 5 𝐻1: 𝑂𝑖 ≠ 𝐸𝑖 𝐻1: 𝑂𝑖 = 𝐸𝑖
  • 6.
    χ2 = 𝑖=1 𝑘 𝑂𝑖 − 𝐸𝑖 2 𝐸𝑖 ∼χ2 𝑘−1 Test statistics If be the the level of significance and (k-1) be degree of freedom then critical value is obtained from table Decision: then we reject the null hypothesis otherwise accepted α If χ 𝑐𝑎𝑙𝑐𝑢𝑙𝑎𝑡𝑒𝑑 2 > χ 𝑡𝑎𝑏𝑢𝑙𝑡𝑒𝑑 2
  • 7.
     Sequence ofletter of one kind bounded by letter of other kind is called run ++++++++----------+++++++-------+++--- here no. of run = 6  Test used for testing the randomness of sequence of sample event Problem to test:- Hₒ: The sequence are in random order Hˌ: The sequence are not in random order 7
  • 8.
    For small samplesize (n1,n2 ≤ 20) Test statistics: number of runs (r) Level of significance: α=0.05 unless we are given Critical value: ṟ and ȓ obtained from the table according to α, degree of freedom n1 and n2 and alternative hypothesis Decision: accept Hₒ at a level of significance if rϵ(ṟ,ȓ), reject otherwise 8
  • 9.
    Example: HH TTT HT HHH TT HH T no. of heads (n1)=8, no of tails (n2)= 7, no. of runs (r)=8 Problem to test: Hₒ : the sequence are in random order Hˌ : the sequence are not in random order Test statistics: r=8 Critical value: at α=0.05, n1=8 and n2=7, ṟ=4 and ȓ=13 Decision : r=8 ϵ (ṟ=4, ȓ=13), accept Hₒ Conclusion : the sequence of H and T are in random order. 9
  • 10.
  • 11.
    For large samplesize (n1, n2 ≥ 20) r is approximately normally distributed with mean, և = 2n1n2 (n1+n2) +1 and variance , σ² = 2n1n2(2n1n2−n1−n2) (n1+n2)²(n1+n2−1) Test statistics z= (r−և) σ² ~ N(0,1) Level of significance: α=0.05, unless we are given Critical value : z is obtained from table according to level of significance and alternative hypothesis Decision: accept Hₒ if |z |< ztabulated, reject otherwise 11
  • 12.
    Example: 46,58,60,56,70,66,48,54,62,41,39,52,45,62,53,69,65,65,67,76, 52,52,59,59,67,51,46,61,40,43,42,77,67,63,59,63,63,72,57,59, 42,56,47,62,67,70,63,66,69 and 73 Heresample size (n)=50, median=59.5 assign A>median, B<median , delete if number=median Thus, the sample becomes: BB A B AA BB A BBBB A B AAAAA BBBB A BB A BBB AAA B AAA BBBBB AAAAAAA no. of A (n1)=25, no. of B (n2)=25, r=20 և= 2 n1n2 n1+n2 +1 =26 and σ² = 2 n1n2(2 n1n2−n1−n2) (n1+n2)²(n1+n2−1) = 12.2 σ= 12.2 = 3.5 12
  • 13.
    Problem to test: Hₒ: sample are in random order Hˌ : sample are not in random order. Test statistics: z=(r-և)/σ =(20-26)/3.5 = -1.71 Critical value: At α=0.05, ztabulated = 1.96 Decision: |z|=1.71<ztabulated=1.96, we accept Hₒ at α=0.05 Conclusion: The sample are in random order. 13
  • 14.
    Based on thedirection of differences between two measures Test statistic • Under H0 test statistic is, X0 = minimum of n(+) or n(-) Critical region: • K = (𝑛−1) 2 − 0.98 𝑛 Decision:  Null hypothesis is rejected if X0 ≤ K otherwise accepted Two types of sign test:  Small sample case (n ≤ 25)  Large sample case (n > 25) Sign test 14
  • 15.
    Example: A studywas designed to determine the effect if a certain movie on the moral attitude of young children. The data below represents a rating from 0 to 20 on a moral attitude to high morality. Carry out the test hypothesis that movie had no effect on moral attitude of children against it had using sign test at level 0.1. [T.U 2069] 15 Before 14 16 15 18 15 17 19 17 17 16 14 15 After 13 18 16 17 16 19 20 18 19 15 18 16
  • 16.
    Solution 16 Before 14 1615 18 15 17 19 17 17 16 14 15 After 13 18 16 17 16 19 20 18 19 15 18 16 Sign + - - + - - - - - + - - Null Hypothesis (H0): M 𝑑1 = M 𝑑2 i.e. , movie has effect on moral attitude of children Alternative hypothesis (H1): M 𝑑1 ≠ M 𝑑2 , i.e., movie has no effect on moral attitude of children Number of + signs (+) = 3 Number of – signs (-) = 9 Test statistic: P0 = min{n(+),n(-)} = min {3,9} = 3 Critical value: K = (n−1) 2 − 0.98 n = (12−1) 2 − 0.98 × 12 = 2.1051 For 0.1 level of significance, P-value(P0) = P(y≤ 3) = 0.073 Decision: Since, P0=0.073<0.1= K, we reject null hypothesis, i.e., movie has effect on moral attitude of children Here, n>25 so K = (𝑛−1) 2 − 0.98 𝑛
  • 17.
    Alternate of Chi-squaretest for goodness of fit when sample size are small Test statistic: • D0 = Max F 𝑒 − F 𝑜 Decision: • Accept H0, if (calculated) D0 < (tabulated) Dmax 17 Kolmogorov Smirnov test
  • 18.
    Example: The numberof disease infected tomato plants in 10 different plots of equal size are given below: Test whether the disease infected plants are uniformly distributed over the entire area using Kolmogorov Smirnov test. Plot no: 1 2 3 4 5 6 7 8 9 10 No. of infected plants 8 10 9 12 13 7 5 12 13 9 Two tailed test H0 = Md = Infected plants are uniformly distributed H1 ≠ Md = Infected plants are not uniformly distributed Solution: 18
  • 19.
    19 Plot no No.of infected plants (f0) Cf0 Relative observed frequency F0 Expected frequency fe Cfe Expected relative frequency (Fe) Fe − F0 1 8 8 8/100 10 10 10/100 0.02 2 10 18 18/100 10 20 20/100 0.02 3 9 27 27/100 10 30 30/100 0.03 4 12 39 39/100 10 40 40/100 0.01 5 15 54 54/100 10 50 50/100 0.04 6 7 61 61/100 10 60 60/100 0.01 7 5 66 66/100 10 70 70/100 0.04 8 12 78 78/100 10 80 80/100 0.02 9 13 91 91/100 10 90 90/100 0.01 10 9 100 100/100 10 100 100/100 1
  • 20.
    Test statistic: Calculated D0= Max Fe − F0 = 0.04 = 4 100 Tabulated D0 = 0.136 Decision: Since calculated D0 = 0.04 < tabulated D0 = 0.136, we accept null hypothesis, i.e., the infected plants are uniformly distributed. 20 Since n>40, tabulated D0 = 1.36 𝑛 = 1.36 100 = 1.36 10 = 0.136
  • 21.
  • 22.
    Cochran Q test •Non-parametric test used or more than two related samples. • Use to test significance difference in frequencies or properties of given samples. • Problem to test:- Hₒ:- All the treatment are equally effective. Hi :- All the treatment are not equally effective Test statistics: Decision:- Reject Hₒ if Q > Q = K−1 {k R 𝑖 2 − R2 } k C 𝑗 − C 𝑗 2 ~ χ2 (𝑘−1)df
  • 23.
    Friedman F test •Used to test significant different between location of three or more populations. • Problem to test:- Hₒ:- Md₁ = Md₂ = Md₃=Mdk H1:- at least one md is different, I=1,2,3,……,k Test statistics:- Decision:- Accept H0 if p > 𝛼 or reject F 𝑟 = 12 nk(k+1) R 𝑖 2 - 3n (k+1) F 𝑟 = 12 nk(k+1) R 𝑖 2 − 3n (k+1) 1− t 𝑖 3 −t 𝑖 𝑛(k3 −𝐾)
  • 24.
    References: Probability and Statistics(Course book) Teachers note google.com wikipedia.com 24
  • 25.