This document introduces parametric tests and provides information about the t-test. It defines parametric tests as those applied to normally distributed data measured on interval or ratio scales. Parametric tests make inferences about the parameters of the probability distribution from which the sample data were drawn. Examples of common parametric tests are provided, including the t-test. The t-test is used to compare two means from independent samples or correlated samples. Steps for conducting a t-test are outlined, including calculating the t-statistic and making decisions based on critical t-values. Two examples of using a t-test on experimental data are shown.
CHAPTER 1 INTRODUCTION TO PARAMETRIC TEST
What are parametric tests?
Parametric tests are tests applied to data that are normally distributed and
measured on an interval or ratio scale.
Parametric statistics is a branch of statistics which assumes that the data come
from a type of probability distribution and makes inferences about the parameters
of that distribution.
Most well-known elementary statistical methods are parametric.
Suppose we have a sample of 99 test scores with a mean of 100 and a standard
deviation of 1. If we assume all 99 test scores are random samples from a normal
distribution, we predict there is a 1% chance that the 100th test score will be
higher than 102.365 (that is, the mean plus 2.365 standard deviations), assuming
that the 100th test score comes from the same distribution as the others. The
normal family of distributions all have the same general shape and are
parameterized by mean and standard deviation. That means if you know the mean and
standard deviation, and that the distribution is normal, you know the probability
of any future observation. Parametric statistical methods are used to compute the
2.365 value above, given 99 independent observations from the same normal
distribution.
The parametric tests are:
T-test for independent samples
T-test for Correlated Sample
Z-test for two sample means
Z-test for one sample mean
F-test (ANOVA)
r (Pearson Product Moment Coefficient of Correlation)
$y = a + bx$ (Simple Linear Regression Analysis)
$y = b_0 + b_1 x_1 + b_2 x_2 + \dots + b_n x_n$ (Multiple Regression Analysis)
When do we use parametric tests?
We use parametric tests when:
The distribution is normal, that is, when skewness is equal to zero and the
percentile coefficient of kurtosis equals approximately .263.
The levels of measurement to be analyzed are expressed as interval or ratio
data.
Why do we use parametric tests?
They are more powerful compared to the nonparametric tests.
How do we use parametric tests?
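First, determine whether the data are normally distributed by solving for the
values of skewness and kurtosis using the formulas:
$$Sk = \frac{3(\bar{x} - Md)}{SD}$$
$$Ku = \frac{Q}{P_{90} - P_{10}}, \quad \text{where } Q = \frac{Q_3 - Q_1}{2}$$
Here $\bar{x}$ is the mean, $Md$ the median, $SD$ the standard deviation, $Q_1$ and
$Q_3$ the first and third quartiles, and $P_{10}$ and $P_{90}$ the 10th and 90th
percentiles.
Second, if the skewness is equal to zero and the kurtosis equals approximately
.263, then the distribution is normal.
Third, determine whether the data are expressed as interval or ratio data.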
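The two formulas above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names and the sample `scores` are hypothetical, quartiles and percentiles come from the standard library's `statistics.quantiles` (other interpolation rules give slightly different values), and `SD` is taken as the sample standard deviation. The last function evaluates the same kurtosis ratio on the theoretical normal curve, which gives about 0.263.

```python
import statistics
from statistics import NormalDist


def pearson_skewness(data):
    """Sk = 3 * (mean - median) / SD, using the sample standard deviation."""
    return 3 * (statistics.mean(data) - statistics.median(data)) / statistics.stdev(data)


def percentile_kurtosis(data):
    """Ku = Q / (P90 - P10), where Q = (Q3 - Q1) / 2."""
    q1, _, q3 = statistics.quantiles(data, n=4)    # quartiles
    deciles = statistics.quantiles(data, n=10)     # P10 ... P90
    p10, p90 = deciles[0], deciles[-1]
    return ((q3 - q1) / 2) / (p90 - p10)


def normal_reference_kurtosis():
    """The same ratio evaluated on the standard normal distribution."""
    z = NormalDist()
    q = (z.inv_cdf(0.75) - z.inv_cdf(0.25)) / 2
    return q / (z.inv_cdf(0.90) - z.inv_cdf(0.10))


# Hypothetical, perfectly symmetric (but not normal) data: the integers 1..101.
scores = list(range(1, 102))
sk = pearson_skewness(scores)
ku = percentile_kurtosis(scores)
ku_normal = normal_reference_kurtosis()
print(f"Sk = {sk:.3f}, Ku = {ku:.4f}, Ku for a normal curve = {ku_normal:.3f}")
```

For this evenly spread sample the skewness is zero but the kurtosis is well above the normal reference value, which is why both measures are checked.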
What is interval data?
Interval data provide numbers that reflect differences among items. With
interval scales the measurement units are equal, but there is no true zero value.
Examples: scores on intelligence tests and time as reckoned from the calendar.
What is ratio data?
Ratio data are the highest type of scale. The basic difference between the
interval and ratio scales is that the interval scale has no true zero value while
the ratio scale has an absolute zero value. Common ratio scales are measures of
length, width, weight, capacity, loudness and others.
CHAPTER II HYPOTHESIS TESTING
Definitions:
*A hypothesis is a tentative answer to a problem or question under study. It may be
based on reading or observation.
*A statistical hypothesis is an assertion or conjecture concerning one or more
populations.
*The null hypothesis (Ho) is the hypothesis that is actually tested; it usually states
that there is no difference or no effect.
*The alternative hypothesis (Ha) is the statement that the experimenter believes to be
true or wishes to prove.
*A type I error is rejecting the null hypothesis when it is true. Alpha (α) is the
probability of a type I error.
*A type II error is failing to reject the null hypothesis when it is false. Its
probability is denoted by beta (β).
*The level of significance is the maximum probability of a type I error that the
researcher is willing to commit.
*A one-tailed test of hypothesis is a test where the alternative hypothesis specifies a
one-directional difference for the parameter of interest.
*A two-tailed test of hypothesis is a test where the alternative hypothesis does not
specify a one-directional difference for the parameter of interest.
*A test statistic is a statistic whose value is calculated from sample measurements and
on which the statistical decision will be based.
What is the t-test for independent samples?
The t-test is a test of difference between two independent groups. The means
$\bar{x}_1$ and $\bar{x}_2$ are compared.
When do we use the t-test for independent samples?
When we compare the means of two independent groups.
When the data are normally distributed, $Sk = 0$ and $Ku \approx .263$.
When the data are expressed as interval or ratio data.
When the samples are fewer than 30.
Why do we use the t-test for independent samples?
Because it is a more powerful test compared with other tests of difference
between two independent groups.
CHAPTER III T-TEST
The t-test is used to compare two means: the means of two independent samples or
groups, or the means of correlated samples before and after a treatment.
It is used when there are fewer than 30 observations, although some researchers use
the t-test even if there are more than 30.
It can be used to determine if two sets of data are significantly different from each
other, and is most commonly applied when the test statistic would follow
a normal distribution if the value of a scaling term in the test statistic were
known.
$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\left[\frac{SS_1 + SS_2}{n_1 + n_2 - 2}\right]\left[\frac{1}{n_1} + \frac{1}{n_2}\right]}}$$
Where:
$t$ = the t-test statistic
$\bar{x}_1$ = the mean of group 1
$\bar{x}_2$ = the mean of group 2
$SS_1$ = the sum of squares of group 1, $SS = \sum x^2 - (\sum x)^2 / n$
$SS_2$ = the sum of squares of group 2
$n_1$ = the number of observations in group 1
$n_2$ = the number of observations in group 2
Examples:
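1. The following are the scores of 10 male and 10 female BSE students in spelling.
Test the null hypothesis that there is no significant difference between the
performance of male and female BSE students in the said test. Use the t-test at
the 0.05 level of significance.
MALE ($X_1$)    FEMALE ($X_2$)
15              15
18              10
15              7
19              4
10              18
6               4
9               11
16              5
14              13
20              17
From the data, $\bar{x}_1 = 14.2$, $\bar{x}_2 = 10.4$, $SS_1 = 187.6$ and $SS_2 = 252.4$, so that
$$t = \frac{14.2 - 10.4}{\sqrt{\left[\frac{187.6 + 252.4}{10 + 10 - 2}\right]\left[\frac{1}{10} + \frac{1}{10}\right]}} = \frac{3.8}{\sqrt{[24.44][0.2]}}$$
$$t = 1.72$$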
Solving by the stepwise method
I. Problem: Is there a significant difference between the performance of
male and female BSE students in spelling?
II. Hypotheses:
$H_0$: There is no significant difference between the performance of male
and female BSE students in spelling.
$H_1$: There is a significant difference between the performance of male
and female BSE students in spelling.
III. Level of Significance:
$\alpha = .05$
$df = n_1 + n_2 - 2 = 10 + 10 - 2 = 18$
t tabular value at .05 = 1.734
IV. Statistics: t-test for two independent samples
V. Decision rule: If the t-computed value is greater than or beyond the t-tabular/
critical value, reject $H_0$.
VI. Conclusion: Since the t-computed value of 1.72 is less than the t-tabular
value of 1.734 at the 0.05 level of significance with 18 degrees of freedom, the
null hypothesis is not rejected. (The value 1.734 is the one-tailed critical
value; the two-tailed value for 18 degrees of freedom is 2.101, and the decision
is the same.) This means that there is no significant difference between the
performance of male and female BSE students in spelling.
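The spelling example can be checked with a short Python sketch of the pooled t formula. The function names are hypothetical, and the critical value is simply the tabular figure quoted in the text, since the standard library has no t distribution.

```python
from math import sqrt


def sum_of_squares(values):
    """SS = sum(x^2) - (sum(x))^2 / n, the sum of squared deviations from the mean."""
    n = len(values)
    return sum(v * v for v in values) - sum(values) ** 2 / n


def independent_t(group1, group2):
    """Return the pooled-variance t statistic and its degrees of freedom."""
    n1, n2 = len(group1), len(group2)
    mean1, mean2 = sum(group1) / n1, sum(group2) / n2
    df = n1 + n2 - 2
    pooled = (sum_of_squares(group1) + sum_of_squares(group2)) / df
    t = (mean1 - mean2) / sqrt(pooled * (1 / n1 + 1 / n2))
    return t, df


male = [15, 18, 15, 19, 10, 6, 9, 16, 14, 20]
female = [15, 10, 7, 4, 18, 4, 11, 5, 13, 17]

t, df = independent_t(male, female)
T_TABULAR = 1.734  # tabular value quoted in the text for df = 18
reject_h0 = abs(t) > T_TABULAR
print(f"t = {t:.2f}, df = {df}, reject H0: {reject_h0}")
```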
Example # 2.
To find out whether a new serum would arrest leukemia, 16 patients who had all
reached an advanced stage of the disease were selected. Eight patients received
the treatment and eight did not. Survival time was measured from the time the
experiment was conducted.
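$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\left[\frac{SS_1 + SS_2}{n_1 + n_2 - 2}\right]\left[\frac{1}{n_1} + \frac{1}{n_2}\right]}}$$
$$t = \frac{2.34 - 4.68}{\sqrt{\left[\frac{4.12 + 3.87}{8 + 8 - 2}\right]\left[\frac{1}{8} + \frac{1}{8}\right]}}$$
$$t = \frac{-2.34}{\sqrt{[.57][.25]}}$$
$$t = \frac{-2.34}{.38}$$
$$t = -6.16$$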
SOLVING BY THE STEPWISE METHOD:
I. Problem: Will the new serum arrest leukemia for the 8 patients who had all
reached an advanced stage?
II. Hypotheses:
$H_0$: The new serum will not arrest the leukemia of the 8 patients who
had all reached an advanced stage of the disease.
$H_1$: The new serum will arrest the leukemia of the 8 patients who had all
reached an advanced stage of the disease.
III. Level of Significance:
$\alpha = .05$
$df = n_1 + n_2 - 2 = 8 + 8 - 2 = 14$
t tabular value at .05 = 1.761
IV. Statistics: t-test for two independent samples
V. Decision rule: If the t-computed value is greater than or beyond the t-tabular/
critical value, reject $H_0$.
VI. Conclusion: Since the t-computed value of -6.16 lies beyond the t-tabular
value of 1.761 in absolute value (6.16 > 1.761) at the 0.05 level of significance
with 14 degrees of freedom, the null hypothesis is rejected. The mean survival
times of the two groups (2.34 and 4.68) differ significantly.
CHAPTER IV T-TEST FOR CORRELATED SAMPLES
The t-test for correlated samples is used when comparing the means of a pretest
and a posttest. It is also used to compare the means before and after a treatment.
The formula is:
$$t = \frac{\bar{D}}{\sqrt{\frac{\sum D^2 - \frac{(\sum D)^2}{n}}{n(n-1)}}}$$
Example:
The following are the weights in pounds of 7 individuals before and after 3 months of
taking a fruit diet. Test at the 0.05 level of significance.
Weight before   243  180  202  165  182  153  170
Weight after    231  179  200  162  179  151  164
Solution:
X1 (weight before)   X2 (weight after)   D     D²
243                  231                 12    144
180                  179                 1     1
202                  200                 2     4
165                  162                 3     9
182                  179                 3     9
153                  151                 2     4
170                  164                 6     36
                                         ΣD = 29   ΣD² = 207
$$\bar{D} = \frac{29}{7} = 4.1429$$
$$t = \frac{\bar{D}}{\sqrt{\frac{\sum D^2 - \frac{(\sum D)^2}{n}}{n(n-1)}}}$$
Where:
$\bar{D}$ = the mean difference between the pretest and posttest
$\sum D$ = the summation of the differences between the pretest and the posttest
$\sum D^2$ = the sum of the squared differences between the pretest and the posttest
$n$ = the sample size
$$t = \frac{4.1429}{\sqrt{\frac{207 - \frac{(29)^2}{7}}{7(7-1)}}}$$
$$t = \frac{4.1429}{\sqrt{\frac{207 - 120.1429}{42}}}$$
$$t = \frac{4.1429}{\sqrt{2.0680}}$$
$$t = \frac{4.1429}{1.4381}$$
$$t = 2.8808$$
Solving by Stepwise Method
I. Problem: Is there a significant difference between the weight before and the
weight after taking the fruit diet?
II. Hypotheses:
Ho: There is no significant difference between the weight before and the
weight after taking the fruit diet of the 7 individuals.
Ha: The weight after 3 months of taking the fruit diet is less than the weight
before.
III. Level of Significance:
α = 0.05, t0.05 = 1.943
df = n-1 = 7-1 = 6
IV. Statistics: t-test for correlated samples
V. Decision Rule: If the t-computed value is greater than or beyond the critical
value, reject Ho.
VI. Conclusion: Since the t-computed value of 2.8808 is greater than the t-critical
value of 1.943 at the 0.05 level of significance with 6 degrees of freedom, the
null hypothesis is rejected in favor of the research hypothesis. This means that
the weight after 3 months of taking the fruit diet is less than the weight before.
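The fruit-diet example can be reproduced with the following Python sketch of the correlated-samples formula. The function name is hypothetical and the critical value is the tabular figure quoted in the text. Carrying the mean difference 29/7 at full precision gives a t of about 2.88.

```python
from math import sqrt


def paired_t(before, after):
    """t = D_bar / sqrt((sum(D^2) - (sum(D))^2 / n) / (n * (n - 1)))."""
    diffs = [b - a for b, a in zip(before, after)]
    n = len(diffs)
    sum_d = sum(diffs)
    sum_d2 = sum(d * d for d in diffs)
    mean_d = sum_d / n
    t = mean_d / sqrt((sum_d2 - sum_d ** 2 / n) / (n * (n - 1)))
    return t, n - 1  # statistic and degrees of freedom


weight_before = [243, 180, 202, 165, 182, 153, 170]
weight_after = [231, 179, 200, 162, 179, 151, 164]

t, df = paired_t(weight_before, weight_after)
T_CRITICAL = 1.943  # one-tailed tabular value quoted in the text for df = 6
reject_h0 = t > T_CRITICAL
print(f"t = {t:.2f}, df = {df}, reject H0: {reject_h0}")
```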
CHAPTER V Z-TEST
The z-test is another parametric test, based on the normal distribution, and is used
for problems involving large samples (n ≥ 30).
The tabular values of z at the .01 and .05 levels of significance:
Test           .01        .05
One-tailed     ± 2.33     ± 1.645
Two-tailed     ± 2.575    ± 1.96
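The tabular values above can be recovered from the standard normal distribution. The sketch below uses Python's standard-library `statistics.NormalDist`; the helper name `z_critical` is hypothetical. Small differences in the last digit (2.576 versus the table's 2.575) come from rounding conventions.

```python
from statistics import NormalDist


def z_critical(alpha, tails):
    """Critical z for a one-tailed (tails=1) or two-tailed (tails=2) test."""
    if tails not in (1, 2):
        raise ValueError("tails must be 1 or 2")
    return NormalDist().inv_cdf(1 - alpha / tails)


# Rebuild the table: keys are (number of tails, level of significance).
table = {
    (tails, alpha): round(z_critical(alpha, tails), 3)
    for tails in (1, 2)
    for alpha in (0.01, 0.05)
}
for (tails, alpha), z in table.items():
    print(f"{tails}-tailed, alpha = {alpha}: z = {z}")
```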
The One-Sample Mean Test
The one-sample mean test is used when the sample mean is compared with the
perceived population mean.
$$z = \frac{(\bar{x} - \mu)\sqrt{n}}{\sigma}$$
Where:
$\bar{x}$ = sample mean
$\mu$ = hypothesized value of the population mean
$\sigma$ = population standard deviation
$n$ = sample size
Example:
A school principal in a science school claimed that the reading comprehension test
scores of grade five pupils should have an average of 73.3 with a standard deviation
of 7.8. Fifty randomly selected grade five pupils have an average of 76.7. Test the
claim using the z-test at the 0.05 level of significance.
STEPWISE METHOD
I. Problem: Is the claim true that the average of reading comprehension test of
grade five pupils is 73.3?
II. Hypotheses:
Ho: The average of reading comprehension test of grade five pupils is 73.3.
Ha: The average of reading comprehension test of grade five pupils is not
73.3.
III. Level of significance:
α = 0.05, z = ± 1.96
IV. Statistics: z-test for one sample mean (two-tailed, since Ha states only that the
average is not 73.3)
Computation:
$$z = \frac{(\bar{x} - \mu)\sqrt{n}}{\sigma}$$
$$z = \frac{(76.7 - 73.3)\sqrt{50}}{7.8}$$
$$z = \frac{(3.4)(7.0711)}{7.8}$$
$$z = \frac{24.0417}{7.8}$$
$$z = 3.0823$$
V. Decision rule: If the z-computed value is greater than or beyond the z-tabular
value, reject Ho.
VI. Conclusion: Since the z-computed value of 3.0823 is greater than the tabular value
of 1.96 at the .05 level of significance, the null hypothesis is rejected and the
research hypothesis is accepted, which means that the average reading comprehension
test score of grade five pupils is not 73.3.
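The reading comprehension example can be checked with a few lines of Python. This is a sketch with a hypothetical function name; the two-sided p-value from `statistics.NormalDist` is an addition for illustration and is not part of the stepwise method in the text.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_z(sample_mean, mu, sigma, n):
    """z = (x_bar - mu) * sqrt(n) / sigma."""
    return (sample_mean - mu) * sqrt(n) / sigma


z = one_sample_z(sample_mean=76.7, mu=73.3, sigma=7.8, n=50)

# Probability of a z at least this far from zero in either direction.
p_two_sided = 2 * (1 - NormalDist().cdf(abs(z)))
print(f"z = {z:.4f}, two-sided p = {p_two_sided:.4f}")
```

A computed z this large is beyond both the one-tailed (1.645) and the two-tailed (1.96) tabular values at the .05 level, so the decision does not depend on which is used.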
The Two-sample mean test
The two-sample mean test is used when comparing two separate samples drawn at
random from a normal population. To test whether the difference between the mean of
sample 1 and the mean of sample 2 is significant or can be attributed to chance, the
following formula is used:
$$z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$$
Where:
$\bar{x}_1$ = the mean of sample 1
$\bar{x}_2$ = the mean of sample 2
$s_1^2$ = the variance of sample 1
$s_2^2$ = the variance of sample 2
$n_1$ = size of sample 1
$n_2$ = size of sample 2
Example:
An admission test was administered to incoming freshmen in the colleges of engineering
and architecture, with 100 students randomly selected from each college. The mean
scores of the given samples were $\bar{x}_1 = 95$ and $\bar{x}_2 = 90$ and the variances
of the test scores were 50 and 45, respectively. Is there a significant difference
between the two groups? Use the .01 level of significance.
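STEPWISE METHOD
I. Problem: Is there a significant difference between the two groups?
II. Hypotheses:
Ho: $\mu_1 = \mu_2$ (the two population means are equal)
Ha: $\mu_1 \neq \mu_2$ (the two population means are not equal)
III. Level of Significance:
α = .01
z = ± 2.575
IV. Statistics: z-test for two sample means (two-tailed)
$$z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$$
$$z = \frac{95 - 90}{\sqrt{\frac{50}{100} + \frac{45}{100}}}$$
$$z = \frac{5}{\sqrt{0.50 + 0.45}}$$
$$z = \frac{5}{\sqrt{0.95}}$$
$$z = \frac{5}{0.9747}$$
$$z = 5.13$$
V. Decision Rule: If the z-computed value is greater than or beyond the tabular value,
reject Ho.
VI. Conclusion:
Since the z-computed value of 5.13 is beyond the z-tabular value of ± 2.575 at the .01
level of significance, the null hypothesis is rejected. There is a significant
difference between the mean admission test scores of the two groups.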
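The two-sample formula can be sketched in Python as below. The function name is hypothetical, and the sketch assumes, as the example states, that 50 and 45 are variances (not standard deviations), so they are divided by the sample sizes without being squared.

```python
from math import sqrt


def two_sample_z(mean1, mean2, var1, var2, n1, n2):
    """z = (x1_bar - x2_bar) / sqrt(s1^2 / n1 + s2^2 / n2)."""
    return (mean1 - mean2) / sqrt(var1 / n1 + var2 / n2)


z = two_sample_z(mean1=95, mean2=90, var1=50, var2=45, n1=100, n2=100)
Z_TABULAR = 2.575  # two-tailed tabular value at the .01 level, from the table above
reject_h0 = abs(z) > Z_TABULAR
print(f"z = {z:.2f}, reject H0: {reject_h0}")
```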
CHAPTER VI F-TEST OR ANALYSIS OF VARIANCE ( ANOVA )
The F-test is used in comparing the means of two or more independent groups.
It can be a one-way or a two-way ANOVA: one-way if one variable is involved,
and two-way if two variables are involved (the column and row variables).
Example:
A pharmacy is selling 4 brands of medicine for headache. The owner is interested if there
is a significant difference in the average sales of the medicine in one week. The following
data are recorded:
Brand A Brand B Brand C Brand D
10 3 5 7
5 7 4 8
8 14 10 12
3 7 6 2
6 12 3 7
9 6 9 10
13 8 7 5
Perform the ANOVA and test the hypothesis at .05 level of significance that the average
sales of 4 brands of medicine for headache are equal.
To solve this in an organized way, we use the Stepwise Method.
I. Problem: Is there a significant difference in the average sales of the 4 brands of
medicine for headache?
II. Hypotheses:
Ho: There is no significant difference in the average sales of 4 brands of medicine
for headache.
Ha: There is a significant difference in the average sales of 4 brands of medicine
for headache.
III. Level of Significance
α=.05
df between = k-1 = 4-1 = 3
df within = (N-1)-(k-1) = (28-1)-(4-1) = 27-3 = 24
where: k = number of classes (the brands)
N = total number of observations
IV. Statistics
F-test: one-way analysis of variance computation.
      A                B                C                D
X1      X1²      X2      X2²      X3      X3²      X4      X4²
10      100      3       9        5       25       7       49
5       25       7       49       4       16       8       64
8       64       14      196      10      100      12      144
3       9        7       49       6       36       2       4
6       36       12      144      3       9        7       49
9       81       6       36       9       81       10      100
13      169      8       64       7       49       5       25
ΣX1=54  ΣX1²=484 ΣX2=57  ΣX2²=547 ΣX3=44  ΣX3²=316 ΣX4=51  ΣX4²=435
N1=7             N2=7             N3=7             N4=7
X̄1=7.71          X̄2=8.14          X̄3=6.29          X̄4=7.29
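Square every observation and add the squares within each brand. Then, to find the mean
of every brand, divide ΣX by N.
Next, find the correction factor (CF), total sum of squares (TSS), between-groups sum
of squares (BSS), within-groups sum of squares (WSS), mean square between (MSB), mean
square within (MSW), and the F-computed value by using the following formulas.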
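$$CF = \frac{(\sum X_1 + \sum X_2 + \sum X_3 + \sum X_4)^2}{N_1 + N_2 + N_3 + N_4} = \frac{(54 + 57 + 44 + 51)^2}{7 + 7 + 7 + 7} = \frac{(206)^2}{28} = \frac{42436}{28} = 1515.57$$
$$TSS = \sum X_1^2 + \sum X_2^2 + \sum X_3^2 + \sum X_4^2 - CF$$
$$= 484 + 547 + 316 + 435 - 1515.57$$
$$= 1782 - 1515.57$$
$$= 266.43$$
$$BSS = \frac{(\sum X_1)^2}{N_1} + \frac{(\sum X_2)^2}{N_2} + \frac{(\sum X_3)^2}{N_3} + \frac{(\sum X_4)^2}{N_4} - CF$$
$$= \frac{(54)^2}{7} + \frac{(57)^2}{7} + \frac{(44)^2}{7} + \frac{(51)^2}{7} - 1515.57$$
$$= 416.57 + 464.14 + 276.57 + 371.57 - 1515.57$$
$$= 1528.85 - 1515.57$$
$$= 13.28$$
$$WSS = TSS - BSS = 266.43 - 13.28 = 253.15$$
$$MSB = \frac{BSS}{k-1} = \frac{13.28}{4-1} = 4.43$$
$$MSW = \frac{WSS}{N-k} = \frac{253.15}{28-4} = 10.55$$
$$F = \frac{MSB}{MSW} = \frac{4.43}{10.55} = 0.42$$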
To make the results easier to read, we can organize them in a table.
ANOVA TABLE
Sources of Variation          Degrees of      Sum of     Mean       F-Value     F-Value
                              Freedom (df)    Squares    Squares    Computed    Tabular
Between Groups (k-1)          3               13.28      4.43       0.42        3.01
Within Groups (N-1)-(k-1)     24              253.15     10.55
Total (N-1)                   27              266.43
V. Decision Rule: If the F-computed value is greater than the F-tabular value, reject Ho.
VI. Conclusion: Since the F-computed value of 0.42 is less than the F-tabular value of
3.01 at the .05 level of significance with 3 and 24 degrees of freedom, the null
hypothesis is not rejected. It means that there is no significant difference in the
average sales of the 4 brands of medicine for headache.
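The one-way ANOVA computation above can be sketched in Python as follows. The function name and the dictionary layout are hypothetical; the arithmetic follows the CF, TSS, BSS and WSS steps of the text, and the F-tabular value is the one quoted there.

```python
def one_way_anova(groups):
    """One-way ANOVA via the correction-factor method used in the text."""
    all_values = [x for group in groups for x in group]
    n_total, k = len(all_values), len(groups)
    cf = sum(all_values) ** 2 / n_total                      # correction factor
    tss = sum(x * x for x in all_values) - cf                # total sum of squares
    bss = sum(sum(g) ** 2 / len(g) for g in groups) - cf     # between groups
    wss = tss - bss                                          # within groups
    msb = bss / (k - 1)
    msw = wss / (n_total - k)
    return {
        "CF": cf, "TSS": tss, "BSS": bss, "WSS": wss,
        "MSB": msb, "MSW": msw, "F": msb / msw,
        "df": (k - 1, n_total - k),
    }


sales = {
    "A": [10, 5, 8, 3, 6, 9, 13],
    "B": [3, 7, 14, 7, 12, 6, 8],
    "C": [5, 4, 10, 6, 3, 9, 7],
    "D": [7, 8, 12, 2, 7, 10, 5],
}

result = one_way_anova(list(sales.values()))
F_TABULAR = 3.01  # tabular value quoted in the text for 3 and 24 df at .05
reject_h0 = result["F"] > F_TABULAR
print(f"F = {result['F']:.2f}, df = {result['df']}, reject H0: {reject_h0}")
```

Because the sketch carries full precision, intermediate values can differ from the hand computation in the second decimal place (for example BSS comes out as 13.29 rather than 13.28).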
CHAPTER VII SCHEFFÉ'S TEST
The F-test tells whether there is a significant difference among the means of
two or more independent groups. To find where the difference lies, we can use
Scheffé's test.
Formula:
$$F' = \frac{(\bar{X}_1 - \bar{X}_2)^2}{\frac{S_W^2 (n_1 + n_2)}{n_1 n_2}}$$
Where: $F'$ = Scheffé's test
$\bar{X}_1$ = mean of group 1
$\bar{X}_2$ = mean of group 2
$n_1$ = number of samples in group 1
$n_2$ = number of samples in group 2
$S_W^2$ = within mean squares
Example: (Refer to the 4 brands of medicine and the ANOVA table, where $S_W^2 = MSW = 10.55$.)
Compare the different brands. Since $n_1 = n_2 = 7$ for every pair, the denominator is the same in each comparison:
$$\frac{S_W^2 (n_1 + n_2)}{n_1 n_2} = \frac{10.55(7 + 7)}{(7)(7)} = 3.01$$
A vs. B: $F' = \frac{(7.71 - 8.14)^2}{3.01} = \frac{0.1849}{3.01} = 0.06$
A vs. C: $F' = \frac{(7.71 - 6.29)^2}{3.01} = \frac{2.0164}{3.01} = 0.67$
A vs. D: $F' = \frac{(7.71 - 7.29)^2}{3.01} = \frac{0.1764}{3.01} = 0.06$
B vs. C: $F' = \frac{(8.14 - 6.29)^2}{3.01} = \frac{3.4225}{3.01} = 1.14$
B vs. D: $F' = \frac{(8.14 - 7.29)^2}{3.01} = \frac{0.7225}{3.01} = 0.24$
C vs. D: $F' = \frac{(6.29 - 7.29)^2}{3.01} = \frac{1}{3.01} = 0.33$
Comparison of the Average Sales of the 4 Brands of Medicine for Headache
Between Brand    F'      F.05 × (k-1) = (3.01)(3)    Interpretation
A vs. B          0.06    9.03                        not significant
A vs. C          0.67    9.03                        not significant
A vs. D          0.06    9.03                        not significant
B vs. C          1.14    9.03                        not significant
B vs. D          0.24    9.03                        not significant
C vs. D          0.33    9.03                        not significant
The above table shows that there is no significant difference between the different brands.
All of the brands of medicine have close average sales.
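The pairwise comparisons can be generated with a short Python sketch of the Scheffé formula. The function name is hypothetical. The sketch assumes that the within mean square is the MSW of 10.55 from the ANOVA table for the four brands, and it uses the brand means as rounded in the text.

```python
from itertools import combinations


def scheffe_f(mean1, mean2, n1, n2, msw):
    """F' = (X1_bar - X2_bar)^2 / (S_W^2 * (n1 + n2) / (n1 * n2))."""
    return (mean1 - mean2) ** 2 / (msw * (n1 + n2) / (n1 * n2))


means = {"A": 7.71, "B": 8.14, "C": 6.29, "D": 7.29}  # brand means from the text
N_PER_BRAND = 7
MSW = 10.55                # within mean square from the ANOVA table (assumption)
CRITICAL = 3.01 * (4 - 1)  # F tabular at .05 times (k - 1)

comparisons = {
    (a, b): scheffe_f(means[a], means[b], N_PER_BRAND, N_PER_BRAND, MSW)
    for a, b in combinations(means, 2)
}
for (a, b), f_prime in comparisons.items():
    verdict = "significant" if f_prime > CRITICAL else "not significant"
    print(f"{a} vs. {b}: F' = {f_prime:.2f} ({verdict})")
```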
CHAPTER VIII F-TEST (TWO-WAY ANOVA WITH INTERACTION EFFECT)
Involves row and column variables.
Example:
Fifty-four students were randomly assigned to one of three instructors and to one
of three methods of teaching. Achievement was measured on a test administered at
the end of the term. Use two-way ANOVA with interaction effect at the 0.05 level of
significance to test the following hypotheses:
1. Ho: There is no significant difference in the performance of the three groups of
students under three instructors.
Ha: There is a significant difference in the performance of the three groups of students
under three instructors.
2. Ho: There is no significant difference in the performance of the three groups of
students under three different methods of teaching.
Ha: There is a significant difference in the performance of the three groups of students
under three different methods of teaching.
3. Ho: Interaction effects are not present.
Ha: interaction effects are present.
Two-factor ANOVA with interaction
                          Teacher Factor
                          A      B      C
Method of Teaching 1      45     39     38
                          38     44     31
                          40     47     45
                          36     36     43
                          25     32     43
                          41     30     40
Method of Teaching 2      49     50     48
                          38     35     32
                          30     39     47
                          32     46     49
                          41     44     34
                          36     37     38
Method of Teaching 3      46     29     45
                          43     40     32
                          48     47     36
                          50     49     48
                          32     48     47
                          35     37     47
We can now use the Stepwise Method.
I. Problem: 1. Is there a significant difference in the performance of students under three
different instructors?
2. Is there a significant difference in the performance of students under the
three different methods of teaching?
3. Is there an interaction effect between teachers and method of teaching
factors?
II. Hypotheses:
1. Ho: There is no significant difference in the performance of the three
groups of students under three instructors.
Ha: There is a significant difference in the performance of the three groups
of students under three instructors.
2. Ho: There is no significant difference in the performance of the three
groups of students under three different methods of teaching.
Ha: There is a significant difference in the performance of the three groups
of students under three different methods of teaching.
3. Ho: Interaction effects are not present.
Ha: interaction effects are present.
III. Level of Significance
α=.05
df total=N-1
df within=k(n-1)
df column=c-1
df row=r-1
df c-r= (c-1)(r-1)
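These are the formulas for the degrees of freedom, where N is the total number of
observations, k the number of cells, n the number of observations per cell, c the
number of columns and r the number of rows. The values are substituted later.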
IV. Statistics:
F-Test Two-Way ANOVA with interaction effect
Two-factor ANOVA with interaction
                          Teacher Factor
                          A      B      C
Method of Teaching 1      45     39     38
                          38     44     31
                          40     47     45
                          36     36     43
                          25     32     43
                          41     30     40
Total                     225    228    240    Σ=693
Method of Teaching 2      49     50     48
                          38     35     32
                          30     39     47
                          32     46     49
                          41     44     34
                          36     37     38
Total                     226    251    248    Σ=725
Method of Teaching 3      46     29     45
                          43     40     32
                          48     47     36
                          50     49     48
                          32     48     47
                          35     37     47
Total                     254    250    255    Σ=759
Total                     705    729    743    Σ=2177
First, add the scores for Method of Teaching 1 under teachers A, B and C to get the
cell totals 225, 228 and 240. Their sum is the row total: 225+228+240=693.
Do the same for Method of Teaching 2 (226+251+248=725) and for Method of Teaching 3
(254+250+255=759).
Then add the cell totals in each column: A is 705, B is 729 and C is 743.
Last, add the three row totals to get the grand total: 693+725+759=2177.
$$CF = \frac{(GT)^2}{N} = \frac{(2177)^2}{54} = \frac{4739329}{54} = 87765.35$$
GT is the grand total and N is the total number of observations.
$$SST = X_1^2 + \dots + X_{54}^2 - CF$$
$$= 90045 - 87765.35$$
$$= 2279.65$$
To find SST, square all of the scores and add them up; the sum is 90045. Then subtract CF.
$$SSW = 90045 - \frac{(225)^2 + (226)^2 + (254)^2 + (228)^2 + (251)^2 + (250)^2 + (240)^2 + (248)^2 + (255)^2}{6}$$
$$= 90045 - (8437.5 + 8512.67 + 10752.67 + 8664 + 10500.17 + 10416.67 + 9600 + 10250.67 + 10837.5)$$
$$= 90045 - 87971.85$$
$$= 2073.15$$
To find SSW, take the sum of the squared scores (90045) and subtract from it the sum of
the squared cell totals, each divided by the 6 scores in a cell.
$$SS_c = \frac{(705)^2 + (729)^2 + (743)^2}{18} - CF$$
$$= \frac{1580515}{18} - 87765.35$$
$$= 87806.39 - 87765.35$$
$$= 41.04$$
To find SSc, square the total of every column, add the squares, and divide by 18, the
number of students in one column.
$$SS_r = \frac{(693)^2 + (725)^2 + (759)^2}{18} - CF$$
$$= \frac{1581955}{18} - 87765.35$$
$$= 87886.39 - 87765.35$$
$$= 121.04$$
To find SSr, square the total of every row, add the squares, and divide by 18, the
number of students in one row (one method of teaching).
$$SS_{c \cdot r} = SST - SSW - SS_c - SS_r$$
$$= 2279.65 - 2073.15 - 41.04 - 121.04$$
$$= 44.42$$
Now, we can find the degrees of freedom,
df total=N-1 =54-1=53
df within=k(n-1) =9(6-1)=45
df column=c-1 =3-1=2
df row=r-1 =3-1=2
df c-r= (c-1)(r-1) =(3-1)(3-1)=(2)(2)=4
To organize the results, we put them in a table.
ANOVA TABLE
Sources of Variation    SS         df    MS       F-Value     F-Value    Interpretation
                                                  Computed    Tabular
Between Columns         41.04      2     20.52    0.45        3.21       Not significant
Between Rows            121.04     2     60.52    1.31        3.21       Not significant
Interaction             44.42      4     11.11    0.24        2.59       Not significant
Within                  2073.15    45    46.07
Total                   2279.65    53
To find MS or mean square, divide SS by df.
F-Value Computed:
$$F_{columns} = \frac{MS_c}{MSW} = \frac{20.52}{46.07} = 0.45$$
$$F_{rows} = \frac{MS_r}{MSW} = \frac{60.52}{46.07} = 1.31$$
$$F_{interaction} = \frac{MS_I}{MSW} = \frac{11.11}{46.07} = 0.24$$
F-Value Tabular at .05 (read from the F table):
Columns, df = 2 and 45: 3.21
Rows, df = 2 and 45: 3.21
Interaction, df = 4 and 45: 2.59
V. Decision Rule: If the computed F value is greater than the F critical/tabular value, reject Ho.
VI. Conclusion: The computed F-value (column) of 0.45 is less than the F-tabular
value of 3.21 at the .05 level of significance with 2 and 45 degrees of freedom. The
null hypothesis is not rejected, which means that there is no significant difference
in the performance of the three groups of students under the three instructors.
The F-value (row) of 1.31 is also less than the F-tabular value of 3.21 at the .05
level of significance with 2 and 45 degrees of freedom. The null hypothesis of no
significant difference in the performance of the students under the three different
methods of teaching is not rejected.
Likewise, the F-value (interaction) of 0.24 is less than the F-tabular value of 2.59
at the .05 level of significance with 4 and 45 degrees of freedom. Thus, the null
hypothesis is not rejected, which means that no interaction effect between teacher
and method of teaching is present.
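The sums of squares and F-values of this example can be reproduced with the Python sketch below. The function name and the nested-list layout `cells[method][teacher]` are hypothetical, and the sketch assumes a balanced design (the same number of scores in every cell), as in the example. Because it carries full precision, SSW and the interaction sum of squares differ from the hand computation in the second decimal place, but the F-values agree to two decimals.

```python
def two_way_anova(cells):
    """Balanced two-way ANOVA with interaction; cells[row][column] is a list of scores."""
    r, c, n = len(cells), len(cells[0]), len(cells[0][0])
    scores = [x for row in cells for cell in row for x in cell]
    sum_sq = sum(x * x for x in scores)
    cf = sum(scores) ** 2 / len(scores)                      # correction factor
    sst = sum_sq - cf
    ssw = sum_sq - sum(sum(cell) ** 2 / n for row in cells for cell in row)
    row_totals = [sum(sum(cell) for cell in row) for row in cells]
    col_totals = [sum(sum(cells[i][j]) for i in range(r)) for j in range(c)]
    ssr = sum(t * t for t in row_totals) / (c * n) - cf
    ssc = sum(t * t for t in col_totals) / (r * n) - cf
    ssi = sst - ssw - ssc - ssr                              # interaction
    msw = ssw / (r * c * (n - 1))
    return {
        "SST": sst, "SSW": ssw, "SSc": ssc, "SSr": ssr, "SSI": ssi,
        "F_columns": (ssc / (c - 1)) / msw,
        "F_rows": (ssr / (r - 1)) / msw,
        "F_interaction": (ssi / ((r - 1) * (c - 1))) / msw,
    }


# Rows are Methods of Teaching 1-3; columns are teachers A, B, C.
cells = [
    [[45, 38, 40, 36, 25, 41], [39, 44, 47, 36, 32, 30], [38, 31, 45, 43, 43, 40]],
    [[49, 38, 30, 32, 41, 36], [50, 35, 39, 46, 44, 37], [48, 32, 47, 49, 34, 38]],
    [[46, 43, 48, 50, 32, 35], [29, 40, 47, 49, 48, 37], [45, 32, 36, 48, 47, 47]],
]

anova = two_way_anova(cells)
for name, value in anova.items():
    print(f"{name} = {value:.2f}")
```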
CHAPTER IX PEARSON PRODUCT MOMENT COEFFICIENT OF CORRELATION
The Pearson r is an index of relationship between two variables. The independent
variable is represented as x while the dependent variable is represented as y. If x
and y are independent of each other, r is equal to zero.
*If r is positive, the line in the graph goes upward: as x increases, y also
increases, and x and y are positively correlated.
[Graph: x and y coordinates, upward-sloping line]
*If r is negative, the line in the graph goes downward: as x increases, y
decreases, and x and y are negatively correlated.
[Graph: x and y coordinates, downward-sloping line]
*If r is zero, the line in the graph goes neither upward nor downward: there is
no correlation between the x and y variables.
[Graph: x and y coordinates, no trend]
Why do we use r?
- To analyze whether a relationship exists between two variables.
- It is a more powerful test of relationship compared with other tests.
When do we use r?
- To determine the index of relationship between two variables, the independent and
the dependent.
- When the x variable influences the y variable, that is, when y depends on x, if
there is a relationship between the two.
Steps in solving for the value of r
1. Determine the number of observations n.
2. Get the sums Σx and Σy; then square every x and y observation and get the
sums Σx² and Σy².
3. Multiply each x by its y, place the product in column xy, and get the sum Σxy.
4. Apply the formula below and compare the computed r with the r tabular
value at a certain level of significance with n-2 degrees of freedom. If the
computed r is greater than the r tabular value, reject the null hypothesis
and accept the research hypothesis, which means that there is a significant
relationship between the two variables.
$$r = \frac{n \sum xy - \sum x \sum y}{\sqrt{\left[n \sum x^2 - (\sum x)^2\right]\left[n \sum y^2 - (\sum y)^2\right]}}$$
Where:
$r$ = Pearson Product Moment Coefficient of Correlation
$n$ = sample size
$\sum xy$ = sum of the products of x and y
$\sum x \sum y$ = product of the sums Σx and Σy
$\sum x^2$ = sum of squares of x
$\sum y^2$ = sum of squares of y
Example 1
Below are the grades of the students in the midterm and final exams.
Let midterm exam = x
Final exam = y
x    65   70   75   80   85   90   95   70   85   80
y    70   75   80   85   90   95   95   80   70   85
Solving by stepwise method
I. Problem: Is there a significant relationship between the midterm exam and
final exam grades of the 10 students?
II. Hypotheses
Ho: There is no significant relationship between the midterm grades and the
final grades of 10 students in mathematics.
Ha: There is a significant relationship between the midterm grades and the final
grades of 10 students in mathematics.
III. Level of significance
α = .05
df = n-2 = 10-2 = 8
r tabular at .05 = .632
IV. Solution:
x       y       x²       y²       xy
65      70      4225     4900     4550
70      75      4900     5625     5250
75      80      5625     6400     6000
80      85      6400     7225     6800
85      90      7225     8100     7650
90      95      8100     9025     8550
95      95      9025     9025     9025
70      80      4900     6400     5600
85      70      7225     4900     5950
80      85      6400     7225     6800
Σx: 795   Σy: 825   Σx²: 64025   Σy²: 68825   Σxy: 66175
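Substitution
$$r = \frac{n \sum xy - \sum x \sum y}{\sqrt{\left[n \sum x^2 - (\sum x)^2\right]\left[n \sum y^2 - (\sum y)^2\right]}}$$
$$r = \frac{10(66175) - (795)(825)}{\sqrt{\left[10(64025) - (795)^2\right]\left[10(68825) - (825)^2\right]}}$$
$$r = \frac{661750 - 655875}{\sqrt{[640250 - 632025][688250 - 680625]}}$$
$$r = \frac{5875}{\sqrt{(8225)(7625)}} = \frac{5875}{\sqrt{62715625}}$$
$$r = \frac{5875}{7919.320}$$
$$r = 0.7419$$
V. Decision rule: If the computed r value is greater than the r tabular value, reject Ho.
VI. Conclusion: Since the computed r value of 0.7419 is greater than the tabular value
of .632 at the .05 level of significance with 8 degrees of freedom, the null
hypothesis is rejected. There is a significant relationship between the midterm
grades and the final grades of the 10 students in mathematics.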
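The raw-score formula for r can be sketched in Python as below, using the midterm and final grades of the example. The function name is hypothetical and the tabular value is the one quoted in the text. A correlation coefficient always lies between -1 and 1, which is a quick sanity check on any hand computation.

```python
from math import sqrt


def pearson_r(x, y):
    """r = (n*Sxy - Sx*Sy) / sqrt((n*Sxx - Sx^2) * (n*Syy - Sy^2))."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


midterm = [65, 70, 75, 80, 85, 90, 95, 70, 85, 80]
final = [70, 75, 80, 85, 90, 95, 95, 80, 70, 85]

r = pearson_r(midterm, final)
R_TABULAR = 0.632  # tabular value quoted in the text for df = 8 at .05
reject_h0 = abs(r) > R_TABULAR
print(f"r = {r:.4f}, reject H0: {reject_h0}")
```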
CHAPTER X SIMPLE LINEAR REGRESSION
An analysis used when there is a significant relationship between two
variables.
Used in predicting the value of y given the value of x.
FORMULA
y = a + bx
Wherein:
y = dependent variable
x = independent variable
a = y-intercept
b = slope
Example:
A study is conducted on the relationship between the number of absences and the
grades of 20 students in mathematics. Using r at the 0.05 level of significance, test
the hypothesis that there is no significant relationship between the absences and the
grades of the students in mathematics, using the following data:
Let x be the number of absences and y the grades in mathematics.
Number of Absences Grades in MATHEMATICS
(x) (y)
2 89
3 78
2 90
1 88
1 82
5 82
6 80
3 75
1 89
8 92
4 65
5 80
9 82
10 85
5 87
5 88
1 89
1 75
2 93
4 90
STEPWISE METHOD:
I. Problem: Is there a significant relationship between the number of absences and the
grades of 20 students in Math class?
II. Hypotheses
Ho: There is no significant relationship between the number of absences and the
grades of 20 students in Math class.
H1: There is a significant relationship between the number of absences and the
grades of 20 students in Math class.
III. Level of Significance
α = 0.05
df = n – 2
= 20 – 2
df = 18
r at 0.05 = ± 0.444
IV. Statistics
Using the Pearson Product Moment Coefficient of Correlation
x       y       x²      y²      xy
2       89      4       7921    178
3       78      9       6084    234
2       90      4       8100    180
1       88      1       7744    88
1       82      1       6724    82
5       82      25      6724    410
6       80      36      6400    480
3       75      9       5625    225
1       89      1       7921    89
8       92      64      8464    736
4       65      16      4225    260
5       80      25      6400    400
9       82      81      6724    738
10      85      100     7225    850
5       87      25      7569    435
5       88      25      7744    440
1       89      1       7921    89
1       75      1       5625    75
2       93      4       8649    186
4       90      16      8100    360
Σx = 78   Σy = 1679   Σx² = 448   Σy² = 141889   Σxy = 6535
n = 20
x̄ = 3.9   ȳ = 83.95
$$r = \frac{n \sum xy - \sum x \sum y}{\sqrt{\left[n \sum x^2 - (\sum x)^2\right]\left[n \sum y^2 - (\sum y)^2\right]}}$$
$$r = \frac{20(6535) - (78)(1679)}{\sqrt{\left[20(448) - (78)^2\right]\left[20(141889) - (1679)^2\right]}}$$
$$r = \frac{130700 - 130962}{\sqrt{[8960 - 6084][2837780 - 2819041]}}$$
$$r = \frac{-262}{\sqrt{(2876)(18739)}}$$
$$r = \frac{-262}{7341.21}$$
$$r = -0.036$$
V. Decision Rule: If the absolute value of the computed r is greater than or beyond the
critical value, reject H0.
VI. Conclusion: The computed r of -0.036 is not beyond the critical value of ± 0.444 at
the 0.05 level of significance with 18 degrees of freedom, so the null
hypothesis is not rejected. This means that there is no significant
relationship between the number of absences and the grades of the
students in Mathematics. The negative sign of r points to slightly lower
grades with more absences, but the relationship is too weak to be
distinguished from chance.
To illustrate the method, suppose we want to predict the grade (y) of a
student who incurred 8 absences (x). To get the value of y given the
value of x, simple linear regression analysis is used. Because the
relationship here is not significant, the prediction stays close to the
mean grade.
y = a + bx will be the regression equation
Solve for a and b
$$b = \frac{n \sum xy - \sum x \sum y}{n \sum x^2 - (\sum x)^2}$$
$$b = \frac{20(6535) - (78)(1679)}{20(448) - (78)^2}$$
$$b = \frac{-262}{8960 - 6084}$$
$$b = \frac{-262}{2876}$$
$$b = -0.0911$$
$$a = \bar{y} - b\bar{x}$$
a = 83.95 – (-0.0911)(3.9)
a = 83.95 + 0.3553
a = 84.31
y = a + bx
y = 84.31 + (-0.0911)(8)
y = 84.31 – 0.7288
y = 83.58 or 84 is the predicted grade
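The correlation and the regression line for the absences data can be recomputed with the Python sketch below. The function names are hypothetical. Carrying full precision, the sums in the table give r of about -0.036, a slope b of about -0.0911, an intercept a of about 84.31, and a predicted grade of about 83.58 for 8 absences.

```python
from math import sqrt

absences = [2, 3, 2, 1, 1, 5, 6, 3, 1, 8, 4, 5, 9, 10, 5, 5, 1, 1, 2, 4]
grades = [89, 78, 90, 88, 82, 82, 80, 75, 89, 92,
          65, 80, 82, 85, 87, 88, 89, 75, 93, 90]


def raw_sums(x, y):
    """Return n, Sx, Sy, Sxy, Sxx, Syy for the raw-score formulas."""
    return (len(x), sum(x), sum(y),
            sum(a * b for a, b in zip(x, y)),
            sum(a * a for a in x),
            sum(b * b for b in y))


def pearson_r(x, y):
    n, sx, sy, sxy, sxx, syy = raw_sums(x, y)
    return (n * sxy - sx * sy) / sqrt((n * sxx - sx ** 2) * (n * syy - sy ** 2))


def fit_line(x, y):
    """Least-squares intercept a and slope b of y = a + b*x."""
    n, sx, sy, sxy, sxx, _ = raw_sums(x, y)
    b = (n * sxy - sx * sy) / (n * sxx - sx ** 2)
    a = sy / n - b * sx / n          # a = y_bar - b * x_bar
    return a, b


r = pearson_r(absences, grades)
a, b = fit_line(absences, grades)
predicted = a + b * 8                # predicted grade for 8 absences
print(f"r = {r:.3f}, a = {a:.2f}, b = {b:.4f}, predicted grade = {predicted:.2f}")
```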
CHAPTER XI MULTIPLE REGRESSION ANALYSIS
Multiple regression analysis is used in prediction. The dependent variable is
predicted from several given independent variables.
Many mathematical formulas can be used to express the relationship among
variables. However, the most commonly used in statistics are linear equations. The
formula is:
$$y = b_0 + b_1 x_1 + b_2 x_2 + \dots + b_n x_n$$
Where:
$y$ = dependent variable to be predicted
$x_1, x_2, \dots, x_n$ = known independent variables that may influence y
$b_0, b_1, b_2, \dots, b_n$ = numerical constants which must be determined from
observed data
For instance, when there are two independent variables $x_1$ and $x_2$ and we want to
fit the equation
$$y = b_0 + b_1 x_1 + b_2 x_2$$
we must solve the following normal equations:
$$\sum y = n b_0 + b_1 \sum x_1 + b_2 \sum x_2$$
$$\sum x_1 y = b_0 \sum x_1 + b_1 \sum x_1^2 + b_2 \sum x_1 x_2$$
$$\sum x_2 y = b_0 \sum x_2 + b_1 \sum x_1 x_2 + b_2 \sum x_2^2$$
Example
The following are data on the ages and incomes of a random sample of 5 teachers
working in Cavite State University – Imus Campus and their academic achievements
while in college.
Income (in thousand pesos)    Age     Academic Achievement
Y                             X1      X2
80.5                          35      1.25
75.6                          32      2.00
85.6                          45      1.75
78.0                          25      1.75
67.8                          27      2.25
a.) Fit an equation of the form y = b0 + b1 x1 + b2 x2 to the given data.
b.) Use the equation obtained in (a) to estimate the average income of a 30-year-old
teacher with 1.25 academic achievement.
Computations:
y      x1   x2     x1²    x2²    x1y      x2y      x1x2
80.5   35   1.25   1225   1.56   2817.5   100.63   43.75
75.6   32   2.00   1024   4.00   2419.2   151.20   64.00
85.6   45   1.75   2025   3.06   3852.0   149.80   78.75
78.0   25   1.75   625    3.06   1950.0   136.50   43.75
67.8   27   2.25   729    5.06   1830.6   152.55   60.75
Σy = 387.5   Σx1 = 164   Σx2 = 9   Σx1² = 5628   Σx2² = 16.74   Σx1y = 12869.3   Σx2y = 690.68   Σx1x2 = 291
Solve for b0, b1 and b2 using the equations:
1.) Σy = nb0 + Σx1 b1 + Σx2 b2
387.5 = 5b0 + 164b1 + 9b2
2.) Σx1y = Σx1 b0 + Σx1² b1 + Σx1x2 b2
12869.3 = 164b0 + 5628b1 + 291b2
3.) Σx2y = Σx2 b0 + Σx1x2 b1 + Σx2² b2
690.68 = 9b0 + 291b1 + 16.74b2
Step 1. Eliminate b0 by using equations 1 and 2. Look at their numerical coefficients.
Divide 164 by 5 and the quotient is 32.8. If you multiply 32.8 by 5, the product is 164, so
we can eliminate b0 by subtraction.
Step 2. Multiply equation 1 by 32.8, making a new equation 4. Subtract equation 2 from
equation 4. The result is equation 5.
4.) 12710 = 164b0 + 5379.2b1 + 295.2b2
2.) 12869.3 = 164b0 + 5628b1 + 291b2
5.) -159.3 = 0 – 248.8b1 + 4.2b2
Step 3. Eliminate b0 using equations 1 and 3. Look at their numerical coefficients.
Divide 9 by 5 and the quotient is 1.8. If you multiply 1.8 by 5, the product is 9, so
you can again eliminate b0 by subtraction.
Step 4. Multiply equation 1 by 1.8, making a new equation 6. Subtract equation 3 from
equation 6. The result is equation 7.
6.) 697.5 = 9b0 + 295.2b1 + 16.2b2
3.) 690.68 = 9b0 + 291b1 + 16.74b2
7.) 6.82 = 0 + 4.2b1 – 0.54b2
Step 5. Equations 5 and 7 contain only b1 and b2. To eliminate b1, multiply
equation 5 by 4.2, the numerical coefficient of b1 in equation 7, making a new equation 8.
Likewise multiply equation 7 by 248.8, the size of the coefficient of b1 in equation 5,
making a new equation 9. The b1 terms now have opposite signs, so add the two equations.
8.) -669.06 = -1044.96b1 + 17.64b2
9.) 1696.816 = 1044.96b1 – 134.352b2
1027.756 = 0 – 116.712b2
-8.81 = b2
Step 6. Solve for b1 using either equation 5 or 7. Using equation 5:
5.) -159.3 = -248.8b1 + 4.2b2
-159.3 = -248.8b1 + 4.2(-8.81)
-159.3 = -248.8b1 – 37.002
37.002 – 159.3 = -248.8b1
-122.298 = -248.8b1
0.49 = b1
Step 7. Solve for b0 using either equation 1, 2, or 3. Using equation 1:
1.) 387.5 = 5b0 + 164b1 + 9b2
387.5 = 5b0 + 164(0.49) + 9(-8.81)
387.5 = 5b0 + 80.36 – 79.29
387.5 – 80.36 + 79.29 = 5b0
386.43 = 5b0
77.29 = b0
a.) The regression equation is
y = b0 + b1 x1 + b2 x2
y = 77.29 + 0.49x1 – 8.81x2
b.) To estimate the average income (y) of a 30-year-old (x1) teacher with 1.25 academic
achievement (x2):
y = 77.29 + 0.49x1 – 8.81x2
y = 77.29 + 0.49(30) – 8.81(1.25)
y = 77.29 + 14.7 – 11.0125
y = 80.98 (in thousand pesos)
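The elimination above can be cross-checked with a short Python sketch that works directly from the five data rows. It is an illustration only, not part of the original handout. The helper name centered_sum is hypothetical; it computes Σuv – (Σu)(Σv)/n, which is the quantity that remains once b0 has been eliminated. Because the sketch works from the unrounded data, its output need not match hand-rounded figures exactly.

```python
# Data from the worked example: income (y), age (x1), academic achievement (x2).
y = [80.5, 75.6, 85.6, 78.0, 67.8]
x1 = [35, 32, 45, 25, 27]
x2 = [1.25, 2.00, 1.75, 1.75, 2.25]
n = len(y)


def centered_sum(u, v):
    """Return Σuv - (Σu)(Σv)/n, the sum left after b0 is eliminated."""
    return sum(p * q for p, q in zip(u, v)) - sum(u) * sum(v) / n


s11 = centered_sum(x1, x1)
s22 = centered_sum(x2, x2)
s12 = centered_sum(x1, x2)
s1y = centered_sum(x1, y)
s2y = centered_sum(x2, y)

# With b0 eliminated, two equations in b1 and b2 remain:
#   s11*b1 + s12*b2 = s1y
#   s12*b1 + s22*b2 = s2y
# They are solved here by Cramer's rule.
det = s11 * s22 - s12 ** 2
b1 = (s1y * s22 - s12 * s2y) / det
b2 = (s11 * s2y - s12 * s1y) / det

# Back-substitute into the first normal equation to get b0.
b0 = (sum(y) - b1 * sum(x1) - b2 * sum(x2)) / n

# Estimated income for a 30-year-old teacher with 1.25 academic achievement.
estimate = b0 + b1 * 30 + b2 * 1.25
print(f"b0 = {b0:.2f}, b1 = {b1:.2f}, b2 = {b2:.2f}, estimate = {estimate:.2f}")
```

The coefficients are sensitive to rounding here because Σx2² – (Σx2)²/n is small, so carrying more decimal places in the squares and products noticeably changes b2.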
ASSESSMENT
1. Ten students were given an attitude test on a controversial issue. Then they were
shown a film favorable to the 10 subjects and the same attitude test was
administered again. Use the t-test for correlated samples to determine whether there is a
significant difference between the attitude scores before and after the film showing
at the α = .05 level of significance. (Use the t-test.)
Pretest Post test
16 25
17 23
16 24
23 28
19 28
25 30
20 23
18 24
15 19
15 15
2. A certain livelihood program was given to 20 farmers to enhance their income. The
data were recorded before and after the implementation as shown below. Use the t-test for
correlated samples to determine whether there is a significant difference between income
before and after the livelihood program at the α = .05 level of significance. (Use the t-test.)
Before Implementation After Implementation
5,000 6,000
7,500 7,000
8,000 10,000
7,000 8,000
7,000 7,000
8,000 8,000
8,500 9,000
10,000 10,500
6,000 7,000
7,000 8,000
5,000 10,000
6,000 7,000
5,500 6,000
8,000 7,500
10,000 10,000
5,000 5,500
8,000 7,500
6,500 8,000
7,000 7,500
8,500 10,000
3. A researcher wishes to study the number of hours IT officers spend using their
computers in different types of companies. The researcher selected a sample of 10 IT officers
each in banking, insurance and sales. At the .05 level of significance, can he conclude that
there is a significant difference in the mean number of hours per week spent by IT officers
using computers across the different types of companies? Use the stepwise method. (Use the F-test.)
Banking Insurance Sales
30 41 40
23 35 35
25 29 39
33 37 38
27 37 30
27 28 28
31 42 42
34 26 37
26 38 41
25 39 40
4. From your answer above, use Scheffé's test to determine where the differences lie among
the three company types.
5. Below are the weights and heights of college students. (Use Pearson r.)
Let: weight = x
Height = y
Weight 60 62 63 65 65
Height 102 120 130 150 120
6. Below are the costs of a chocolate bar and a white chocolate bar.
X 10 20 15 5 25 30 35 40
Y 1 10 19 3 7 13 9 11
Let: chocolate=x
White chocolate=y
7. Below are the ages of girls and boys in their teenage years. (Use Pearson r.)
X 17 16 15 14 17 16
Y 13 14 15 16 17 12
Let: boy=x
Girl=y