This document discusses analysis of variance (ANOVA) and its use in comparing the means of two or more populations. It provides an example of using a one-way ANOVA to test whether there are differences between the performances of three salesmen based on their revenue amounts over five occasions. Tables are included reproducing the data with column totals and sums of squares, and the correction factor is calculated. Finally, a second example is given about checking for differences between four training programs based on employee test scores.
This presentation explains the procedure for two-way repeated-measures ANOVA (within-within design) and illustrates it using SPSS.
Answer Sheet ISDS 361B
HW5
YOUR FULL NAME:
Answer
Optional for Partial Credit
1.
[12 pts.]
Forecast for Saturday in week 6:
No. of Periods of Data Collected =
No. of Periods in Season =
MSE =
MAD =
MAPE =
LAD =
2.
[18 pts.]
a.
Forecast (exponential smoothing):
No. of Periods of Data Collected =
Smoothing Constant (alpha) =
Initial Forecast Value =
b.
Forecast (weighted MA):
MSE =
MAD =
MAPE =
LAD =
c.
I would recommend weighted MA (True/ False):
3.
[12 pts.]
Forecast for year 8:
MSE =
MAD =
MAPE =
LAD =
4.
[18 pts.]
Forecast for Summer Year 5:
No. of Periods of Data Collected =
No. of Periods in Season =
Coefficients:
Intercept =
Period =
1 =
2 =
1) What is the critical F value for a sample of 7 observations in the numerator and 6 in the denominator? Use a two-tailed test and the 0.1 significance level. (Round your answer to 2 decimal places.)
F
2) Arbitron Media Research Inc. conducted a study of the iPod listening habits of men and women. One facet of the study involved the mean listening time. It was discovered that the mean listening time for men was 29 minutes per day. The standard deviation of the sample of the 10 men studied was 8 minutes per day. The mean listening time for the 12 women studied was also 29 minutes, but the standard deviation of the sample was 15 minutes. At the .10 significance level, can we conclude that there is a difference in the variation in the listening times for men and women? (Round your answer to 3 decimal places.)
The test statistic is . Decision
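The variance-ratio statistic for question 2 can be sketched as below. This is a minimal illustration, assuming the usual convention of placing the larger sample variance in the numerator; the variable names are hypothetical, and the critical value would still have to be read from an F table for the degrees of freedom printed.

```python
# Sample standard deviations and sizes taken from the problem statement.
s_men, n_men = 8.0, 10
s_women, n_women = 15.0, 12

# Convention assumed here: the larger sample variance goes in the numerator,
# so the ratio is at least 1 and only the upper tail needs to be checked.
women_larger = s_women > s_men
var_num = (s_women if women_larger else s_men) ** 2
var_den = (s_men if women_larger else s_women) ** 2
f_stat = var_num / var_den

# Degrees of freedom follow whichever sample supplied each variance.
df_num = (n_women if women_larger else n_men) - 1
df_den = (n_men if women_larger else n_women) - 1

print(f"F = {f_stat:.3f} with ({df_num}, {df_den}) degrees of freedom")
```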
3) The following is sample information. Test the hypothesis that the treatment means are equal. Use the 0.01 significance level.
Treatment 1   Treatment 2   Treatment 3
7             4             4
4             5             7
6             5             6
6             4             5
(a)
State the null hypothesis and the alternate hypothesis.
H0
H1
(b)
What is the decision rule? (Round your answer to 2 decimal places.)
H0 if the test statistic is greater than .
(c&d)
Compute SST, SSE, and SS total and complete an ANOVA table. (Round SS, MS and F values to 3 decimal places.)
Source       SS   df   MS   F
Treatments
Error
Total
(e)
State your decision regarding the null hypothesis.
H0
4) A senior accounting major at Midsouth State University has job offers from four CPA firms. To explore the offers further, she asked a sample of recent trainees how many months each worked for the firm before receiving a raise in salary. The sample information is submitted to MINITAB with the following results:
Analysis of Variance
Source   DF   SS      MS     F
Factor   5    36.39   7.28   1.92
Error    12   45.54   3.80
Total    17   81.93
Reject if F > . (Round your answer to 2 .
Statistics 106
Homework 6
Due : Feb. 17, 2016, In Class
19.5
In a two-factor study, the treatment means μij are as follows:
            Factor B
Factor A    B1    B2    B3    B4
A1          250   265   268   269
A2          288   273   270   269
a. Obtain the factor B main effects. What do your results imply about factor B?
b. Prepare a treatment means plot and determine whether the two factors interact. How can you tell that
interactions are present? Are the interactions important or unimportant?
c. Make a logarithmic transformation of the μij and plot the transformed values to explore whether this
transformation is helpful in reducing the interactions. What are your findings?
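The main effects and interaction terms asked about in Problem 19.5 can be sketched as follows. This is an illustrative calculation only, assuming the standard two-factor decomposition in which a main effect is a marginal mean minus the overall mean and an interaction is what remains after both main effects are removed; the variable names are hypothetical.

```python
# Treatment means from Problem 19.5: rows are A1, A2; columns are B1..B4.
means = [
    [250, 265, 268, 269],
    [288, 273, 270, 269],
]
a, b = len(means), len(means[0])

overall = sum(sum(row) for row in means) / (a * b)
row_means = [sum(row) / b for row in means]
col_means = [sum(means[i][j] for i in range(a)) / a for j in range(b)]

# Main effects: deviations of the marginal means from the overall mean.
alpha = [m - overall for m in row_means]   # factor A
beta = [m - overall for m in col_means]    # factor B

# Interaction terms: cell mean minus row mean minus column mean plus overall mean.
interaction = [
    [means[i][j] - row_means[i] - col_means[j] + overall for j in range(b)]
    for i in range(a)
]

print("Factor A main effects:", alpha)
print("Factor B main effects:", beta)
print("Interactions:", interaction)
```

In this sketch every factor B main effect comes out as zero while the interaction terms do not, which is the pattern parts (a) and (b) ask the reader to interpret.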
19.8
Refer to Problem 19.5. Assume that σ = 4 and n = 6.
1. Obtain E{MSE} and E{MSAB}.
2. Is E{MSAB} substantially larger than E{MSE}? What is the implication of this?
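A sketch of the expected mean squares for Problem 19.8 follows. It assumes the standard fixed-effects results E{MSE} = σ² and E{MSAB} = σ² + n·ΣΣ(αβ)ij² / ((a − 1)(b − 1)); the course text may state these in a different but equivalent form, and the variable names are hypothetical.

```python
# Treatment means from Problem 19.5 and the values given in Problem 19.8.
means = [
    [250, 265, 268, 269],
    [288, 273, 270, 269],
]
sigma, n = 4, 6
a, b = len(means), len(means[0])

overall = sum(sum(row) for row in means) / (a * b)
row_means = [sum(row) / b for row in means]
col_means = [sum(means[i][j] for i in range(a)) / a for j in range(b)]

# Sum of squared interaction terms (alpha-beta)_ij over all cells.
sum_sq_interaction = sum(
    (means[i][j] - row_means[i] - col_means[j] + overall) ** 2
    for i in range(a)
    for j in range(b)
)

# Assumed fixed-effects formulas for the expected mean squares.
e_mse = sigma ** 2
e_msab = sigma ** 2 + n * sum_sq_interaction / ((a - 1) * (b - 1))

print(f"E{{MSE}} = {e_mse}, E{{MSAB}} = {e_msab}")
```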
19.12
Eye contact effect. In a study of the effect of applicant’s eye contact (factor A) and personnel officer’s
gender (factor B) on the personnel officer’s assessment of likely job success of applicant, 10 male and
10 female personnel officers were shown a front view photograph of an applicant’s face and were asked
to give the person in the photograph a success rating on a scale of 0 (total failure) to 20 (outstanding
success). Half of the officers in each gender group were chosen at random to receive a version of the
photograph in which the applicant made eye contact with the camera lens. The other half received a
version in which there was no eye contact. The success ratings are saved in "CH19PR12.txt".
a. Obtain the fitted values for ANOVA model.
b. Obtain the residuals. Do they sum to zero for each treatment?
c. Prepare aligned residual dot plots for the treatments. What departures from ANOVA model can be
studied from these plots? What are your findings?
d. Prepare a normal Q-Q plot of the residuals. Does the normality assumption appear to be reasonable
here?
e. The observations for each treatment were obtained in the order shown. Prepare residual sequence
plots and interpret them. What are your findings?
19.13
Refer to Eye contact effect Problem 19.12. Assume that ANOVA model is applicable.
a. Prepare an estimated treatment means plot. Does it appear that any factor effects are present?
Explain.
b. Set up the analysis of variance table. Does any one source account for most of the total variability in
the success ratings in the study? Explain.
c. Test whether or not interaction effects are present; use α = .01. State the alternatives, decision rule,
and conclusion. What is the P-value of the test?
d. Test whether or not eye contact and gender main effects are present. In each case, use α = .01 and
state the alternatives, decision rule, and conclusion. What is the P-value of each test?
e. Do the results in parts (c) and (d) confirm your graphic analysis in part (a)?
Additional Problem
Based on each of the following interaction plots, describe whether there are main effects for each factor
and whether there are intera.
Part A: Multiple Choice (1–11)
______1. Using the “eyeball” method, the regression line ŷ = 2 + 2x has been fitted to the data points (x = 2, y = 1), (x = 3, y = 8), and (x = 4, y = 7). The sum of the squared residuals will be
a. 7 b. 19 c. 34 d. 8
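The arithmetic behind question 1 can be sketched as below, taking the fitted line to be ŷ = 2 + 2x (the ŷ symbol is an assumption where the source shows a blank). Each residual is the observed y minus the fitted value, and the squared residuals are then summed.

```python
# Data points (x, y) from the question.
points = [(2, 1), (3, 8), (4, 7)]


def predict(x):
    """Fitted value from the assumed eyeball line y-hat = 2 + 2x."""
    return 2 + 2 * x


# Residual = observed y minus fitted y.
residuals = [y - predict(x) for x, y in points]
sum_squared_residuals = sum(r ** 2 for r in residuals)

print("Residuals:", residuals)
print("Sum of squared residuals:", sum_squared_residuals)
```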
______2. A computer statistical package has included the following quantities in its output: SST = 50, SSR = 35, and SSE = 15. How much of the variation in y is explained by the regression equation?
a. 49% b. 70% c. 35% d. 15%
______3. In testing the significance of b, the null hypothesis is generally that
a. β = b b. β 0 c. β = 0 d. β = r
______4. Testing whether the slope of the population regression line could be zero is equivalent to testing whether the population _____________ could be zero.
a. standard error of estimate c. y-intercept
b. prediction interval d. coefficient of correlation
______5. A multiple regression equation includes 4 independent variables, and the coefficient of multiple determination is 0.64. How much of the variation in y is explained by the regression equation?
a. 80% b. 16% c. 32% d. 64%
______6. A multiple regression analysis results in the following values for the sum-of-squares terms: SST = 50.0, SSR = 35.0, and SSE = 15.0. The coefficient of multiple determination will be
a. R² = 0.35 b. R² = 0.30 c. R² = 0.70 d. R² = 0.50
______7. In testing the overall significance of a multiple regression equation in which there are three independent variables, the null hypothesis is
a. :
b. :
c. :
d. :
______8. In a multiple regression analysis involving 25 data points and 4 independent variables, the sum-of-squares terms are calculated as SSR = 120, SSE = 80, and SST = 200. In testing the overall significance of the regression equation, the calculated value of the test statistic will be
a. F = 1.5 c. F = 5.5
b. F = 2.5 d. F = 7.5
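The overall F statistic in question 8 can be sketched as follows, assuming the usual form F = (SSR / k) / (SSE / (n − k − 1)) for a regression with k independent variables and n data points. The variable names are hypothetical.

```python
# Values given in the question.
n, k = 25, 4
ssr, sse, sst = 120.0, 80.0, 200.0

# Mean squares: each sum of squares divided by its degrees of freedom.
msr = ssr / k
mse = sse / (n - k - 1)
f_stat = msr / mse

# Coefficient of multiple determination, shown for comparison.
r_squared = ssr / sst

print(f"MSR = {msr}, MSE = {mse}, F = {f_stat}, R^2 = {r_squared}")
```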
______9. For a set of 15 data points, a computer statistical package has found the multiple regression equation to be ŷ = -23 + 20x1 + 5x2 + 25x3 and has listed the t-ratio for testing the significance of each partial regression coefficient. Using the 0.05 level in testing whether b1 = 20 differs significantly from zero, the critical t values will be
a. t = -1.960 and t= +1.960
b. t = -2.132 and t = +2.132
c. t = -2.201 and t = +2.201
d. t = -1.796 and t = +1.796
______10. Computer analyses typically provide a p-Value for each partial regression coefficient. In the case of , this is the probability that
a. = 0
b. =
c. the absolute value of could be this large if = 0
d. the absolute value of could be this large if 1
______11. In the multiple regression equation, ŷ = 20,000 + 0.05x1 + 4500x2, ŷ is the estimated household income, x1 is the amount of life insurance held by the head of the household, and x2 is a dummy variable (x2 = 1 if the family owns mutual funds, 0 if it doesn’t). The interpretation of b2 = 4500 is that
a. owning mutual funds increases the estimated income by $4500
b. the average value of a mut.
1. A law firm wants to determine the trend in its annual billings so that it can better forecast revenues. It plots the data on its billings for the past 10 years and finds that the scatter plot appears to be linear. What formula should they use to determine the trend line?
σ = √(∑(x − μ)² ÷ N)
F = s₁² ÷ s₂²
t = (x̄ − μx̄) ÷ (s ÷ √n)
Tₜ = b₀ + b₁t
3 points
QUESTION 2
1. A set of subjects, usually randomly sampled, selected to participate in a research study is called:
Population
Sample
Mode group
Partial selection
3 points
QUESTION 3
1. If a researcher accepts a null hypothesis when that hypothesis is actually true, she has committed:
a type I error
a type II error
no error
a causation
3 points
QUESTION 4
1. A binomial probability distribution is a discrete distribution (i.e., the x-variable is discrete).
True
False
3 points
QUESTION 5
1. The t-distribution is wider and flatter (i.e., has more variation) than the normal distribution.
True
False
3 points
QUESTION 6
1. A physician wants to estimate the average amount of time that patients spend in his waiting room. He asks his receptionist to record the waiting times for 28 of his patients and finds that the sample mean (x̄) is 37 minutes and the sample standard deviation (s) is 12 minutes. What formula would you use to construct the 95% confidence interval for the population mean of waiting times?
t = (x̄ − μx̄) ÷ (s ÷ √n)
µ = ∑ x ÷ N
x̄ - t(s ÷ √n) < µ < x̄ + t(s ÷ √n)
z = (x - µ) ÷ σ
3 points
QUESTION 7
1. When the alternative hypothesis states that the difference between two groups can only be in one direction, we call this a:
One-tailed test
Bi-directional test
Two-tailed test
Non-parametric test
3 points
QUESTION 8
1. For any probability distribution, the probability of any x-value occurring within any given range is equal to the area under the distribution and above that range.
True
False
3 points
QUESTION 9
1. The formula for ____________ is (Row total X Column total)/T
Observed frequencies
Degrees of freedom
Expected frequencies
Sampling error
3 points
QUESTION 10
1. State Senator Hanna Rowe has ordered an investigation of the large number of boating accidents that have occurred in the state in recent summers. Acting on her instructions, her aide, Geoff Spencer, has randomly selected 9 summer months within the last few years and has compiled data on the number of boating accidents that occurred during each of these months. The mean number of boating accidents to occur in these 9 months was 31 (x̄), and the standard deviation (s) in this sample was 9 boating accidents per month. Geoff was told to construct a 90% confidence interval for the true mean number of boating accidents per month. What formula should Geoff use?
x̄ - t(s ÷ √n) < µ < x̄ + t(s ÷ √n)
F = s₁² ÷ s₂²
z = (x - µ) ÷ σ
x̄ - z(σ ÷ √n) < µ < x̄ + z(σ ÷ √n)
3. One Way Anova
PROBLEM 1
The following is the sales revenue of three salesmen taken on five occasions. Test at 5% level whether there is any difference between the performances of three salesmen.
Observations   A    B    C
1              5    3    10
2              6    5    13
3              8    2    7
4              1    10   13
5              5    0    17
4. Reproduce the table with Column Total SS (Treatment)
Observations   A    B    C
1              5    3    10
2              6    5    13
3              8    2    7
4              1    10   13
5              5    0    17
TOTAL          25   20   60
5. Reproduce the table with sum of squares SS (Total)
Observations   A    Ya²   B    Yb²   C    Yc²
1              5    25    3    9     10   100
2              6    36    5    25    13   169
3              8    64    2    4     7    49
4              1    1     10   100   13   169
5              5    25    0    0     17   289
TOTAL          25   151   20   138   60   776
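6. Find the Correction Factor
Correction Factor = Y² / N = (A + B + C)² / N
SS (Total) = ∑Ya² + ∑Yb² + ∑Yc² − Y² / N
SS (Treatment) = A² / n + B² / n + C² / n − Y² / N
SS (Error) = SS (Total) − SS (Treatment)
Here A, B and C are the column totals, Y is the grand total, N is the total number of observations and n is the number of observations in each column.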
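7. ANOVA TABLE
Sources of Variance   SS   DF      Mean Square   Computed F   Tabulated F
Treatment                  C − 1
Error                      N − C
Total                      N − 1
Here C is the number of columns (treatments) and N is the total number of observations.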
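The steps in slides 4 to 7 can be sketched for Problem 1 as follows. This is a minimal illustration, assuming the shortcut formulas based on column totals and the correction factor; the variable names are hypothetical, and the tabulated F of 3.89 for (2, 12) degrees of freedom at the 5% level is taken from a standard F table rather than from the slides.

```python
# Sales revenue for the three salesmen over five occasions (Problem 1).
sales = {
    "A": [5, 6, 8, 1, 5],
    "B": [3, 5, 2, 10, 0],
    "C": [10, 13, 7, 13, 17],
}

N = sum(len(v) for v in sales.values())        # total observations
C = len(sales)                                 # number of columns (treatments)
grand_total = sum(sum(v) for v in sales.values())

# Correction factor: (grand total)^2 / N.
correction = grand_total ** 2 / N

# Sums of squares from the raw values and the column totals.
ss_total = sum(y * y for v in sales.values() for y in v) - correction
ss_treatment = sum(sum(v) ** 2 / len(v) for v in sales.values()) - correction
ss_error = ss_total - ss_treatment

# Mean squares and the computed F ratio.
ms_treatment = ss_treatment / (C - 1)
ms_error = ss_error / (N - C)
f_computed = ms_treatment / ms_error

# Assumption: value read from a standard F table, 5% level, (2, 12) d.f.
F_TABULATED = 3.89

print(f"Correction factor = {correction}")
print(f"SS(Total) = {ss_total}, SS(Treatment) = {ss_treatment}, SS(Error) = {ss_error}")
print(f"Computed F = {f_computed:.2f}, tabulated F = {F_TABULATED}")
print("Reject H0" if f_computed > F_TABULATED else "Do not reject H0")
```

With these figures the computed F exceeds the tabulated value, so the sketch rejects the hypothesis that the three salesmen perform equally.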
8. PROBLEM 2
A company has deputed A, B, C and D to four different training programmes. Check whether there is any significant difference between the training programmes of A, B, C, and D at 5% level of significance.
Employees   A    B    C    D
1           80   70   65   90
2           90   60   50   89
3           96   55   58   85
4           85   85   55   95
5           70   90   40   80
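Problem 2 can be worked the same way. The sketch below uses a hypothetical helper, one_way_anova, that computes the sums of squares from deviations about the group means instead of the correction-factor shortcut; both routes give the same result. The tabulated F of 3.24 for (3, 16) degrees of freedom at the 5% level is an assumption taken from a standard F table.

```python
def one_way_anova(groups):
    """Return SS, degrees of freedom and F for a one-way layout.

    `groups` maps a treatment label to its list of observations.
    """
    all_values = [y for values in groups.values() for y in values]
    grand_mean = sum(all_values) / len(all_values)

    # Between-treatment variation: group means around the grand mean.
    ss_treatment = sum(
        len(v) * (sum(v) / len(v) - grand_mean) ** 2 for v in groups.values()
    )
    # Within-treatment variation: observations around their own group mean.
    ss_error = sum(
        (y - sum(v) / len(v)) ** 2 for v in groups.values() for y in v
    )

    df_treatment = len(groups) - 1
    df_error = len(all_values) - len(groups)
    f_computed = (ss_treatment / df_treatment) / (ss_error / df_error)
    return ss_treatment, ss_error, df_treatment, df_error, f_computed


# Test scores for the four training programmes (Problem 2).
scores = {
    "A": [80, 90, 96, 85, 70],
    "B": [70, 60, 55, 85, 90],
    "C": [65, 50, 58, 55, 40],
    "D": [90, 89, 85, 95, 80],
}

ss_treatment, ss_error, df_treatment, df_error, f_computed = one_way_anova(scores)

# Assumption: value read from a standard F table, 5% level, (3, 16) d.f.
F_TABULATED = 3.24

print(f"SS(Treatment) = {ss_treatment:.1f} on {df_treatment} d.f.")
print(f"SS(Error) = {ss_error:.1f} on {df_error} d.f.")
print(f"Computed F = {f_computed:.2f}, tabulated F = {F_TABULATED}")
print("Reject H0" if f_computed > F_TABULATED else "Do not reject H0")
```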