The low/med/high categories are generated automatically by ModGraph when you fill in its data entry table. You enter the B, mean and SD of the centered scores of temperature and humidity (Tables 7 and 8). To get the interaction effects of humidity, enter temperature as the main effect variable and humidity as the moderator; to get the interaction effects of temperature, enter humidity as the main effect variable and temperature as the moderator.
Could you please tell me how you plotted the figures on page 9 of your multiple regression procedure? As temperature and humidity are continuous, how did you arrive at levels such as Low, Medium and High for them?
Transcript of "Multiple Regression worked example (July 2014 updated)"
REGRESSION ANALYSIS July 2014 updated
Prepared by Michael Ling Page 1
QUANTITATIVE RESEARCH METHODS
SAMPLE OF
REGRESSION ANALYSIS
Prepared by
Michael Ling
PROBLEM
Create a multiple regression model to predict the level of daily ice-cream sales Mr Whippy can
expect to make, given the daily temperature and humidity. Using the base model (50 marks):
• What is the regression model and regression equation?
• What interpretation do you make of the findings?
• Is the regression model valid?
• Is the sample size adequate?
Create an interaction term for temperature and humidity:
• Is there an interaction effect in the model?
• What is the effect size (f²) of the interaction?
• What interpretation do you make of the findings?
• Show the interaction effect graphically (e.g., using ModGraph)
SOLUTION
Base Model
The regression model is Sales = a + b*temperature + c*humidity + e, where Sales is the
criterion variable; temperature and humidity are the predictors; a is the intercept, where the
regression line crosses the Sales axis; b and c are regression coefficients; and e is an error term.
The regression equation is Sales = -24.112 + 3.513*temperature + 7.589*humidity (Table 1).
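The regression equation above can be evaluated directly. The following is a minimal sketch using the Table 1 coefficients; the function name and the input values are illustrative assumptions, since the source does not state the units or typical values of temperature and humidity.

```python
# Coefficients taken from Table 1 (base model).
INTERCEPT = -24.112
B_TEMPERATURE = 3.513
B_HUMIDITY = 7.589


def predict_sales(temperature: float, humidity: float) -> float:
    """Predicted daily ice-cream sales from the base regression equation."""
    return INTERCEPT + B_TEMPERATURE * temperature + B_HUMIDITY * humidity


# Hypothetical inputs, chosen only to illustrate the calculation.
example_prediction = predict_sales(temperature=30.0, humidity=5.0)
print(round(example_prediction, 3))
```

Raising temperature by one unit while holding humidity fixed changes the prediction by exactly 3.513, which is what the B weight expresses.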
Since R² = .629, 62.9% of the variance in ice-cream sales can be explained by temperature
and humidity (Table 2). Compared to R², adjusted R² provides a less biased estimate (60.9%) of
the extent of the relationship between the variables in the population.
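As a check on the adjusted R² figure, the standard adjustment formula can be applied to the reported R². This sketch assumes n = 40 observations (the total df of 39 in Table 3 plus one) and k predictors; the function name is illustrative.

```python
def adjusted_r2(r2: float, n: int, k: int) -> float:
    """Adjusted R-squared for a model with k predictors fitted to n observations."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


# Base model: R² = .629 with 2 predictors (Table 2).
print(round(adjusted_r2(0.629, n=40, k=2), 3))
# Interaction model: R² = .773 with 3 predictors (Table 5).
print(round(adjusted_r2(0.773, n=40, k=3), 3))
```

Both results agree with the adjusted R² values reported in Tables 2 and 5 (.609 and .754).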
The ANOVA is significant (F=31.397, df(regression)=2, df(residual)=37, Sig < .001 )
which means that the two predictors collectively account for a statistically significant proportion
of the variance in the criterion variable (Table 3).
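The F statistic and R² can both be recovered from the sums of squares in Table 3. The sketch below only redoes that arithmetic; the variable names are illustrative, and it does not compute the p-value.

```python
# Sums of squares and degrees of freedom from Table 3 (base model ANOVA).
ss_regression, df_regression = 14084.540, 2
ss_residual, df_residual = 8299.060, 37
ss_total = ss_regression + ss_residual

# F is the ratio of the regression mean square to the residual mean square.
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual
f_statistic = ms_regression / ms_residual

# R² is the share of total variation explained by the regression.
r_squared = ss_regression / ss_total

print(round(f_statistic, 3), round(r_squared, 3))
```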
The B weight for temperature is 3.513, which means that, after controlling for humidity,
a 1-unit increase in temperature will result in a predicted 3.513 unit increase in ice-cream sales.
The B weight for humidity is 7.589, which means that, after controlling for temperature, a 1-unit
increase in humidity will result in a predicted 7.589 unit increase in ice-cream sales (Table 1).
The standardized coefficient (Beta) for temperature is .712, which means, after controlling for
humidity, a 1 standard deviation (SD) increase in temperature will result in a .712 SD increase in
ice-cream sales. Similarly, a 1 SD increase in humidity will result in a .229 SD increase in
ice-cream sales (Table 1). Temperature can account for a significant proportion of unique variance
in ice-cream sales (t=6.943, Sig < .001) (Table 1). Humidity accounts for a significant
proportion of unique variance in ice-cream sales (t=2.238, Sig < .05) (Table 1). The Pearson’s
correlation between temperature and ice-cream sales is r = .761, and that between humidity and
ice-cream sales is r = .382 (Table 1).
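The link between the B weights and the Beta weights can be illustrated with figures already in the appendix. In this sketch, the SD of sales is derived from the total sum of squares in Table 3, and the B weights are those of the z-scored predictors in Table 8 (model 1), which give the change in sales per 1 SD change in each predictor. Dividing by the SD of sales puts that change in SD units. The variable names are illustrative.

```python
from math import sqrt

# Total sum of squares and total df for sales (Table 3).
ss_total, df_total = 22383.600, 39
sd_sales = sqrt(ss_total / df_total)

# B weights for the z-scored predictors (Table 8, model 1):
# change in sales per 1 SD increase in the predictor.
b_per_sd_temperature = 17.049
b_per_sd_humidity = 5.495

# Expressing that change in SD units of sales yields the Beta weights.
beta_temperature = b_per_sd_temperature / sd_sales
beta_humidity = b_per_sd_humidity / sd_sales

print(round(beta_temperature, 3), round(beta_humidity, 3))
```

The results match the Beta values reported in Table 1 (.712 and .229).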
The partial correlation between temperature and ice-cream sales is .752 and that between
humidity and ice-cream sales is .345 (Table 1). The part correlation (sr) for temperature is .695,
indicating that approximately 48.3% (.695²) of the variance in ice-cream sales can be uniquely
attributed to temperature (Table 1). Similarly, approximately 5% (.224²) of the variance in
ice-cream sales can be uniquely attributed to humidity (Table 1).
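Squaring the part correlations gives the unique variance attributed to each predictor; whatever remains of R² is variance the two predictors explain jointly. A short sketch of that arithmetic, using the values in Tables 1 and 2 (variable names are illustrative):

```python
# Part (semi-partial) correlations from Table 1 and R² from Table 2.
sr_temperature = 0.695
sr_humidity = 0.224
r_squared = 0.629

# Unique variance is the squared part correlation.
unique_temperature = sr_temperature ** 2
unique_humidity = sr_humidity ** 2

# The remainder of R² is shared between the two predictors.
shared = r_squared - (unique_temperature + unique_humidity)

print(round(unique_temperature, 3), round(unique_humidity, 3), round(shared, 3))
```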
The Variance Inflation Factors (VIF) of temperature and humidity are both 1.048. As
they are both close to 1, multicollinearity is not a problem. From the normal P-P plot, the points
are clustered tightly along the diagonal and hence the residuals are normally distributed (Figure
1). The absence of any clear patterns in the spread of points in the scatterplot indicates that the
assumptions of normality, linearity and homoscedasticity of residuals are met (Figure 2).
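The VIF is the reciprocal of the tolerance reported in Table 1. With only two predictors, the tolerance equals 1 minus the squared correlation between them, so the size of that correlation can also be recovered. A minimal sketch (the variable names are illustrative, and the sign of the correlation cannot be recovered this way):

```python
from math import sqrt

# Tolerance of each predictor in the base model (Table 1).
tolerance = 0.954

# VIF is defined as 1 / tolerance.
vif = 1 / tolerance

# With exactly two predictors, tolerance = 1 - r², where r is the
# correlation between temperature and humidity.
abs_r_between_predictors = sqrt(1 - tolerance)

print(round(vif, 3), round(abs_r_between_predictors, 3))
```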
Using G*Power and setting alpha = .05 (two-tailed), power = 0.8 and 2 predictors, the
required sample sizes are shown in Table A. As there are 40 observations in this dataset, the
smallest detectable effect size is approximately .25, and hence the sample is adequate to detect a
medium-to-large effect.
Interaction Model
The ANOVA is significant (F=40.819, df(regression)=3, df(residual)=36, Sig < .001),
which indicates that the interaction model is statistically significant (Table 4). Since R² = .773,
77.3% of the variance in ice-cream sales can be explained by the model that includes the
interaction effect, which is a 14.4 percentage-point improvement over the base model (Table 5).
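The improvement over the base model can be tested with an F-change statistic computed from the sums of squares in Tables 3 and 4. The sketch below reproduces the R Square Change and F Change values shown for model 2 in Table 7; the variable names are illustrative.

```python
# Regression sums of squares for the base and interaction models (Tables 3 and 4).
ss_regression_base = 14084.540
ss_regression_full = 17298.244

# Residual sum of squares and df of the interaction model (Table 4).
ss_residual_full, df_residual_full = 5085.356, 36
ss_total = 22383.600

# One extra predictor (the interaction term) was added.
added_predictors = 1

ss_gain = ss_regression_full - ss_regression_base
r_squared_change = ss_gain / ss_total
f_change = (ss_gain / added_predictors) / (ss_residual_full / df_residual_full)

print(round(r_squared_change, 3), round(f_change, 2))
```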
The regression equation is Sales = 257.096 – 6.976*temperature – 76.825*humidity +
3.123*temperature*humidity (Table 6). Temperature can account for a significant proportion
of unique variance in ice-cream sales (t=-3.121, Sig < .005) (Table 6). Humidity accounts for a
significant proportion of unique variance in ice-cream sales (t=-4.292, Sig < .001) (Table 6).
The interaction variable can account for a significant proportion of unique variance in ice-cream
sales (t=4.770, Sig < .001) (Table 6). The partial correlation between temperature and
ice-cream sales is -.461 and that between humidity and ice-cream sales is -.582 (Table 6). The part
correlation (sr) for temperature is reduced to -.248, indicating that approximately 6.2% (.248²) of
the variance in ice-cream sales can be uniquely attributed to temperature (Table 6).
Approximately 11.6% (.341²) of the variance in ice-cream sales can be uniquely attributed to
humidity (Table 6), and approximately 14.4% (.379²) of the variance in ice-cream sales can be
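uniquely attributed to the interaction variable (Table 6). The effect size of the interaction is
f² = (R² of interaction model - R² of base model) / (1 - R² of interaction model)
= (.773 - .629) / (1 - .773) = .634. Note that .773 and .629 are already R² values (Tables 2
and 5), so they are not squared again. Since f² is greater than .35, the result is a large effect.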
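Cohen's f² for an added term is conventionally computed from the R² of the model with and without that term. The sketch below applies that convention to the R² values in Tables 2 and 5. These values are already squared (the corresponding R values are .793 and .879), so they enter the formula as they are. The function name is illustrative.

```python
def cohens_f2(r2_full: float, r2_reduced: float) -> float:
    """Cohen's f² for the predictors added to the reduced model."""
    return (r2_full - r2_reduced) / (1 - r2_full)


# R² of the interaction model (Table 5) and of the base model (Table 2).
f2_interaction = cohens_f2(r2_full=0.773, r2_reduced=0.629)

print(round(f2_interaction, 3))
# By Cohen's benchmarks (.02 small, .15 medium, .35 large) this is a large effect.
print(f2_interaction > 0.35)
```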
The use of VIFs to interpret multicollinearity in a regression model that has interaction
effects is erroneous with uncentered variables [1]. As a result, the moderating effect is examined
by applying ModGraph [2] to centered scores. The centered scores of the interaction model are
the z-scores (Table 7 and Table 8). Two ModGraphs are plotted: one examines the
moderating relationship when temperature is the main effect (Figure 3) and the other examines
the moderating relationship when humidity is the main effect (Figure 4).
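The z-scoring step can be sketched as follows. The data values here are hypothetical (the raw dataset is not part of this document), and the use of the sample standard deviation is an assumption about how the z-scores were produced. Following the predictor labels in Table 8, the product term temp_humidity is built from the raw scores and then z-scored like the other two variables.

```python
from statistics import mean, stdev


def zscores(values: list[float]) -> list[float]:
    """Standardize values to mean 0 and sample SD 1."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


# Hypothetical raw observations, for illustration only.
temperature = [22.0, 25.0, 28.0, 31.0, 34.0]
humidity = [4.0, 5.5, 5.0, 6.5, 6.0]

# Product term computed from the raw scores, as the variable name
# temp_humidity in Table 6 suggests.
temp_humidity = [t * h for t, h in zip(temperature, humidity)]

z_temperature = zscores(temperature)
z_humidity = zscores(humidity)
z_temp_humidity = zscores(temp_humidity)

print([round(z, 3) for z in z_temperature])
```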
Referring to Figure 3, ice-cream sales increase with temperature only when humidity is high;
when humidity is medium or low, ice-cream sales decrease as temperature rises. Thus, humidity
moderates the relationship between ice-cream sales and temperature. Referring to Figure 4,
ice-cream sales increase with humidity only when temperature is high; when temperature is
medium or low, ice-cream sales decrease as humidity rises. Thus, temperature moderates the
relationship between ice-cream sales and humidity.
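The sign change described above follows from the form of the interaction equation: the slope of sales on one predictor is a linear function of the other. The sketch below works from the raw-score equation in Table 6 rather than from ModGraph's own output, so it illustrates the mechanism only; the function names and the example moderator values are hypothetical.

```python
# Coefficients of the interaction model (Table 6, raw scores).
B_TEMPERATURE = -6.976
B_HUMIDITY = -76.825
B_INTERACTION = 3.123


def temperature_slope(humidity: float) -> float:
    """Change in predicted sales per unit of temperature at a given humidity."""
    return B_TEMPERATURE + B_INTERACTION * humidity


def humidity_slope(temperature: float) -> float:
    """Change in predicted sales per unit of humidity at a given temperature."""
    return B_HUMIDITY + B_INTERACTION * temperature


# Moderator values at which each slope changes sign.
humidity_crossover = -B_TEMPERATURE / B_INTERACTION
temperature_crossover = -B_HUMIDITY / B_INTERACTION

print(round(humidity_crossover, 3), round(temperature_crossover, 3))
# Hypothetical humidity values on either side of the crossover.
print(temperature_slope(1.0) < 0, temperature_slope(3.0) > 0)
```

Below the crossover value of the moderator the slope is negative, and above it the slope is positive, which is the pattern of diverging lines a moderation plot displays.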
References:
1. Robinson, C. & Schumacker, R. E. (2009). Interaction Effects: Centering, Variance Inflation Factor, and
Interpretation Issues. Multiple Linear Regression Viewpoints, 35 (1), 6-11.
2. http://www.victoria.ac.nz/psyc/paul-jose-files/modgraph/modgraph.php
Appendix
Table 1: Base Model - Coefficients (a)

Unstandardized (B, Std. Error) and standardized (Beta) coefficients
Model 1       B         Std. Error   Beta   t        Sig.
(Constant)    -24.112   15.933              -1.513   .139
temperature   3.513     .506         .712   6.943    .000
humidity      7.589     3.392        .229   2.238    .031

95.0% Confidence Interval for B and Correlations
Model 1       Lower Bound   Upper Bound   Zero-order   Partial   Part
(Constant)    -56.394       8.171
temperature   2.488         4.538         .761         .752      .695
humidity      .717          14.461        .382         .345      .224

Collinearity Statistics
Model 1       Tolerance   VIF
temperature   .954        1.048
humidity      .954        1.048

a. Dependent Variable: sales
Table 2: Base Model Summary (b)
Model   R          R Square   Adjusted R Square   Std. Error of the Estimate
1       .793 (a)   .629       .609                14.977
a. Predictors: (Constant), humidity, temperature
b. Dependent Variable: sales
Table 3: Base Model - ANOVA (b)
Model 1      Sum of Squares   df   Mean Square   F        Sig.
Regression   14084.540        2    7042.270      31.397   .000 (a)
Residual     8299.060         37   224.299
Total        22383.600        39
a. Predictors: (Constant), humidity, temperature
b. Dependent Variable: sales
Table A: Results of G*Power
Effect Size .35 .25 .15
Sample Size 28 42 66
Figure 1: Normal P-P Plot
Figure 2: Scatterplot
Table 4: ANOVA (Interaction Model) (b)
Model 1      Sum of Squares   df   Mean Square   F        Sig.
Regression   17298.244        3    5766.081      40.819   .000 (a)
Residual     5085.356         36   141.260
Total        22383.600        39
a. Predictors: (Constant), temp_humidity, temperature, humidity
b. Dependent Variable: sales

Collinearity Statistics
Model 1       Tolerance   VIF
temperature   .954        1.048
humidity      .954        1.048
Table 5: Model Summary (Interaction Model) (b)
Model   R          R Square   Adjusted R Square   Std. Error of the Estimate
1       .879 (a)   .773       .754                11.885
a. Predictors: (Constant), temp_humidity, temperature, humidity
b. Dependent Variable: sales
Table 6: Coefficients (Interaction Model) (a)

Unstandardized (B, Std. Error) and standardized (Beta) coefficients
Model 1         B         Std. Error   Beta     t        Sig.
(Constant)      257.096   60.297                4.264    .000
temperature     -6.976    2.235        -1.413   -3.121   .004
humidity        -76.825   17.901       -2.322   -4.292   .000
temp_humidity   3.123     .655         3.674    4.770    .000

95.0% Confidence Interval for B and Correlations
Model 1         Lower Bound   Upper Bound   Zero-order   Partial   Part
(Constant)      134.807       379.384
temperature     -11.510       -2.443        .761         -.461     -.248
humidity        -113.130      -40.519       .382         -.582     -.341
temp_humidity   1.795         4.451         .745         .622      .379

a. Dependent Variable: sales
Table 7: Model Summary (Interaction Model)
Model   R          R Square   Adjusted R Square   Std. Error of the Estimate
1       .793 (a)   .629       .609                14.977
2       .879 (b)   .773       .754                11.885

Change Statistics
Model   R Square Change   F Change   df1   df2   Sig. F Change
1       .629              31.397     2     37    .000
2       .144              22.750     1     36    .000

a. Predictors: (Constant), Zscore(humidity), Zscore(temperature)
b. Predictors: (Constant), Zscore(humidity), Zscore(temperature), Zscore(temp_humidity)
c. Dependent Variable: sales
Table 8: Coefficients (Interaction Model) (a)

Unstandardized (B, Std. Error) and standardized (Beta) coefficients
Model   Term                    B         Std. Error   Beta     t        Sig.
1       (Constant)              96.100    2.368                 40.583   .000
1       Zscore(temperature)     17.049    2.456        .712     6.943    .000
1       Zscore(humidity)        5.495     2.456        .229     2.238    .031
2       (Constant)              96.100    1.879                 51.138   .000
2       Zscore(temperature)     -33.860   10.850       -1.413   -3.121   .004
2       Zscore(humidity)        -55.623   12.961       -2.322   -4.292   .000
2       Zscore(temp_humidity)   88.020    18.454       3.674    4.770    .000

95.0% Confidence Interval for B and Correlations
Model   Term                    Lower Bound   Upper Bound   Zero-order   Partial   Part
1       (Constant)              91.302        100.898
1       Zscore(temperature)     12.073        22.024        .761         .752      .695
1       Zscore(humidity)        .519          10.470        .382         .345      .224
2       (Constant)              92.289        99.911
2       Zscore(temperature)     -55.864       -11.855       .761         -.461     -.248
2       Zscore(humidity)        -81.909       -29.337       .382         -.582     -.341
2       Zscore(temp_humidity)   50.594        125.446       .745         .622      .379

a. Dependent Variable: sales
Figure 3: ModGraph 1 – zscore(temp) as main effect, zscore(humidity) as
moderator, zscore(temp*humidity) as interaction variable
[Line chart titled "Temperature and Humidity". X-axis: Temperature (low, med, high). Y-axis: Sale of Ice-cream (-50.00 to 300.00). One line per Humidity level: high, med, low.]
Figure 4: ModGraph 1 – zscore(humidity) as main effect, zscore(temperature) as
moderator, zscore(temp*humidity) as interaction variable
[Line chart titled "Temperature and Humidity". X-axis: Humidity. Y-axis label as extracted: Grade. One line per Temperature level: high, med, low.]