This document provides an overview and outline of regression models and forecasting techniques. It discusses simple and multiple linear regression analysis, how to measure the fit of regression models, assumptions of regression models, and testing models for significance. The goals are to help students understand relationships between variables, predict variable values, develop regression equations from sample data, and properly apply and interpret regression analysis.
Bba 3274 qm week 6 part 1 regression models
1. BBA3274 / DBS1084 QUANTITATIVE METHODS for BUSINESS
Forecasting and Regression Models
Part 1
by Stephen Ong
Visiting Fellow, Birmingham City University Business School, UK
Visiting Professor, Shenzhen
3. Learning Objectives
After completing this lecture, students will be able to:
1. Identify variables and use them in a regression model.
2. Develop simple linear regression equations from sample data and interpret the slope and intercept.
3. Compute the coefficient of determination and the coefficient of correlation and interpret their meanings.
4. Interpret the F-test in a linear regression model.
5. List the assumptions used in regression and use residual plots to identify problems.
6. Develop a multiple regression model and use it for prediction purposes.
7. Use dummy variables to model categorical data.
8. Determine which variables should be included in a multiple regression model.
9. Transform a nonlinear function into a linear one for use in regression.
10. Understand and avoid common mistakes made in the use of regression analysis.
4. Regression Models : Outline
4.1 Introduction
4.2 Scatter Diagrams
4.3 Simple Linear Regression
4.4 Measuring the Fit of the Regression Model
4.5 Using Computer Software for Regression
4.6 Assumptions of the Regression Model
4.7 Testing the Model for Significance
4.8 Multiple Regression Analysis
4.9 Binary or Dummy Variables
4.10 Model Building
4.11 Nonlinear Regression
4.12 Cautions and Pitfalls in Regression Analysis
6. Introduction
Regression analysis is a very valuable tool for a manager. Regression can be used to:
- Understand the relationship between variables.
- Predict the value of one variable based on another variable.
Simple linear regression models have only two variables. Multiple regression models have more variables.
7. Introduction
The variable to be predicted is called the dependent variable. This is sometimes called the response variable. The value of this variable depends on the value of the independent variable, which is sometimes called the explanatory or predictor variable.

Dependent variable = Independent variable + Independent variable
8. Scatter Diagram
A scatter diagram or scatter plot is often used to investigate the relationship between variables. The independent variable is normally plotted on the X axis. The dependent variable is normally plotted on the Y axis.
9. Triple A Construction
Triple A Construction renovates old homes. Managers have found that the dollar volume of renovation work is dependent on the area payroll.

Table 4.1
TRIPLE A'S SALES ($100,000s)   LOCAL PAYROLL ($100,000,000s)
6                              3
8                              4
9                              6
5                              4
4.5                            2
9.5                            5
11. Simple Linear Regression
Regression models are used to test if there is a relationship between variables. There is some random error that cannot be predicted.

Y = β0 + β1X + ε

where
Y = dependent variable (response)
X = independent variable (predictor or explanatory)
β0 = intercept (value of Y when X = 0)
β1 = slope of the regression line
ε = random error
12. Simple Linear Regression
True values for the slope and intercept are not known, so they are estimated using sample data.

Ŷ = b0 + b1X

where
Ŷ = predicted value of Y
b0 = estimate of β0, based on sample results
b1 = estimate of β1, based on sample results
13. Triple A Construction
Triple A Construction is trying to predict sales based on area payroll.
Y = Sales
X = Area payroll
The line chosen in Figure 4.1 is the one that minimizes the errors.

Error = (Actual value) – (Predicted value)
e = Y – Ŷ
14. Triple A Construction
For the simple linear regression model, the values of the intercept and slope can be calculated using the formulas below.

Ŷ = b0 + b1X

X̄ = ΣX / n = average (mean) of X values
Ȳ = ΣY / n = average (mean) of Y values

b1 = Σ(X – X̄)(Y – Ȳ) / Σ(X – X̄)²
b0 = Ȳ – b1X̄
16. Triple A Construction
Regression calculations:

X̄ = ΣX / n = 24 / 6 = 4
Ȳ = ΣY / n = 42 / 6 = 7

b1 = Σ(X – X̄)(Y – Ȳ) / Σ(X – X̄)² = 12.5 / 10 = 1.25
b0 = Ȳ – b1X̄ = 7 – (1.25)(4) = 2

Therefore Ŷ = 2 + 1.25X
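The calculations above can be reproduced with a short Python sketch (not part of the original slides; variable names are mine, and the data come from Table 4.1):

```python
# Least-squares estimates for the Triple A Construction data,
# using the slide formulas for b1 and b0.
payroll = [3, 4, 6, 4, 2, 5]      # X, in $100,000,000s
sales   = [6, 8, 9, 5, 4.5, 9.5]  # Y, in $100,000s

n = len(payroll)
x_bar = sum(payroll) / n          # X-bar = 24/6 = 4
y_bar = sum(sales) / n            # Y-bar = 42/6 = 7

# b1 = sum((X - X-bar)(Y - Y-bar)) / sum((X - X-bar)^2)
b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(payroll, sales)) \
     / sum((x - x_bar) ** 2 for x in payroll)
b0 = y_bar - b1 * x_bar           # b0 = 7 - 1.25(4) = 2

# Prediction for a $600 million payroll (X = 6), as on the next slide:
y_hat = b0 + b1 * 6               # 9.5, i.e. $950,000 in sales
```

Running this recovers b1 = 1.25 and b0 = 2, matching the hand calculation.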
17. Triple A Construction
In original units: sales = 2 + 1.25(payroll)

If the payroll next year is $600 million (X = 6):
Ŷ = 2 + 1.25(6) = 9.5, or $950,000 in sales
18. Measuring the Fit of the Regression Model
Regression models can be developed for any variables X and Y. How do we know the model is actually helpful in predicting Y based on X? We could just take the average error, but the positive and negative errors would cancel each other out.

Three measures of variability are:
- SST – Total variability about the mean.
- SSE – Variability about the regression line.
- SSR – Total variability that is explained by the model.
19. Measuring the Fit of the Regression Model
Sum of squares total:
SST = Σ(Y – Ȳ)²
Sum of the squared error:
SSE = Σe² = Σ(Y – Ŷ)²
Sum of squares due to regression:
SSR = Σ(Ŷ – Ȳ)²

SST = SSR + SSE
20. Measuring the Fit of the Regression Model
Sum of Squares for Triple A Construction (Table 4.3):

Y     X    (Y – Ȳ)²           Ŷ = 2 + 1.25X        (Y – Ŷ)²    (Ŷ – Ȳ)²
6     3    (6 – 7)² = 1       2 + 1.25(3) = 5.75   0.0625      1.563
8     4    (8 – 7)² = 1       2 + 1.25(4) = 7.00   1           0
9     6    (9 – 7)² = 4       2 + 1.25(6) = 9.50   0.25        6.25
5     4    (5 – 7)² = 4       2 + 1.25(4) = 7.00   4           0
4.5   2    (4.5 – 7)² = 6.25  2 + 1.25(2) = 4.50   0           6.25
9.5   5    (9.5 – 7)² = 6.25  2 + 1.25(5) = 8.25   1.5625      1.563

With Ȳ = 7:
Σ(Y – Ȳ)² = 22.5, so SST = 22.5
Σ(Y – Ŷ)² = 6.875, so SSE = 6.875
Σ(Ŷ – Ȳ)² = 15.625, so SSR = 15.625
21. Measuring the Fit of the Regression Model
For Triple A Construction:

Sum of squares total: SST = Σ(Y – Ȳ)² = 22.5
Sum of the squared error: SSE = Σe² = Σ(Y – Ŷ)² = 6.875
Sum of squares due to regression: SSR = Σ(Ŷ – Ȳ)² = 15.625

An important relationship: SST = SSR + SSE
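The three sums of squares in Table 4.3 can be checked directly (a Python sketch, not from the slides; data from Table 4.1 and the fitted line Ŷ = 2 + 1.25X):

```python
# Sums of squares for the Triple A Construction regression.
payroll = [3, 4, 6, 4, 2, 5]
sales   = [6, 8, 9, 5, 4.5, 9.5]
y_bar = sum(sales) / len(sales)           # 7
y_hat = [2 + 1.25 * x for x in payroll]   # fitted values from Y-hat = 2 + 1.25X

sst = sum((y - y_bar) ** 2 for y in sales)               # total variability about the mean
sse = sum((y - yh) ** 2 for y, yh in zip(sales, y_hat))  # variability about the regression line
ssr = sum((yh - y_bar) ** 2 for yh in y_hat)             # variability explained by the model
```

The results (SST = 22.5, SSE = 6.875, SSR = 15.625) match the table, and SST = SSR + SSE holds.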
22. Measuring the Fit of the Regression Model
Figure 4.2: Deviations from the Regression Line and from the Mean
23. Coefficient of Determination
The proportion of the variability in Y explained by the regression equation is called the coefficient of determination, r².

r² = SSR / SST = 1 – SSE / SST

r² = 15.625 / 22.5 = 0.6944

About 69% of the variability in Y is explained by the equation based on payroll (X).
24. Correlation Coefficient
The correlation coefficient, r, is an expression of the strength of the linear relationship. It will always be between +1 and –1.

r = √r²

For Triple A Construction:
r = √0.6944 = 0.8333
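Both quantities follow in one step from the sums of squares (a Python sketch, not from the slides):

```python
import math

# Coefficient of determination and correlation coefficient
# from the Triple A Construction sums of squares.
ssr, sse, sst = 15.625, 6.875, 22.5
r_squared = ssr / sst        # proportion of variability explained, ~0.6944
r = math.sqrt(r_squared)     # strength of the linear relationship, ~0.8333
```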
25. Four Values of the Correlation Coefficient
Figure 4.3 shows four scatter plots: (a) perfect positive correlation, r = +1; (b) positive correlation, 0 < r < 1; (c) no correlation, r = 0; (d) perfect negative correlation, r = –1.
26. Using Computer Software for Regression
Accessing the Regression Option in Excel 2010 (Program 4.1A)

28. Using Computer Software for Regression
Excel Output for the Triple A Construction Example (Program 4.1C)
29. Assumptions of the Regression Model
If we make certain assumptions about the errors in a regression model, we can perform statistical tests to determine if the model is useful.
1. Errors are independent.
2. Errors are normally distributed.
3. Errors have a mean of zero.
4. Errors have a constant variance.
A plot of the residuals (errors) will often highlight any glaring violations of the assumptions.
33. Estimating the Variance
Errors are assumed to have a constant variance (σ²), but we usually don't know this. It can be estimated using the mean squared error (MSE), s².

s² = MSE = SSE / (n – k – 1)

where
n = number of observations in the sample
k = number of independent variables
34. Estimating the Variance
For Triple A Construction:

s² = MSE = SSE / (n – k – 1) = 6.8750 / (6 – 1 – 1) = 6.8750 / 4 = 1.7188

We can estimate the standard deviation, s. This is also called the standard error of the estimate or the standard deviation of the regression.

s = √MSE = √1.7188 = 1.31
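The same arithmetic in a Python sketch (not from the slides; SSE from Table 4.3, with n = 6 observations and k = 1 independent variable):

```python
import math

# Estimating the error variance for Triple A Construction.
sse, n, k = 6.875, 6, 1
mse = sse / (n - k - 1)   # s^2 = 6.875 / 4 = 1.71875
s = math.sqrt(mse)        # standard error of the estimate, ~1.31
```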
35. Testing the Model for Significance
When the sample size is too small, you can get good values for MSE and r² even if there is no relationship between the variables. Testing the model for significance helps determine if the values are meaningful. We do this by performing a statistical hypothesis test.
36. Testing the Model for Significance
We start with the general linear model:

Y = β0 + β1X + ε

The null hypothesis is that there is no linear relationship between X and Y (β1 = 0). The alternate hypothesis is that there is a linear relationship (β1 ≠ 0). If the null hypothesis can be rejected, we have evidence of a linear relationship. We use the F statistic for this test.
37. Testing the Model for Significance
The F statistic is based on the MSE and the MSR:

MSR = SSR / k

where k = number of independent variables in the model.

The F statistic is:

F = MSR / MSE

This describes an F distribution with:
degrees of freedom for the numerator = df1 = k
degrees of freedom for the denominator = df2 = n – k – 1
38. Testing the Model for Significance
If there is very little error, the MSE would be small and the F statistic would be large, indicating the model is useful. If the F statistic is large, the significance level (p-value) will be low, indicating it is unlikely this would have occurred by chance. So when the F value is large, we can reject the null hypothesis and accept that there is a linear relationship between X and Y, and the values of the MSE and r² are meaningful.
39. Steps in a Hypothesis Test
1. Specify null and alternative hypotheses:
H0: β1 = 0
H1: β1 ≠ 0
2. Select the level of significance (α). Common values are 0.01 and 0.05.
3. Calculate the value of the test statistic using the formula:
F = MSR / MSE
40. Steps in a Hypothesis Test
4. Make a decision using one of the following methods:
a) Reject the null hypothesis if the test statistic is greater than the F value from the table in Appendix D. Otherwise, do not reject the null hypothesis:
Reject if Fcalculated > Fα,df1,df2
df1 = k
df2 = n – k – 1
b) Reject the null hypothesis if the observed significance level, or p-value, is less than the level of significance (α). Otherwise, do not reject the null hypothesis:
p-value = P(F > calculated test statistic)
Reject if p-value < α
41. Triple A Construction
Step 1.
H0: β1 = 0 (no linear relationship between X and Y)
H1: β1 ≠ 0 (linear relationship exists between X and Y)
Step 2. Select α = 0.05.
Step 3. Calculate the value of the test statistic:
MSR = SSR / k = 15.6250 / 1 = 15.6250
F = MSR / MSE = 15.6250 / 1.7188 = 9.09
42. Triple A Construction
Step 4. Reject the null hypothesis if the test statistic is greater than the F value in Appendix D.
df1 = k = 1
df2 = n – k – 1 = 6 – 1 – 1 = 4
The value of F associated with a 5% level of significance and with degrees of freedom 1 and 4 is found in Appendix D:
F0.05,1,4 = 7.71
Fcalculated = 9.09
Reject H0 because 9.09 > 7.71.
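The four steps condense to a few lines of Python (a sketch, not from the slides; the critical value 7.71 is the Appendix D table value for α = 0.05, df1 = 1, df2 = 4):

```python
# F test for the Triple A Construction model.
ssr, k = 15.625, 1
mse = 1.71875             # from the variance estimate (slide 34)

msr = ssr / k             # 15.625
f_stat = msr / mse        # ~9.09
f_critical = 7.71         # F(0.05, 1, 4), taken from the table in Appendix D
reject_h0 = f_stat > f_critical   # True: conclude a linear relationship exists
```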
43. Triple A Construction
We can conclude there is a statistically significant relationship between X and Y. The r² value of 0.69 means about 69% of the variability in sales (Y) is explained by local payroll (X).

Figure 4.5: the F distribution with the α = 0.05 rejection region beyond F = 7.71; the calculated statistic, 9.09, falls in the rejection region.
44. Analysis of Variance (ANOVA) Table
When software is used to develop a regression model, an ANOVA table is typically created that shows the observed significance level (p-value) for the calculated F value. This can be compared to the level of significance (α) to make a decision.

Table 4.4
            DF         SS     MS                      F          SIGNIFICANCE
Regression  k          SSR    MSR = SSR/k             MSR/MSE    P(F > MSR/MSE)
Residual    n – k – 1  SSE    MSE = SSE/(n – k – 1)
Total       n – 1      SST
45. ANOVA for Triple A Construction
From the Excel output (Program 4.1C, partial):
P(F > 9.0909) = 0.0394
Because this probability is less than 0.05, we reject the null hypothesis of no linear relationship and conclude there is a linear relationship between X and Y.
46. Multiple Regression Analysis
Multiple regression models are extensions to the simple linear model and allow the creation of models with more than one independent variable.

Y = β0 + β1X1 + β2X2 + … + βkXk + ε

where
Y = dependent variable (response variable)
Xi = ith independent variable (predictor or explanatory variable)
β0 = intercept (value of Y when all Xi = 0)
βi = coefficient of the ith independent variable
k = number of independent variables
47. Multiple Regression Analysis
To estimate these values, a sample is taken and the following equation is developed:

Ŷ = b0 + b1X1 + b2X2 + … + bkXk

where
Ŷ = predicted value of Y
b0 = sample intercept (and is an estimate of β0)
bi = sample coefficient of the ith variable (and is an estimate of βi)
48. Jenny Wilson Realty
Jenny Wilson wants to develop a model to determine the suggested listing price for houses based on the size and age of the house.

Ŷ = b0 + b1X1 + b2X2

where
Ŷ = predicted value of dependent variable (selling price)
b0 = Y intercept
X1 and X2 = value of the two independent variables (square footage and age) respectively
b1 and b2 = slopes for X1 and X2 respectively

She selects a sample of houses that have sold recently and records the data shown in Table 4.5.
49. Jenny Wilson Real Estate Data
Table 4.5

SELLING PRICE ($)   SQUARE FOOTAGE   AGE   CONDITION
95,000              1,926            30    Good
119,000             2,069            40    Excellent
124,800             1,720            30    Excellent
135,000             1,396            15    Good
142,000             1,706            32    Mint
145,000             1,847            38    Mint
159,000             1,950            27    Mint
165,000             2,323            30    Excellent
182,000             2,285            26    Mint
183,000             3,752            35    Good
200,000             2,300            18    Good
211,000             2,525            17    Good
215,000             3,800            40    Excellent
219,000             1,740            12    Mint
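The slides fit this model in Excel; as a sketch (not from the slides), the same least-squares fit can be run in Python with numpy, using the Table 4.5 data:

```python
import numpy as np

# Jenny Wilson Realty: regress selling price on square footage and age.
price = np.array([95000, 119000, 124800, 135000, 142000, 145000, 159000,
                  165000, 182000, 183000, 200000, 211000, 215000, 219000])
sqft  = np.array([1926, 2069, 1720, 1396, 1706, 1847, 1950,
                  2323, 2285, 3752, 2300, 2525, 3800, 1740])
age   = np.array([30, 40, 30, 15, 32, 38, 27, 30, 26, 35, 18, 17, 40, 12])

# Design matrix with an intercept column; lstsq gives the least-squares b's.
X = np.column_stack([np.ones(len(price)), sqft, age])
coef, *_ = np.linalg.lstsq(X, price, rcond=None)   # [b0, b1, b2]

y_hat = X @ coef
sst = ((price - price.mean()) ** 2).sum()
sse = ((price - y_hat) ** 2).sum()
r_squared = 1 - sse / sst   # ~0.6719, matching the value quoted on slide 54
```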
50. Jenny Wilson Realty
Input Screen for the Jenny Wilson Realty Multiple Regression Example (Program 4.2A)
52. Evaluating Multiple Regression Models
Evaluation is similar to simple linear regression models. The p-value for the F-test and r² are interpreted the same. The hypothesis is different because there is more than one independent variable. The F-test is investigating whether all the coefficients are equal to 0 at the same time.
53. Evaluating Multiple Regression Models
To determine which independent variables are significant, tests are performed for each variable:
H0: βi = 0
H1: βi ≠ 0
The test statistic is calculated, and if the p-value is lower than the level of significance (α), the null hypothesis is rejected.
54. Jenny Wilson Realty
The model is statistically significant: the p-value for the F-test is 0.002, and r² = 0.6719, so the model explains about 67% of the variation in selling price (Y). But the F-test is for the entire model, and we can't tell if one or both of the independent variables are significant. By calculating the p-value of each variable, we can assess the significance of the individual variables. Since the p-values for X1 (square footage) and X2 (age) are both less than the significance level of 0.05, both null hypotheses can be rejected.
55. Binary or Dummy Variables
Binary (or dummy or indicator) variables are special variables created for qualitative data. A dummy variable is assigned a value of 1 if a particular condition is met and a value of 0 otherwise. The number of dummy variables must equal one less than the number of categories of the qualitative variable.
56. Jenny Wilson Realty
Jenny believes a better model can be developed if she includes information about the condition of the property.
X3 = 1 if house is in excellent condition, 0 otherwise
X4 = 1 if house is in mint condition, 0 otherwise
Two dummy variables are used to describe the three categories of condition. No variable is needed for "good" condition, since if both X3 and X4 = 0, the house must be in good condition.
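The dummy coding described above is easy to build by hand (a Python sketch, not from the slides; the sample list is a shortened illustration, not the full Table 4.5 column):

```python
# Dummy (indicator) coding for the three-level "condition" factor.
# "Good" is the baseline: it needs no variable of its own, because
# X3 = 0 and X4 = 0 already identifies it.
conditions = ["Good", "Excellent", "Mint", "Good"]

x3 = [1 if c == "Excellent" else 0 for c in conditions]  # X3: excellent condition
x4 = [1 if c == "Mint" else 0 for c in conditions]       # X4: mint condition
```

Note the count: three categories, two dummy variables, exactly one fewer than the number of categories, as the previous slide requires.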
57. Jenny Wilson Realty
Input Screen for the Jenny Wilson Realty Example with Dummy Variables (Program 4.3A)
59. Model Building
The best model is a statistically significant model with a high r² and few variables. As more variables are added to the model, the r² value usually increases. For this reason, the adjusted r² value is often used to determine the usefulness of an additional variable. The adjusted r² takes into account the number of independent variables in the model.
60. Model Building
The formula for r²:

r² = SSR / SST = 1 – SSE / SST

The formula for adjusted r²:

Adjusted r² = 1 – [SSE / (n – k – 1)] / [SST / (n – 1)]

As the number of variables increases, the adjusted r² gets smaller unless the increase due to the new variable is large enough to offset the change in k.
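Since SSE/SST = 1 – r², the adjusted r² can be computed from r², n, and k alone; a Python sketch (not from the slides, the function name is mine), applied to the Jenny Wilson model from slide 54:

```python
def adjusted_r_squared(r_squared, n, k):
    """Adjusted r^2 = 1 - [SSE/(n-k-1)] / [SST/(n-1)], using SSE/SST = 1 - r^2."""
    return 1 - (1 - r_squared) * (n - 1) / (n - k - 1)

# Jenny Wilson Realty: r^2 = 0.6719 with n = 14 houses and k = 2 predictors.
adj = adjusted_r_squared(0.6719, n=14, k=2)   # ~0.612, slightly below r^2
```

The penalty for extra variables shows up directly: for fixed r², increasing k shrinks n – k – 1 and pulls the adjusted value down.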
61. Model Building
In general, if a new variable increases the adjusted r², it should probably be included in the model. In some cases, variables contain duplicate information. When two independent variables are correlated, they are said to be collinear. When more than two independent variables are correlated, multicollinearity exists. When multicollinearity is present, hypothesis tests for the individual coefficients are not valid, but the model may still be useful.
62. Nonlinear Regression
In some situations, variables are not linear. Transformations may be used to turn a nonlinear model into a linear model. (The slide contrasts two scatter plots: one showing a linear relationship and one showing a nonlinear relationship.)
63. Colonel Motors
Engineers at Colonel Motors want to use regression analysis to improve fuel efficiency. They have been asked to study the impact of weight on miles per gallon (MPG).

Table 4.6
MPG   WEIGHT (1,000 LBS.)   MPG   WEIGHT (1,000 LBS.)
12    4.58                  20    3.18
13    4.66                  23    2.68
15    4.02                  24    2.65
18    2.53                  33    1.70
19    3.09                  36    1.95
19    3.11                  42    1.92
65. Colonel Motors
Excel Output for Linear Regression Model with MPG Data (Program 4.4)
This is a useful model, with a small significance level (p-value) for the F-test and a good r² value.
67. Colonel Motors
The nonlinear model is a quadratic model. The easiest way to work with this model is to develop a new variable:

X2 = (weight)²

This gives us a model that can be solved with linear regression software:

Ŷ = b0 + b1X1 + b2X2
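Building the transformed variable is a one-line step (a Python sketch, not from the slides; weights from Table 4.6):

```python
# Quadratic transformation for the Colonel Motors data: create the new
# variable X2 = weight^2 so ordinary linear-regression software can fit
# the curve Y-hat = b0 + b1*X1 + b2*X2.
weights = [4.58, 4.66, 4.02, 2.53, 3.09, 3.11,
           3.18, 2.68, 2.65, 1.70, 1.95, 1.92]   # in 1,000 lbs

x1 = weights
x2 = [w ** 2 for w in weights]   # the new, squared predictor
# Each observation's predictors are now the pair (X1, X2); any
# multiple-regression routine can estimate b0, b1, b2 from them.
```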
68. Colonel Motors
Ŷ = 79.8 – 30.2X1 + 3.4X2 (Program 4.5)
This is a better model, with a smaller significance level (p-value) for the F-test and a larger adjusted r² value.
69. Cautions and Pitfalls
If the assumptions are not met, the statistical test may not be valid. Correlation does not necessarily mean causation. Multicollinearity makes interpreting coefficients problematic, but the model may still be good. Using a regression model beyond the range of X is questionable, as the relationship may not hold outside the sample data.
70. Cautions and Pitfalls
A t-test for the intercept (b0) may be ignored, as this point is often outside the range of the model. A linear relationship may not be the best relationship, even if the F-test returns an acceptable value. A nonlinear relationship can exist even if a linear relationship does not. Even though a relationship is statistically significant, it may not have any practical value.