Single Linear Regression 
Conceptual Explanation
• Welcome to this explanation of Single Linear 
Regression. 
• Single linear regression is an extension of 
correlation. 
Correlation extends to Single Linear Regression
• Correlation is designed to render a single coefficient that represents the degree of coherence between two variables.
• As one variable increases, the other increases: a coefficient of +.99 represents an almost perfect positive correlation, or relationship, between these two variables.
• As one variable decreases, the other increases: a coefficient of -.99 represents an almost perfect negative correlation, or relationship, between these two variables.
Ave Daily Temp: 50°, 60°, 70°, 80°, 90°
• Single linear regression uses that information to 
predict the value of one variable based on the 
given value of the other variable. 
• For example: If the following data set were real, what would you predict ice cream sales would be when the temperature reaches 100°?

Ave Daily Temp    Ave Daily Ice Cream Sales
100°              ?
90°               560
80°               480
70°               350
60°               320
50°               230
• Single linear regression uses that information to predict the value of one variable (ice cream sales) based on the given value of the other variable (temperature).

Ave Daily Temp    Ave Daily Ice Cream Sales
100°              630?
90°               560
80°               480
70°               350
60°               320
50°               230

• Rather than simply examining the relationship between the variables (as is the case with the Pearson Product Moment Correlation), one variable is used as the predictor (temperature) and the other as the outcome, or predicted variable (ice cream sales).
• Linear regression makes it possible to estimate a value like 630.
• In some cases, which variable is considered the predictor and which the outcome is arbitrary, as with measures of depression and anxiety:

Composite Depression Score    Composite Anxiety Score
33                            103
26                            100
22                            92
14                            74
12                            52
6                             26

• It’s not clear which influences which. Most likely depression and anxiety mutually influence one another.
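One way to see why the choice is arbitrary at the correlation stage is that the coefficient comes out the same whichever variable is listed first. The sketch below computes a Pearson coefficient by hand for the depression and anxiety scores above; the function name `pearson_r` is an illustrative choice, not something taken from the slides.

```python
from math import sqrt

# Scores from the depression/anxiety table above
depression = [33, 26, 22, 14, 12, 6]
anxiety = [103, 100, 92, 74, 52, 26]

def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

r_xy = pearson_r(depression, anxiety)  # depression listed first
r_yx = pearson_r(anxiety, depression)  # anxiety listed first
print(round(r_xy, 2), round(r_yx, 2))
```

Both calls give the same strongly positive coefficient. The regression line, by contrast, does depend on which variable is treated as the predictor.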
• In some cases, either by theory or by the nature of the research design, one variable will be rationally defined as the predictor and the other as the outcome.

Ave Daily Exposure to Sunlight    Levels of Vitamin E after two months
3.3 hrs                           10.3 units
2.6 hrs                           8.1 units
2.2 hrs                           7.3 units
1.4 hrs                           7.0 units
1.2 hrs                           6.8 units
0.6 hrs                           5.7 units

• In this example, exposure to sunlight may impact levels of Vitamin E. But levels of Vitamin E would not impact the amount of sunlight one gets.
• An easy way to conceptualize single linear regression is to create a scatterplot in Cartesian space. Let’s plot the following data set:

Composite Depression Score    Composite Anxiety Score
33                            103
26                            100
22                            92
14                            74
12                            52
6                             26
• First, we assign the predictor variable to the X axis, which in this case we’ll arbitrarily say is depression.
• ... and the outcome variable to the Y axis, which we’ll arbitrarily say is anxiety.
[Figure: empty axes titled "Relationship between Depression & Anxiety"; X axis: Depression (0 to 40), Y axis: Anxiety (0 to 120)]
• Now, let’s plot each point (Depression, Anxiety) in turn: (33, 103), (26, 100), (22, 92), (14, 74), (12, 52), (6, 26).
[Figure: scatterplot titled "Relationship between Depression & Anxiety" with the six points added one at a time; X axis: Depression (0 to 40), Y axis: Anxiety (0 to 120)]
• Visually, one can see in the plotted space whether there is a tendency for the variables to be related and in what direction they are related. In this case there is a strong tendency to relate, and the relationship is positive.
[Figure: scatterplot titled "Relationship between Depression & Anxiety"; X axis: Depression (0 to 40), Y axis: Anxiety (0 to 120)]
• With this data set the tendency for the variables to relate is strong and the direction is negative:

Depression    Anxiety
6             103
12            100
14            92
22            74
26            52
33            26

[Figure: scatterplot titled "Relationship between Depression & Anxiety", labeled "Strong and Negative"; X axis: Depression (0 to 40), Y axis: Anxiety (0 to 120)]
• When no relationship exists, the scatterplot tends to look like a big circle.

Depression    Anxiety
22            103
33            100
12            92
6             74
14            52
26            26

[Figure: scatterplot titled "Relationship between Depression & Anxiety"; X axis: Depression (0 to 40), Y axis: Anxiety (0 to 120)]
• With this data set the relationship is weak and positive:

Depression    Anxiety
22            103
6             100
33            92
26            74
14            52
12            26

[Figure: scatterplot titled "Relationship between Depression & Anxiety", labeled "Weak and Positive"; X axis: Depression (0 to 40), Y axis: Anxiety (0 to 120)]
• With this data set the relationship is weak and negative:

Depression    Anxiety
6             103
14            100
33            74
26            92
12            52
22            26

[Figure: scatterplot titled "Relationship between Depression & Anxiety", labeled "Weak and Negative"; X axis: Depression (0 to 40), Y axis: Anxiety (0 to 120)]
• You might have noticed that as the variables become more strongly related, either positively or negatively, the plot looks more like an oval tilted one way or the other.
[Figure: two scatterplots titled "Relationship between Depression & Anxiety" (X axis: Depression, Y axis: Anxiety), one labeled "Weak and Negative" and the other "Weak and Positive"]
• As mentioned before, Linear Regression is used to predict 
one variable (ice cream sales) from another related variable 
(temperature). 
• The stronger the relationship (e.g., +.99 or -.99) the more 
accurate the prediction. 
• The weaker the relationship (e.g., +.14 or -.03) the less 
accurate the prediction. 
• One of the ways to represent those relationships is of 
course with the coefficients (e.g., +.99, +.14, -.03, -.99). 
• Another way to represent it is by graphing the relationship.
• Recall that a line in Cartesian space is defined by its slope and its Y intercept (the value of Y when X equals 0).
[Y = intercept + (slope ∙ X)]
[Figure: a straight line plotted on X and Y axes that each run from 0 to 6]
• In this case the slope would be 1. You may remember that this value is derived by taking what is called the “rise” over the “run” (here, a rise of 1 over a run of 1).
• So the equation for this line would look like this:
$y = 0 + \frac{1}{1}x$
• The 0 is where the line crosses the Y axis.
• The 1/1 is the slope, which is the rise over the run.
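The rise-over-run idea can be sketched in a few lines. The two points used here, (0, 0) and (1, 1), are assumed to lie on the plotted line because its equation has an intercept of 0 and a slope of 1; the function names are illustrative only.

```python
def slope_between(x1, y1, x2, y2):
    """Slope of the line through two points: rise over run."""
    rise = y2 - y1
    run = x2 - x1
    return rise / run

def line_value(x, intercept, slope):
    """Y value on the line Y = intercept + (slope * X)."""
    return intercept + slope * x

m = slope_between(0, 0, 1, 1)  # rise of 1 over a run of 1
b = 0                          # the line crosses the Y axis at 0
print(m, line_value(4, b, m))
```

With a slope of 1 and an intercept of 0, every X value maps to the same Y value.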
• A line represents the functional relationship between variable X and variable Y; therefore, that line can be used to predict a Y value from any given X value.

Month    Ave Monthly Temperature    Ave Monthly Ice Cream Sales
Feb      50°                        239
Mar      60°                        320
Apr      70°                        400
May      80°                        480
Jun      90°                        560
• In this case the two variables (temperature and ice cream sales) have a nearly perfect linear relationship. This is rarely seen among variables such as these in the real world, but for illustrative purposes we have created one.
[Figure: scatterplot; X axis: Ave Monthly Temperature (0 to 120), Y axis: Average Monthly Ice Cream Sales (0 to 700)]
• Now let’s say we have data for the average temperature during the month of July, but we don’t have the data for the average ice cream sales for July.

Month    Ave Monthly Temperature    Ave Monthly Ice Cream Sales
Feb      50°                        239
Mar      60°                        320
Apr      70°                        400
May      80°                        480
Jun      90°                        560
Jul      100°                       ?

• Using single linear regression we can predict the average ice cream sales for July. Here is the formula we will use for the prediction:
$y = \text{y intercept} + \text{slope}(x)$
• There are many ways to write this equation. Here is one way:
$y = b + m(x)$
• Using this data set we can create a formula for a straight line that represents that relationship:

Month    Ave Monthly Temperature    Ave Monthly Ice Cream Sales
Feb      50°                        239
Mar      60°                        320
Apr      70°                        400
May      80°                        480
Jun      90°                        560

$\hat{y} = -162 + 8(x)$
[Figure: the five points and the fitted line; X axis: Ave Monthly Temperature (0 to 120), Y axis: Average Monthly Ice Cream Sales (0 to 700)]
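The slides do not show how the intercept of -162 and the slope of 8 were obtained. As a sketch, and assuming they come from an ordinary least-squares fit that was then rounded to whole numbers, the following reproduces them from the Feb to Jun data. The helper name `fit_line` is hypothetical.

```python
# Feb-Jun data from the table above
temps = [50, 60, 70, 80, 90]
sales = [239, 320, 400, 480, 560]

def fit_line(xs, ys):
    """Least-squares intercept and slope for ys predicted from xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

intercept, slope = fit_line(temps, sales)
# Rounding to whole numbers is an assumption about how the slide's values arose
print(round(intercept), round(slope))
```

Under that assumption the rounded values agree with the equation on the slide.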
• With this equation we can now plug in the average temperature for July (100°) and see what the predicted average ice cream sales would be:
$\hat{y} = -162 + 8(100)$
$\hat{y} = -162 + 800$
$\hat{y} = 638$

Month    Ave Monthly Temperature    Ave Monthly Ice Cream Sales
Feb      50°                        239
Mar      60°                        320
Apr      70°                        400
May      80°                        480
Jun      90°                        560
Jul      100°                       638

[Figure: the fitted line extended to the July prediction; X axis: Ave Monthly Temperature (0 to 120), Y axis: Average Monthly Ice Cream Sales (0 to 700)]
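The substitution above amounts to a one-line function. This minimal sketch uses the intercept and slope from the equation; the function name `predict_sales` is illustrative.

```python
def predict_sales(temperature, intercept=-162, slope=8):
    """Predicted average monthly ice cream sales for a given temperature."""
    return intercept + slope * temperature

july_sales = predict_sales(100)  # -162 + 8(100)
print(july_sales)  # 638
```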
• So, based on our single linear regression analysis, we would predict that the average monthly ice cream sales in July will be 638.
• This is a simple demonstration of how regression works.
• In reality, however, most variables will not correlate as perfectly as these did.
• Most will look like this:
[Figure: scatterplot of points scattered around a fitted straight line]
• This line is called the best-fitting line because it minimizes the overall distance (the sum of squared vertical distances) between the line and all of the points. You will notice again that we have a linear equation for that line:
$\hat{y} = -50.93 + 7.21(x)$
• This equation is calculated from the means and standard deviations of the two variables, together with their correlation. For the sake of brevity we will not go into this here.
• Of the infinite number of straight lines that could be drawn through a scatterplot, the one that best represents the functional relationship between X and Y is the line that results in the least cumulative squared error between the predicted values of Y and the observed values of Y for each given X.
[Figure: scatterplot with a fitted line. The line shows the predicted values of Y calculated from the equation $y = b + mx$; the dots represent the actual data.]
• We don’t have to actually plot the coordinates and lines. We 
can operate solely on the equations to generate predicted 
values and errors in prediction. In this way we can 
determine if temperature is a statistically significant 
predictor of ice cream sales.
• Here are the actual data that we plotted:

Month    (X) Ave Monthly Temp    (y) Actual Ave Monthly Ice Cream Sales
Jan      40                      300
Feb      50                      320
Mar      60                      370
Apr      70                      480
May      80                      560
Jun      90                      640
Jul      100                     720
Aug      90                      600
Sep      80                      400
Oct      60                      300
Nov      40                      200
Dec      20                      122

• We can now calculate the predicted Y using the equation:
$\hat{y} = -50.93 + 7.21(x)$
• This is the equation for the best-fitting line between these two variables.
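The slides skip the calculation of this equation. A common route, shown here only as a sketch, takes the slope as the correlation multiplied by the ratio of the standard deviations, and then the intercept from the two means. Applied to the twelve months of data above, it gives the same coefficients after rounding to two decimals. The helper name `pearson_r` is illustrative.

```python
from statistics import mean, stdev

# Monthly data from the table above (Jan-Dec)
temps = [40, 50, 60, 70, 80, 90, 100, 90, 80, 60, 40, 20]
sales = [300, 320, 370, 480, 560, 640, 720, 600, 400, 300, 200, 122]

def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

r = pearson_r(temps, sales)
slope = r * stdev(sales) / stdev(temps)        # m = r * (s_y / s_x)
intercept = mean(sales) - slope * mean(temps)  # b = mean(y) - m * mean(x)
print(f"y-hat = {intercept:.2f} + {slope:.2f}(x)")
```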
• We can now calculate the predicted Y for each month by entering that month's temperature (x) into the equation:
ŷ = -50.93 + 7.21(x)
| Month | (X) Ave Monthly Temp | (Y) Actual Ave Monthly Ice Cream Sales | Calculation | (ŷ) Predicted Ave Monthly Ice Cream Sales |
|---|---|---|---|---|
| Jan | 40 | 300 | -50.93 + 7.21(40) | 237.47 |
| Feb | 50 | 320 | -50.93 + 7.21(50) | 309.57 |
| Mar | 60 | 370 | -50.93 + 7.21(60) | 381.67 |
| Apr | 70 | 480 | -50.93 + 7.21(70) | 453.77 |
| May | 80 | 560 | -50.93 + 7.21(80) | 525.87 |
| Jun | 90 | 640 | -50.93 + 7.21(90) | 597.97 |
| Jul | 100 | 720 | -50.93 + 7.21(100) | 670.07 |
| Aug | 90 | 600 | -50.93 + 7.21(90) | 597.97 |
| Sep | 80 | 400 | -50.93 + 7.21(80) | 525.87 |
| Oct | 60 | 300 | -50.93 + 7.21(60) | 381.67 |
| Nov | 40 | 200 | -50.93 + 7.21(40) | 237.47 |
| Dec | 20 | 122 | -50.93 + 7.21(20) | 93.27 |
• With this information we can now determine if x (temperature) is a 
statistically significant predictor of “y” (ice cream sales).
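• As a rough sketch, the predicted values can be reproduced in a few lines of Python. The names `temps`, `INTERCEPT`, `SLOPE` and `predict_sales` are illustrative choices made here, not part of the original slides; the numbers come from the data table and the regression equation.

```python
# Average monthly temperatures (X), January through December, from the data table.
temps = [40, 50, 60, 70, 80, 90, 100, 90, 80, 60, 40, 20]

INTERCEPT = -50.93  # Y intercept of the regression equation
SLOPE = 7.21        # slope of the regression equation


def predict_sales(temp):
    """Return the predicted ice cream sales for one temperature value."""
    return INTERCEPT + SLOPE * temp


# Round to two decimals, matching the precision used on the slides.
predicted = [round(predict_sales(t), 2) for t in temps]
print(predicted)
```

Note that the value entered for x is always the temperature, never the sales figure.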
• To begin, we need to determine the total sum of squares, just as we would with analysis of variance.
• This is done by subtracting the average (mean) ice cream sales for the whole year from each actual “Y” (ice cream sales) value.
• The mean is calculated by adding up the values and dividing by how many there are.
• (300+320+370+480+560+640+720+600+400+300+200+122) / 12 = 5012 / 12 = 417.67, rounded down here to 417 average ice cream sales
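• A minimal sketch of the mean calculation, assuming the twelve monthly sales values from the data table (the variable names are illustrative). Computed without rounding, the mean comes to about 417.67; the slides carry it forward as 417.

```python
# Actual monthly ice cream sales (Y), January through December.
sales = [300, 320, 370, 480, 560, 640, 720, 600, 400, 300, 200, 122]

total = sum(sales)               # add up the values
mean_sales = total / len(sales)  # divide by how many there are

print(total, round(mean_sales, 2))
```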
• We then subtract the mean from each y value
| (Y) Actual Ave Monthly Ice Cream Sales | Mean Monthly Ice Cream Sales | Difference |
|---|---|---|
| 300 | 417 | -117 |
| 320 | 417 | -97 |
| 370 | 417 | -47 |
| 480 | 417 | 63 |
| 560 | 417 | 143 |
| 640 | 417 | 223 |
| 720 | 417 | 303 |
| 600 | 417 | 183 |
| 400 | 417 | -17 |
| 300 | 417 | -117 |
| 200 | 417 | -217 |
| 122 | 417 | -295 |
• Note: if we did not know the functional relationship between X and Y, our best prediction of any one month's Y value would be the mean of Y.
• Because we are calculating the total sum of squares, we square each difference and then add up the squares. (Averaging the squared differences, rather than only summing them, would give the variance of all of the scores.)
| Difference | Squared |
|---|---|
| -117 | 13689 |
| -97 | 9409 |
| -47 | 2209 |
| 63 | 3969 |
| 143 | 20449 |
| 223 | 49729 |
| 303 | 91809 |
| 183 | 33489 |
| -17 | 289 |
| -117 | 13689 |
| -217 | 47089 |
| -295 | 87025 |
| SUM | 372,844 |
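• The total sum of squares can be sketched in Python as follows. The helper name `total_sum_of_squares` and its `center` parameter are assumptions made for illustration. Using the rounded mean of 417 reproduces the 372,844 shown on the slides; the unrounded mean gives a value about 5 lower.

```python
# Actual monthly ice cream sales (Y), January through December.
sales = [300, 320, 370, 480, 560, 640, 720, 600, 400, 300, 200, 122]


def total_sum_of_squares(values, center):
    """Sum of squared differences between each value and `center`."""
    return sum((v - center) ** 2 for v in values)


# With the mean rounded down to 417, as on the slides.
sst_rounded_mean = total_sum_of_squares(sales, 417)

# With the unrounded mean (about 417.67).
exact_mean = sum(sales) / len(sales)
sst_exact_mean = total_sum_of_squares(sales, exact_mean)

print(sst_rounded_mean)
print(round(sst_exact_mean, 2))
```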
• Now we find regression (good) and residual (bad). For better predictive power we want the regression sum of squares to be large and the residual (error) sum of squares to be small.
• Let’s see whether the residual or the regression is greater.
• We know that the total sum of squares is 372,844.
• Now we will calculate the residual (error) and the regression sums of squares, which will add up to 372,844.
| | Sum of Squares | df | Mean Square | F-ratio | Significance |
|---|---|---|---|---|---|
| Regression | ? | | | | |
| Residual (error) | ? | | | | |
| Total | 372,844 | | | | |
• Before we calculate residual and regression, let’s see visually how we calculated the total sum of squares (372,844).
• Once again we subtract the mean of the actual Y values from each actual Y value.
• The first data set is the actual Y values. From each of them we subtract the mean (417), which would be our best prediction if we did not know the relationship between X (temperature) and Y (ice cream sales).
| (Y) Actual Ave Monthly Ice Cream Sales | Mean Monthly Ice Cream Sales |
|---|---|
| 300 | 417 |
| 320 | 417 |
| 370 | 417 |
| 480 | 417 |
| 560 | 417 |
| 640 | 417 |
| 720 | 417 |
| 600 | 417 |
| 400 | 417 |
| 300 | 417 |
| 200 | 417 |
| 122 | 417 |
[Chart: plot of the data; vertical axis 0 to 800, horizontal axis 0 to 120. The labels on the slide read "Midterm Exam" and "Final Exam".]
• Here is the graphic depiction of subtracting the mean (417) from each data point:
[Chart: the same plot, with the value 417 marked and individual subtractions called out]
122 - 417 = -295
200 - 417 = -217
720 - 417 = +303
• Now we have the difference between each actual value of Y (ice cream sales) and the mean of the values for Y (417):
| (Y) Actual Ave Monthly Ice Cream Sales | Mean Monthly Ice Cream Sales | Difference |
|---|---|---|
| 300 | 417 | -117 |
| 320 | 417 | -97 |
| 370 | 417 | -47 |
| 480 | 417 | 63 |
| 560 | 417 | 143 |
| 640 | 417 | 223 |
| 720 | 417 | 303 |
| 600 | 417 | 183 |
| 400 | 417 | -17 |
| 300 | 417 | -117 |
| 200 | 417 | -217 |
| 122 | 417 | -295 |
• As we showed previously, we have to square these differences, because otherwise they sum to zero.
| Difference | Squared |
|---|---|
| -117 | 13689 |
| -97 | 9409 |
| -47 | 2209 |
| 63 | 3969 |
| 143 | 20449 |
| 223 | 49729 |
| 303 | 91809 |
| 183 | 33489 |
| -17 | 289 |
| -117 | 13689 |
| -217 | 47089 |
| -295 | 87025 |
| SUM = 0 (with the unrounded mean) | SUM = 372,844 |
• We are doing all this once again to show a visual depiction of what the total sum of squares is:
| | Sum of Squares | df | Mean Square | F-ratio | Significance |
|---|---|---|---|---|---|
| Total | 372,844 | | | | |
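• A short sketch of why the differences must be squared, assuming the sales values from the data table (variable names are illustrative). Differences taken from the unrounded mean cancel out to zero; the differences listed on the slides use the mean rounded down to 417, so they add up to 8 rather than exactly 0.

```python
# Actual monthly ice cream sales (Y), January through December.
sales = [300, 320, 370, 480, 560, 640, 720, 600, 400, 300, 200, 122]

exact_mean = sum(sales) / len(sales)

# Differences from the unrounded mean: positives and negatives cancel.
diffs_exact = [y - exact_mean for y in sales]

# Differences from the rounded mean used on the slides.
diffs_rounded = [y - 417 for y in sales]

print(round(sum(diffs_exact), 6))
print(sum(diffs_rounded))

# Squaring removes the signs, so the squared differences do not cancel.
print(sum(d ** 2 for d in diffs_rounded))
```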
• Now that we’ve seen a visual depiction of how we calculated the total sum of squares, we compare the sum of squares associated with error (residual) with the sum of squares associated with regression.
| | Sum of Squares | df | Mean Square | F-ratio | Significance |
|---|---|---|---|---|---|
| Regression | | | | | |
| Residual | | | | | |
| Total | 372,844 | | | | |
• Let’s calculate the error or residual sum of squares now.
• The error or residual sum of squares is computed by subtracting each predicted Y value from the corresponding actual Y value.
• Here are the actual Y values:
[Chart: plot of the actual Y values, captioned "These are the actual Y values or average ice cream sales"; axis label: average ice cream sales]
(Y) Actual Ave Monthly Ice Cream Sales: 300, 320, 370, 480, 560, 640, 720, 600, 400, 300, 200, 122
• Here are the predicted values using the linear regression formula, with each month's temperature entered for x:
| (Y) Actual Ave Monthly Ice Cream Sales | Calculation | (ŷ) Predicted Ave Monthly Ice Cream Sales |
|---|---|---|
| 300 | -50.93 + 7.21(40) | 237.47 |
| 320 | -50.93 + 7.21(50) | 309.57 |
| 370 | -50.93 + 7.21(60) | 381.67 |
| 480 | -50.93 + 7.21(70) | 453.77 |
| 560 | -50.93 + 7.21(80) | 525.87 |
| 640 | -50.93 + 7.21(90) | 597.97 |
| 720 | -50.93 + 7.21(100) | 670.07 |
| 600 | -50.93 + 7.21(90) | 597.97 |
| 400 | -50.93 + 7.21(80) | 525.87 |
| 300 | -50.93 + 7.21(60) | 381.67 |
| 200 | -50.93 + 7.21(40) | 237.47 |
| 122 | -50.93 + 7.21(20) | 93.27 |
[Chart: the predicted values plotted on the same axes; axis label: average ice cream sales]
• From these points and the linear regression formula a line can be drawn
[Chart: the regression line drawn through the predicted values; axis label: average ice cream sales]
• The difference between each actual value (orange) and the predicted value (green line) is called the error or residual. The closer these two values are to each other, the smaller the error. The farther apart they are, the larger the error and the weaker the predictive power of the regression line.
[Chart: actual values (orange) and the regression line (green), with two of the gaps between them labeled "Difference"]
• Let’s subtract the predicted values (green line) from the actual values (orange):
| (Y) Actual Ave Monthly Ice Cream Sales | (ŷ) Predicted Ave Monthly Ice Cream Sales | Difference |
|---|---|---|
| 300 | 237.47 | 62.53 |
| 320 | 309.57 | 10.43 |
| 370 | 381.67 | -11.67 |
| 480 | 453.77 | 26.23 |
| 560 | 525.87 | 34.13 |
| 640 | 597.97 | 42.03 |
| 720 | 670.07 | 49.93 |
| 600 | 597.97 | 2.03 |
| 400 | 525.87 | -125.87 |
| 300 | 381.67 | -81.67 |
| 200 | 237.47 | -37.47 |
| 122 | 93.27 | 28.73 |
[Chart: actual values and the regression line, with two of the differences called out]
122 - 93.27 = +28.73
400 - 525.87 = -125.87
• And so on…
• We then square those differences (deviations) and sum them up
| Difference | Squared |
|---|---|
| 62.53 | 3910.00 |
| 10.43 | 108.78 |
| -11.67 | 136.19 |
| 26.23 | 688.01 |
| 34.13 | 1164.86 |
| 42.03 | 1766.52 |
| 49.93 | 2493.00 |
| 2.03 | 4.12 |
| -125.87 | 15843.26 |
| -81.67 | 6669.99 |
| -37.47 | 1404.00 |
| 28.73 | 825.41 |
| Sum | 35,014 |
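• The residual (error) sum of squares can be sketched in Python like this. The variable names are illustrative, and the coefficients are the rounded ones from the regression equation on the slides.

```python
# Temperatures (X) and actual sales (Y), January through December.
temps = [40, 50, 60, 70, 80, 90, 100, 90, 80, 60, 40, 20]
sales = [300, 320, 370, 480, 560, 640, 720, 600, 400, 300, 200, 122]

# Predicted sales from the regression equation y-hat = -50.93 + 7.21(x).
predicted = [-50.93 + 7.21 * x for x in temps]

# Residual = actual value minus predicted value.
residuals = [y - y_hat for y, y_hat in zip(sales, predicted)]

# Square each residual and add the squares together.
ss_residual = sum(r ** 2 for r in residuals)

print(round(ss_residual))
```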
• We will now calculate the regression sum of squares.
| | Sum of Squares | df | Mean Square | F-ratio | Significance |
|---|---|---|---|---|---|
| Regression | | | | | |
| Residual | 35,014 | | | | |
| Total | 372,844 | | | | |
• Our hope is that this value will be much bigger than the residual (35,014).
• The regression sum of squares is calculated by subtracting the mean from each predicted value, then squaring and summing those differences.
• Let’s see what this looks like visually. The green line is the predicted values for Y, or the regression line.
[Chart: the regression line (green) and a horizontal line at the mean (blue); axis label: average ice cream sales]
• The blue line is the mean (417), which is the best predictor absent anything else.
• You can probably already tell that it will be bigger, because a simple way to calculate it is to subtract the residual (35,014) from the total (372,844).
• However, we will calculate it the long way so you can see what is happening.
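• Written out, the shortcut is the following arithmetic. The subscripted symbols are labels chosen here for the three sums of squares in the table; the figures are the ones already shown on the slides.

```latex
\begin{align*}
SS_{\text{total}} &= SS_{\text{regression}} + SS_{\text{residual}} \\
SS_{\text{regression}} &= SS_{\text{total}} - SS_{\text{residual}}
  = 372{,}844 - 35{,}014 = 337{,}830
\end{align*}
```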
• We subtract the mean of the actual Y values from each predicted value
| (ŷ) Predicted Ave Monthly Ice Cream Sales | Mean Monthly Ice Cream Sales | Difference |
|---|---|---|
| 237.47 | 417.7 | -180.2 |
| 309.57 | 417.7 | -108.1 |
| 381.67 | 417.7 | -36.0 |
| 453.77 | 417.7 | 36.1 |
| 525.87 | 417.7 | 108.2 |
| 597.97 | 417.7 | 180.3 |
| 670.07 | 417.7 | 252.4 |
| 597.97 | 417.7 | 180.3 |
| 525.87 | 417.7 | 108.2 |
| 381.67 | 417.7 | -36.0 |
| 237.47 | 417.7 | -180.2 |
| 93.27 | 417.7 | -324.4 |
[Chart: the regression line and the mean line, with two of the differences called out; axis label: average ice cream sales]
93.27 - 417.7 ≈ -324.4
670.07 - 417.7 ≈ +252.4
• Then we square the differences (or deviations) and sum them up
| Difference | Squared |
|---|---|
| -180.2 | 32470.8 |
| -108.1 | 11684.9 |
| -36.0 | 1295.76 |
| 36.1 | 1303.45 |
| 108.2 | 11708 |
| 180.3 | 32509.3 |
| 252.4 | 63707.4 |
| 180.3 | 32509.3 |
| 108.2 | 11708 |
| -36.0 | 1295.76 |
| -180.2 | 32470.8 |
| -324.4 | 105233 |
| Sum | 337,830 |
(337,830 is the total, 372,844, minus the residual, 35,014. Adding the rounded squares in this column directly gives a slightly larger figure, about 337,896.)
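• The long-way calculation can be sketched in Python as below (variable names are illustrative). Because the slope and intercept in the equation are rounded, this direct sum comes out near 337,897 rather than matching the 337,830 obtained by subtracting the residual from the total; the gap is rounding, not a different method.

```python
# Temperatures (X) and actual sales (Y), January through December.
temps = [40, 50, 60, 70, 80, 90, 100, 90, 80, 60, 40, 20]
sales = [300, 320, 370, 480, 560, 640, 720, 600, 400, 300, 200, 122]

mean_sales = sum(sales) / len(sales)  # about 417.67

# Predicted sales from the rounded regression equation.
predicted = [-50.93 + 7.21 * x for x in temps]

# Regression sum of squares: squared gap between each prediction and the mean.
ss_regression = sum((y_hat - mean_sales) ** 2 for y_hat in predicted)

print(round(ss_regression, 1))
```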
• Now we have all of the information to test for significance
| | Sum of Squares | df | Mean Square | F-ratio | Significance |
|---|---|---|---|---|---|
| Regression | 337,830 | | | | |
| Residual | 35,014 | | | | |
| Total | 372,844 | | | | |
• The degrees of freedom (df) for the regression are the number of parameters being estimated (in this case the Y intercept and the slope) minus 1.
• 2 parameters - 1 = 1
• The degrees of freedom for the residual are the number of cases (12) minus the number of parameters (2).
• 12 months - 2 parameters (slope and Y intercept) = 10
| | Sum of Squares | df | Mean Square | F-ratio | Significance |
|---|---|---|---|---|---|
| Regression | 337,830 | 1 | | | |
| Residual | 35,014 | 10 | | | |
| Total | 372,844 | | | | |
• We now have the information we need to calculate 
the Mean Square values. They are calculated by 
dividing the sums of squares by the degrees of 
freedom. 
• Regression: 337,830 / 1 = 337,830 
• Residual: 35,014 / 10 = 3,501
• The F-ratio is computed by dividing the Regression 
Mean Square by the Residual Mean Square 
• 337,830 / 3,501 = 96.5 
                 Sum of Squares   df   Mean Square   F-ratio   Significance
    Regression   337,830          1    337,830       96.5
    Residual     35,014           10   3,501
    Total        372,844
• With this information we can turn to the F-distribution 
table to determine the significance value. 
• The regression degrees of freedom (1) select the column of the 
table. 
• The residual degrees of freedom (10) select the row of the 
table. 
• Put them together and we find the critical F value at the 
.05 alpha level to be 4.96.
• Because the F-ratio (96.5) exceeds the critical F value (4.96), 
we reject the null hypothesis and conclude that 
temperature is a statistically significant predictor of ice 
cream sales.
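The significance test walked through above can be sketched as plain arithmetic. The sums of squares and the critical value (4.96, read from an F-distribution table at the .05 level) are taken from the slides; the variable names are illustrative assumptions.

```python
# Sums of squares from the regression table in this presentation.
ss_regression = 337_830
ss_residual = 35_014

n_cases = 12       # twelve months of data
n_parameters = 2   # Y intercept and slope

# Degrees of freedom.
df_regression = n_parameters - 1        # 2 - 1 = 1
df_residual = n_cases - n_parameters    # 12 - 2 = 10

# Mean squares: each sum of squares divided by its degrees of freedom.
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual

# F-ratio: regression mean square over residual mean square.
f_ratio = ms_regression / ms_residual

# Critical F for (1, 10) degrees of freedom at alpha = .05, as quoted on the slide.
F_CRITICAL = 4.96
is_significant = f_ratio > F_CRITICAL

print(f"F({df_regression}, {df_residual}) = {f_ratio:.1f}")
print("Reject the null hypothesis" if is_significant else "Fail to reject the null hypothesis")
```

Comparing against a tabled critical value mirrors the slides; statistical software would normally report an exact p-value instead.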
In Summary 
• The whole point of this demonstration was to 
(1) explain that linear regression is used to predict the 
value of one variable (ice cream sales) based on another 
variable (temperature) 
(2) show that the total variation in Y (the total sum of squares) 
can be partitioned into regression (predictive power) and 
residual (error) 
(3) show how this partition can be used to test whether the 
prediction is better than chance.
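Point (2) can be checked numerically. The sketch below assumes the twelve monthly temperature and sales figures from this presentation and fits the line with the usual closed-form least-squares formulas, which the slides do not spell out; the names are illustrative. The totals differ by a few units from the slide's 337,830 and 372,844, which appear to rest on rounded coefficients.

```python
# Monthly data (Jan..Dec) from the presentation's data table.
temps = [40, 50, 60, 70, 80, 90, 100, 90, 80, 60, 40, 20]
sales = [300, 320, 370, 480, 560, 640, 720, 600, 400, 300, 200, 122]

n = len(temps)
mean_x = sum(temps) / n
mean_y = sum(sales) / n

# Closed-form least-squares slope and intercept (assumed method).
sxx = sum((x - mean_x) ** 2 for x in temps)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(temps, sales))
slope = sxy / sxx
intercept = mean_y - slope * mean_x

predicted = [intercept + slope * x for x in temps]

# Total = regression + residual.
ss_total = sum((y - mean_y) ** 2 for y in sales)
ss_regression = sum((p - mean_y) ** 2 for p in predicted)
ss_residual = sum((y - p) ** 2 for y, p in zip(sales, predicted))

print(f"Total:      {ss_total:,.0f}")
print(f"Regression: {ss_regression:,.0f}")
print(f"Residual:   {ss_residual:,.0f}")
```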

Single linear regression

  • 1.
    Single Linear Regression Conceptual Explanation
  • 4.
    • Welcome to this explanation of Single Linear Regression.
    • Single linear regression is an extension of correlation.
    Correlation extends to Single Linear Regression
  • 9.
    • Correlation is designed to render a single coefficient that represents the degree of coherence between two variables.
    As one variable increases, the other increases: +.99
    This coefficient represents an almost perfect positive correlation or relationship between these two variables.
  • 13.
    Ave Daily Temp: 50°  60°  70°  80°  90°
    As one variable decreases, the other increases: -.99
    Almost a perfect negative correlation or relationship between these two variables.
  • 20.
    • Single linear regression uses that information to predict the value of one variable (ice cream sales) based on the given value of the other variable (temperature).
  • 22.
    • For example: if the following data set were real, what would you predict ice cream sales would be when the temperature reaches 100°?
    Ave Daily Temp:              100°   90°   80°   70°   60°   50°
    Ave Daily Ice Cream Sales:   630?   560   480   350   320   230
    • Rather than simply examining the relationship between the variables (as is the case with the Pearson Product Moment Correlation), one variable is used as the predictor (temperature) and the other as the outcome, or predicted variable (ice cream sales).
    • Linear regression makes it possible to estimate a value like 630.
  • 26.
    • In some cases, which variable is considered the predictor and which the outcome is arbitrary.
    • Like measures of depression and anxiety:
    Composite Depression Score:   33    26    22   14   12   6
    Composite Anxiety Score:      103   100   92   74   52   26
    • It’s not clear which influences which. Most likely depression and anxiety mutually influence one another.
  • 30.
    • In other cases, either by theory or by the nature of the research design, one variable will be rationally defined as the predictor and the other as the outcome.
    Ave Daily Exposure to Sunlight (hrs):           3.3    2.6   2.2   1.4   1.2   0.6
    Levels of Vitamin E after two months (units):   10.3   8.1   7.3   7.0   6.8   5.7
    In this example, exposure to sunlight may impact levels of Vitamin E, but levels of Vitamin E would not impact the amount of sunlight one gets.
  • 33.
    • An easy way to conceptualize single linear regression is to create a scatterplot in Cartesian space. Let’s plot the following data set:
    Composite Depression Score:   33    26    22   14   12   6
    Composite Anxiety Score:      103   100   92   74   52   26
  • 37.
    • First, we assign the predictor variable to the X axis, which in this case we’ll arbitrarily say is depression.
    • ... and the outcome variable to the Y axis, which we’ll arbitrarily say is anxiety.
    [Figure: Relationship between Depression & Anxiety; X axis: Depression (0 to 40); Y axis: Anxiety (0 to 120)]
  • 51.
    • Now, let’s plot each point:
    Depression:   33    26    22   14   12   6
    Anxiety:      103   100   92   74   52   26
    Points plotted in turn: (33, 103), (26, 100), (22, 92), (14, 74), (12, 52), (6, 26)
    [Figure: Relationship between Depression & Anxiety; X axis: Depression (0 to 40); Y axis: Anxiety (0 to 120)]
  • 54.
    • Visually, one can see in the plotted space whether there is a tendency for the variables to be related and in what direction they are related.
    [Figure: Relationship between Depression & Anxiety; X axis: Depression (0 to 40); Y axis: Anxiety (0 to 120)]
    In this case there is a strong tendency to relate and the relationship is positive.
  • 58.
    • With this data set the tendency for the variables to relate is strong and the direction is negative:
    Depression:   6     12    14   22   26   33
    Anxiety:      103   100   92   74   52   26
    [Figure: Relationship between Depression & Anxiety; X axis: Depression (0 to 40); Y axis: Anxiety (0 to 120)]
    Strong and Negative
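The "strong and positive" and "strong and negative" readings of these two scatterplots can be backed with a number. The following is a minimal sketch of the Pearson correlation coefficient applied to the depression and anxiety scores listed on the slides; the function name `pearson_r` is an illustrative choice, not something defined in the presentation.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


anxiety = [103, 100, 92, 74, 52, 26]
depression_positive = [33, 26, 22, 14, 12, 6]   # first data set on the slides
depression_negative = [6, 12, 14, 22, 26, 33]   # same scores in reverse order

r_positive = pearson_r(depression_positive, anxiety)
r_negative = pearson_r(depression_negative, anxiety)

print(f"Positive example: r = {r_positive:+.2f}")
print(f"Negative example: r = {r_negative:+.2f}")
```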
  • 62.
    • When no relationship exists, the scatterplot tends to look like a big circle.
    Depression:   22    33    12   6    14   26
    Anxiety:      103   100   92   74   52   26
    [Figure: Relationship between Depression & Anxiety; X axis: Depression (0 to 40); Y axis: Anxiety (0 to 120)]
  • 65.
    Weak and Positive
    Depression:   22    6     33   26   14   12
    Anxiety:      103   100   92   74   52   26
    [Figure: Relationship between Depression & Anxiety; X axis: Depression (0 to 40); Y axis: Anxiety (0 to 120)]
  • 68.
    Weak and Negative
    Depression:   6     14    33   26   12   22
    Anxiety:      103   100   74   92   52   26
    [Figure: Relationship between Depression & Anxiety; X axis: Depression (0 to 40); Y axis: Anxiety (0 to 120)]
  • 71.
    • You might have noticed that when the variables are related, either positively or negatively, the plot looks more like an oval tilted one way or the other.
    [Figures: two scatterplots of Depression & Anxiety, one labeled Weak and Negative, the other Weak and Positive]
  • 77.
    • As mentioned before, linear regression is used to predict one variable (ice cream sales) from another related variable (temperature).
    • The stronger the relationship (e.g., +.99 or -.99), the more accurate the prediction.
    • The weaker the relationship (e.g., +.14 or -.03), the less accurate the prediction.
    • One of the ways to represent those relationships is of course with the coefficients (e.g., +.99, +.14, -.03, -.99).
    • Another way to represent them is by graphing the relationship.
  • 80.
    • Recall that a line in Cartesian space is defined by its slope and its Y intercept (the value of Y when X equals 0). [Y = intercept + (slope ∙ X)]
    [Figure: a straight line plotted with both axes running from 0 to 6]
  • 84.
    • In this case the slope would be 1. You may remember that this value is derived by taking what is called the “rise” over the “run” (here, a rise of 1 over a run of 1).
    • So the equation for this line would look like this: y = 0 + (1/1)x
  • 87.
    In y = 0 + (1/1)x, the 0 is where the line crosses the Y axis, and 1/1 is the slope, which is the rise over the run.
  • 89.
    • A line represents the functional relationship between variable X and variable Y; therefore, that line can be used to predict a Y value from any given X value.
                                    Feb   Mar   Apr   May   Jun
    Ave Monthly Temperature         50°   60°   70°   80°   90°
    Ave Monthly Ice Cream Sales     239   320   400   480   560
  • 91.
    • In this case the two variables (temperature and ice cream sales) have a perfect linear relationship. This is rarely seen among variables such as these in the real world, but for illustrative purposes we have created a perfect relationship.
    [Figure: X axis: Ave Monthly Temperature (0 to 120); Y axis: Average Monthly Ice Cream Sales (0 to 700)]
  • 97.
    • Now let’s say we have data for the average temperature during the month of July, but we don’t have the data for the average ice cream sales for July.
                                    Feb   Mar   Apr   May   Jun   Jul
    Ave Monthly Temperature         50°   60°   70°   80°   90°   100°
    Ave Monthly Ice Cream Sales     239   320   400   480   560   ?
    • Using single linear regression we can predict the average ice cream sales for July. Here is the formula we will use for the prediction: y = y intercept + slope(x)
    • There are many ways to write this equation. Here is one way: y = b + m(x)
  • 100.
    • Using this data set we can create a formula for a straight line that represents that relationship: y = -162 + 8(x)
                                    Feb   Mar   Apr   May   Jun
    Ave Monthly Temperature         50°   60°   70°   80°   90°
    Ave Monthly Ice Cream Sales     239   320   400   480   560
    [Figure: the line y = -162 + 8(x); X axis: Ave Monthly Temperature (0 to 120); Y axis: Average Monthly Ice Cream Sales (0 to 700)]
  • 107.
    • With this equation we can now plug in the average temperature for July (100°) and see what the predicted average ice cream sales would be:
    ŷ = -162 + 8(100)
    ŷ = -162 + 800
    ŷ = 638
                                    Feb   Mar   Apr   May   Jun   Jul
    Ave Monthly Temperature         50°   60°   70°   80°   90°   100°
    Ave Monthly Ice Cream Sales     239   320   400   480   560   638
  • 110.
    • So, based on our single linear regression analysis, we would predict that the average monthly ice cream sales in July will be 638.
    • This is a simple demonstration of how regression works.
    • In reality, however, most variables will not correlate as perfectly as these did:
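The July prediction is a direct substitution into the line's equation. As a minimal sketch, using the intercept (-162) and slope (8) quoted on the slides and a hypothetical helper named `predict_sales`:

```python
def predict_sales(temperature, intercept=-162, slope=8):
    """Predicted average monthly ice cream sales for a given average temperature."""
    return intercept + slope * temperature


july_prediction = predict_sales(100)
print(july_prediction)  # 638
```

Note that 100° lies outside the 50° to 90° range used to build the line, so this prediction is an extrapolation.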
  • 115.
    • Most will look like this:
    [Figure: scatterplot with the best fitting line, y = -50.93 + 7.21(x)]
    • This line is called the best fitting line because it minimizes the distance between the line and all of the points. You will notice again that we have a linear equation for that line: y = -50.93 + 7.21(x)
    • This equation is calculated by using the standard deviations and means of the two variables. For brevity’s sake we will not go into this here.
  • 116.
    • Of the infinite number of lines that could be fitted through a scatterplot, the one that best represents the functional relationship between X and Y is the line with the smallest cumulative squared error between the predicted values of Y and the observed values of Y at each given X.
      [Figure: scatterplot with the fitted line. The line is the predicted values of Y calculated from the equation ŷ = b + mx; the dots represent the actual data.]
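    • Written out, the least-squares idea described above can be sketched as follows. The index i, the sample size n, and the bars denoting means are notation assumed here rather than taken from the slides; b and m are the intercept and slope from ŷ = b + mx, and the closed forms shown are the standard minimizers of the squared error.

```latex
\begin{aligned}
\mathrm{SSE}(b, m) &= \sum_{i=1}^{n} \bigl(y_i - \hat{y}_i\bigr)^2
                    = \sum_{i=1}^{n} \bigl(y_i - (b + m x_i)\bigr)^2 \\
m &= \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2},
\qquad b = \bar{y} - m\,\bar{x}
\end{aligned}
```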
  • 120.
    • We don’t have to actually plot the coordinates and lines. We can operate solely on the equations to generate predicted values and errors in prediction. In this way we can determine whether temperature is a statistically significant predictor of ice cream sales.
  • 121.
    • So here are the actual data that we plotted:

      Month   (X) Ave Monthly Temp   (Y) Actual Ave Monthly Ice Cream Sales
      Jan     40                     300
      Feb     50                     320
      Mar     60                     370
      Apr     70                     480
      May     80                     560
      Jun     90                     640
      Jul     100                    720
      Aug     90                     600
      Sep     80                     400
      Oct     60                     300
      Nov     40                     200
      Dec     20                     122

    • We can now plot the predicted Y using the equation ŷ = -50.93 + 7.21(x), which is the equation for the best-fitting line between these two variables.
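    • As a rough check, the intercept and slope quoted on the slides can be recovered from the twelve monthly values. This is only a sketch: the function name fit_line and the variable names are made up for illustration, and it uses the standard deviation-sum form of the least-squares solution, which is one common way to do the calculation and not necessarily the one used to build the slides.

```python
from statistics import mean

# Monthly data from the slides (Jan..Dec).
temps = [40, 50, 60, 70, 80, 90, 100, 90, 80, 60, 40, 20]
sales = [300, 320, 370, 480, 560, 640, 720, 600, 400, 300, 200, 122]


def fit_line(x, y):
    """Return (intercept b, slope m) of the least-squares line y-hat = b + m*x."""
    x_bar, y_bar = mean(x), mean(y)
    # Sum of cross-products and sum of squared x deviations.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    m = sxy / sxx
    b = y_bar - m * x_bar
    return b, m


b, m = fit_line(temps, sales)
print(f"y-hat = {b:.2f} + {m:.2f}(x)")
```

      Rounded to two decimals, the fitted values agree with the -50.93 and 7.21 used throughout the slides.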
  • 127.
    • We can now calculate the predicted Y for each month by putting that month’s temperature (x) into the equation ŷ = -50.93 + 7.21(x):

      Month   (X) Temp   (Y) Actual Sales   ŷ = -50.93 + 7.21(x)    (ŷ) Predicted Sales
      Jan     40         300                -50.93 + 7.21(40)   =   237.47
      Feb     50         320                -50.93 + 7.21(50)   =   309.57
      Mar     60         370                -50.93 + 7.21(60)   =   381.67
      Apr     70         480                -50.93 + 7.21(70)   =   453.77
      May     80         560                -50.93 + 7.21(80)   =   525.87
      Jun     90         640                -50.93 + 7.21(90)   =   597.97
      Jul     100        720                -50.93 + 7.21(100)  =   670.07
      Aug     90         600                -50.93 + 7.21(90)   =   597.97
      Sep     80         400                -50.93 + 7.21(80)   =   525.87
      Oct     60         300                -50.93 + 7.21(60)   =   381.67
      Nov     40         200                -50.93 + 7.21(40)   =   237.47
      Dec     20         122                -50.93 + 7.21(20)   =   93.27

    • With this information we can now determine whether x (temperature) is a statistically significant predictor of y (ice cream sales).
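    • The predicted values can be generated straight from the equation, without plotting anything. A minimal sketch, assuming the rounded coefficients shown on the slides; the names predict, INTERCEPT and SLOPE are illustrative only.

```python
# Temperatures (x) for Jan..Dec, from the slides.
temps = [40, 50, 60, 70, 80, 90, 100, 90, 80, 60, 40, 20]

INTERCEPT = -50.93  # b, as rounded on the slides
SLOPE = 7.21        # m, as rounded on the slides


def predict(x):
    """Predicted sales for temperature x using y-hat = b + m*x."""
    return INTERCEPT + SLOPE * x


predicted = [round(predict(x), 2) for x in temps]
print(predicted)
```

      Note that the temperature, not the sales figure, goes inside the parentheses.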
  • 133.
    • To begin, we need to determine the total sum of squares, just as we would do with analysis of variance.
    • This is done by subtracting the mean ice cream sales for the whole year from each actual “Y” (ice cream sales) value.
    • The mean is calculated by adding up the values and dividing by how many there are.
    • (300+320+370+480+560+640+720+600+400+300+200+122) / 12 = 5012 / 12 ≈ 417.67 average ice cream sales (written as 417 in the calculations that follow)
  • 137.
    • We then subtract the mean from each actual y value:

      (Y) Actual Sales     Mean     Difference
      300               -  417   =  -117
      320               -  417   =  -97
      370               -  417   =  -47
      480               -  417   =  63
      560               -  417   =  143
      640               -  417   =  223
      720               -  417   =  303
      600               -  417   =  183
      400               -  417   =  -17
      300               -  417   =  -117
      200               -  417   =  -217
      122               -  417   =  -295

    • Note: if we did not know the functional relationship between X and Y, our best prediction of any one Y value would be the mean of Y.
  • 140.
    • Because we are calculating the total sum of squares, we square each difference and then sum the squared values. (Averaging the squared differences instead of summing them would give the variance of the scores.)

      (Y) Actual Sales     Mean     Difference   Squared
      300               -  417   =  -117         13,689
      320               -  417   =  -97          9,409
      370               -  417   =  -47          2,209
      480               -  417   =  63           3,969
      560               -  417   =  143          20,449
      640               -  417   =  223          49,729
      720               -  417   =  303          91,809
      600               -  417   =  183          33,489
      400               -  417   =  -17          289
      300               -  417   =  -117         13,689
      200               -  417   =  -217         47,089
      122               -  417   =  -295         87,025
                                          SUM =  372,844
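    • The subtract, square and sum steps above can be sketched in a few lines. The mean is deliberately set to 417 to match the rounding on the slides (the unrounded mean is 5012 / 12 ≈ 417.67); variable names are illustrative.

```python
sales = [300, 320, 370, 480, 560, 640, 720, 600, 400, 300, 200, 122]

# The slides round the mean of Y (5012 / 12 = 417.67) down to 417.
MEAN_ON_SLIDES = 417

deviations = [y - MEAN_ON_SLIDES for y in sales]  # actual minus mean
squared = [d ** 2 for d in deviations]            # squared deviations
ss_total = sum(squared)                           # total sum of squares

print(deviations)
print(squared)
print(ss_total)  # 372844
```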
  • 144.
    • Now we find the regression (good) and residual (bad) sums of squares. For better predictive power we want the regression sum of squares to be large and the residual (error) sum of squares to be small.
    • Let’s see whether the residual or the regression is greater.
    • We know that the total sum of squares is 372,844.
    • Now we will calculate the residual (error) and the regression sums of squares, which will add up to 372,844.

                         Sum of Squares   df   Mean Square   F-ratio   Significance
      Regression         ?
      Residual (error)   ?
      Total              372,844
  • 150.
    • Before we calculate residual and regression, let’s see visually how we calculated the total sum of squares (372,844).
    • Once again we subtract the mean of the actual Y values from each actual Y value.
    • The first data set is the actual Y values. We subtract the mean (417) from each of them; the mean would be our best prediction if we did not know the relationship between X (temperature) and Y (ice cream sales).

      (Y) Actual Ave Monthly Ice Cream Sales:      300, 320, 370, 480, 560, 640, 720, 600, 400, 300, 200, 122
      (ŷ) Predicted Ave Monthly Ice Cream Sales:   417 for every month (the mean)

      [Figure: scatterplot of the actual Y values; vertical axis 0 to 800, horizontal axis 0 to 120]
  • 159.
    • Here is the graphic depiction of subtracting the mean (417) from each data point:
      [Figure: scatterplot with a horizontal line at the mean (417); the vertical distance from each point to that line is its difference. Examples called out on the chart:]
      122 - 417 = -295
      200 - 417 = -217
      720 - 417 = +303
    • Now we have the difference between each actual value of Y (ice cream sales) and the mean of the values of Y (417):

      (Y) Actual Sales     Mean     Difference
      300               -  417   =  -117
      320               -  417   =  -97
      370               -  417   =  -47
      480               -  417   =  63
      560               -  417   =  143
      640               -  417   =  223
      720               -  417   =  303
      600               -  417   =  183
      400               -  417   =  -17
      300               -  417   =  -117
      200               -  417   =  -217
      122               -  417   =  -295
  • 169.
    • As we showed previously, we have to square these differences because, if we don’t, they will sum to zero.

      Difference   Squared
      -117         13,689
      -97          9,409
      -47          2,209
      63           3,969
      143          20,449
      223          49,729
      303          91,809
      183          33,489
      -17          289
      -117         13,689
      -217         47,089
      -295         87,025
      SUM ≈ 0      SUM = 372,844

      (The sum of the differences is exactly 0 when the unrounded mean, 417.67, is used; the rounded differences shown here sum to 8.)
    • We are doing all this once again to show a visual depiction of what the total sum of squares is:

                         Sum of Squares   df   Mean Square   F-ratio   Significance
      Total              372,844
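    • A short sketch of why the squaring step is needed: deviations from the mean cancel out when added, while their squares do not. The comparison between the exact mean and the rounded 417 is an illustration added here; variable names are illustrative.

```python
sales = [300, 320, 370, 480, 560, 640, 720, 600, 400, 300, 200, 122]

exact_mean = sum(sales) / len(sales)  # 5012 / 12, about 417.67
rounded_mean = 417                    # value used on the slides

# Raw deviations cancel out (exactly so only with the unrounded mean).
raw_sum_exact = sum(y - exact_mean for y in sales)
raw_sum_rounded = sum(y - rounded_mean for y in sales)

# Squared deviations cannot cancel, so their sum measures total spread.
squared_sum = sum((y - rounded_mean) ** 2 for y in sales)

print(raw_sum_exact)    # zero up to floating-point error
print(raw_sum_rounded)  # 8
print(squared_sum)      # 372844
```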
  • 173.
    • Now that we’ve seen a visual depiction of how we calculated the total sum of squares, we compare the sum of squares associated with error (residual) with the sum of squares associated with regression.

                         Sum of Squares   df   Mean Square   F-ratio   Significance
      Regression
      Residual
      Total              372,844

    • Let’s calculate the error or residual sum of squares now.
  • 176.
    • The error or residual sum of squares is computed by subtracting each predicted Y value from the corresponding actual Y value.
    • Here are the actual Y values (average ice cream sales):
      [Figure: scatterplot of the actual Y values, average ice cream sales; vertical axis 0 to 800, horizontal axis 0 to 120]
      (Y) Actual Ave Monthly Ice Cream Sales: 300, 320, 370, 480, 560, 640, 720, 600, 400, 300, 200, 122
  • 180.
    • Here are the predicted values using the linear regression formula ŷ = -50.93 + 7.21(x), where x is each month’s temperature:

      (Y) Actual Sales   ŷ = -50.93 + 7.21(x)    (ŷ) Predicted Sales
      300                -50.93 + 7.21(40)   =   237.47
      320                -50.93 + 7.21(50)   =   309.57
      370                -50.93 + 7.21(60)   =   381.67
      480                -50.93 + 7.21(70)   =   453.77
      560                -50.93 + 7.21(80)   =   525.87
      640                -50.93 + 7.21(90)   =   597.97
      720                -50.93 + 7.21(100)  =   670.07
      600                -50.93 + 7.21(90)   =   597.97
      400                -50.93 + 7.21(80)   =   525.87
      300                -50.93 + 7.21(60)   =   381.67
      200                -50.93 + 7.21(40)   =   237.47
      122                -50.93 + 7.21(20)   =   93.27

      [Figure: the predicted values added to the scatterplot of average ice cream sales]
  • 184.
    • From these points and the linear regression formula, a line can be drawn:
      [Figure: scatterplot of average ice cream sales with the regression line drawn through the predicted values]
  • 187.
    • The difference between each actual value (orange) and the predicted value (green line) is called the error or residual. The closer these two values are to each other, the smaller the error. The farther these two values are from each other, the larger the error and the weaker the predictive power of the regression line.
      [Figure: scatterplot with the regression line; the vertical gaps between points and the line are labeled “Difference”]
  • 189.
    • Let’s subtract the green line predicted values from the orange actual values:

      (Y) Actual Sales     (ŷ) Predicted Sales     Difference
      300               -  237.47               =  62.53
      320               -  309.57               =  10.43
      370               -  381.67               =  -11.67
      480               -  453.77               =  26.23
      560               -  525.87               =  34.13
      640               -  597.97               =  42.03
      720               -  670.07               =  49.93
      600               -  597.97               =  2.03
      400               -  525.87               =  -125.87
      300               -  381.67               =  -81.67
      200               -  237.47               =  -37.47
      122               -  93.27                =  28.73

      [Figure: scatterplot with the regression line. Examples called out on the chart: 122 - 93.27 = +28.73 and 400 - 525.87 = -125.87]
    • And so on…
  • 193.
    • We then square those differences (deviations) and sum them up:

      (Y) Actual Sales     (ŷ) Predicted Sales     Difference   Squared
      300               -  237.47               =  62.53        3910.00
      320               -  309.57               =  10.43        108.78
      370               -  381.67               =  -11.67       136.19
      480               -  453.77               =  26.23        688.01
      560               -  525.87               =  34.13        1164.86
      640               -  597.97               =  42.03        1766.52
      720               -  670.07               =  49.93        2493.00
      600               -  597.97               =  2.03         4.12
      400               -  525.87               =  -125.87      15843.26
      300               -  381.67               =  -81.67       6669.99
      200               -  237.47               =  -37.47       1404.00
      122               -  93.27                =  28.73        825.41
                                                        SUM =   35,014

                         Sum of Squares   df   Mean Square   F-ratio   Significance
      Regression
      Residual           35,014
      Total              372,844
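    • The residual calculation can be sketched the same way: predict, subtract the prediction from the actual value, square, and sum. This assumes the rounded coefficients from the slides; variable names are illustrative.

```python
temps = [40, 50, 60, 70, 80, 90, 100, 90, 80, 60, 40, 20]
sales = [300, 320, 370, 480, 560, 640, 720, 600, 400, 300, 200, 122]

INTERCEPT, SLOPE = -50.93, 7.21  # rounded coefficients from the slides

predicted = [INTERCEPT + SLOPE * x for x in temps]
# Residual (error) = actual value minus predicted value.
residuals = [y - y_hat for y, y_hat in zip(sales, predicted)]
ss_residual = sum(r ** 2 for r in residuals)

print([round(r, 2) for r in residuals])
print(round(ss_residual))  # 35014
```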
  • 197.
    • We will now calculate the regression sum of squares.

                         Sum of Squares   df   Mean Square   F-ratio   Significance
      Regression
      Residual           35,014
      Total              372,844

    • Our hope is that this value will be much bigger than the residual (35,014).
  • 199.
    • The regression sum of squares is calculated by subtracting the mean from each predicted value.
    • Let’s see what this looks like visually. The green line is the predicted values for Y, or the regression line. The blue line is the mean (417), which is the best predictor absent anything else.
      [Figure: scatterplot of average ice cream sales with the green regression line and the blue horizontal line at the mean (417)]
  • 204.
    • You can probably already tell that it will be bigger, because a simple way to calculate it is to subtract the residual (35,014) from the total (372,844): 372,844 - 35,014 = 337,830.
    • However, we will calculate it the long way so you can see what is happening.
      [Figure: scatterplot of average ice cream sales with the regression line and the mean line]
  • 206.
    • We subtract the mean of the actual Y values from each predicted value:

      (ŷ) Predicted Sales     Mean Sales     Difference
      237.47               -  417.7       =  -180.2
      309.57               -  417.7       =  -108.1
      381.67               -  417.7       =  -36.0
      453.77               -  417.7       =  36.1
      525.87               -  417.7       =  108.2
      597.97               -  417.7       =  180.3
      670.07               -  417.7       =  252.4
      597.97               -  417.7       =  180.3
      525.87               -  417.7       =  108.2
      381.67               -  417.7       =  -36.0
      237.47               -  417.7       =  -180.2
      93.27                -  417.7       =  -324.4

      [Figure: scatterplot with the regression line and the mean line. Examples called out on the chart: 93 - 417 = -324 and 670 - 417 = +252]
  • 210.
    • Then wesquare the differences (or deviations)
  • 211.
    • Then wesquare the differences (or deviations) (y) Actual Ave Monthly Ice Cream Sales 237.47 309.57 381.67 453.77 525.87 597.97 670.07 597.97 525.87 381.67 237.47 93.27 Mean Monthly Ice Cream Sales 417.7 417.7 417.7 417.7 417.7 417.7 417.7 417.7 417.7 417.7 417.7 417.7 Difference -180.2 -108.1 -36.0 36.1 108.2 180.3 252.4 180.3 108.2 -36.0 -180.2 -324.4 - - - - - - - - - - - - = = = = = = = = = = = = Squared 32470.8 11684.9 1295.76 1303.45 11708 32509.3 63707.4 32509.3 11708 1295.76 32470.8 105233
  • 212.
    • Then we sum the squared differences: Sum = 337,830
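The subtract, square, and sum steps above can be sketched in a few lines of Python. This is a minimal illustration and not part of the original slides: the names `predicted`, `mean_y`, and `ss_regression` are assumptions introduced here. The inputs are the values as displayed on the slides (limited precision), so the recomputed sum should land close to, but not exactly on, the slide's 337,830.

```python
# Predicted monthly ice cream sales, as displayed on the slide.
predicted = [237.47, 309.57, 381.67, 453.77, 525.87, 597.97,
             670.07, 597.97, 525.87, 381.67, 237.47, 93.27]

# Mean of the actual Y values, as displayed on the slide.
mean_y = 417.7

# Step 1: deviation of each predicted value from the mean.
deviations = [p - mean_y for p in predicted]

# Step 2: square each deviation.
squared = [d ** 2 for d in deviations]

# Step 3: sum the squares to get the regression sum of squares.
ss_regression = sum(squared)

print([round(d, 1) for d in deviations])
print(round(ss_regression))
```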
  • 213.
    • The resulting sum of squares is entered in the Regression row of the table:
                   Sum of Squares   df   Mean Square   F-ratio   Significance
      Regression   337,830
      Residual     35,014
      Total        372,844
  • 214.
    • Now we have all of the information to test for significance
  • 216.
    • The degrees of freedom (df) for the regression are the number of parameters being estimated (in this case the Y intercept and the slope) minus 1
  • 217.
    • In this case: 2 parameters - 1 = 1 regression degree of freedom
  • 218.
    • The degrees of freedom for the residual are the number of cases (12) minus the number of parameters (2)
  • 219.
    • 12 months - 2 parameters (slope and Y intercept) = 10
  • 220.
    • Both degrees of freedom go into the df column:
                   Sum of Squares   df   Mean Square   F-ratio   Significance
      Regression   337,830          1
      Residual     35,014           10
      Total        372,844
  • 221.
    • We now have the information we need to calculate the Mean Square values. They are calculated by dividing the sums of squares by the degrees of freedom.
  • 222.
    • Dividing each sum of squares by its degrees of freedom gives the Mean Square column:
                   Sum of Squares   df   Mean Square   F-ratio   Significance
      Regression   337,830          1    337,830
      Residual     35,014           10   3,501
      Total        372,844
  • 223.
    • The F-ratio is computed by dividing the Regression Mean Square by the Residual Mean Square
  • 224.
    • 337,830 / 3,501 = 96.5
  • 225.
    • The F-ratio completes the table except for the significance value:
                   Sum of Squares   df   Mean Square   F-ratio   Significance
      Regression   337,830          1    337,830       96.5
      Residual     35,014           10   3,501
      Total        372,844
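The degrees-of-freedom, mean-square, and F-ratio steps above can be collected into one small function. This is a sketch under stated assumptions: the function name `anova_table` and its dictionary keys are hypothetical names introduced here, and the inputs are the sums of squares shown in the table.

```python
def anova_table(ss_regression, ss_residual, n_cases, n_params=2):
    """Fill in df, mean squares, and the F-ratio for a regression ANOVA table.

    n_params counts the estimated parameters (slope and Y intercept = 2
    for single linear regression).
    """
    df_regression = n_params - 1        # parameters minus 1
    df_residual = n_cases - n_params    # cases minus parameters
    ms_regression = ss_regression / df_regression
    ms_residual = ss_residual / df_residual
    return {
        "df_regression": df_regression,
        "df_residual": df_residual,
        "ms_regression": ms_regression,
        "ms_residual": ms_residual,
        "f_ratio": ms_regression / ms_residual,
    }


# Sums of squares from the table; 12 monthly cases.
table = anova_table(337_830, 35_014, n_cases=12)
for name, value in table.items():
    print(f"{name}: {value:,.1f}")
```

Keeping `n_params` as an argument is only a convenience for reuse; for the single-predictor case in this demonstration it stays at 2.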
  • 226.
    • With this information we can turn to the F-distribution table to determine the significance value.
  • 229.
    • In the F-distribution table, the regression degrees of freedom (1) select the column.
  • 232.
    • The residual degrees of freedom (10) select the row.
  • 235.
    • Put them together and we find the critical F value at the .05 alpha level to be 4.96.
  • 237.
    • Because the F-ratio (96.5) exceeds the critical F value (4.96), we reject the null hypothesis and conclude that temperature is a statistically significant predictor of ice cream sales.
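The table lookup and the decision rule above can also be reproduced numerically. The sketch below is an assumption-laden illustration rather than the slides' method: it relies on the fact that, when the regression has 1 degree of freedom, an F(1, df) variable is the square of a t variable with df degrees of freedom, and it integrates the t density with Simpson's rule using only the standard library. The function names `t_pdf`, `f_tail_prob_df1`, and `f_critical_df1` are hypothetical.

```python
import math


def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + t * t / df) ** (-(df + 1) / 2)


def f_tail_prob_df1(f_value, df_residual, steps=2000):
    """P(F > f_value) for F(1, df_residual), using F = T**2.

    Integrates the t density on [0, sqrt(f_value)] with composite
    Simpson's rule (steps must be even).
    """
    t = math.sqrt(f_value)
    h = t / steps
    total = t_pdf(0.0, df_residual) + t_pdf(t, df_residual)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df_residual)
    central = 2 * total * h / 3      # P(|T| <= t)
    return 1 - central


def f_critical_df1(alpha, df_residual):
    """Critical F for F(1, df_residual) at level alpha, found by bisection."""
    lo, hi = 0.0, 1000.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if f_tail_prob_df1(mid, df_residual) > alpha:
            lo = mid                 # tail still too heavy: move right
        else:
            hi = mid
    return (lo + hi) / 2


f_ratio = 96.5
critical = f_critical_df1(0.05, df_residual=10)
p_value = f_tail_prob_df1(f_ratio, df_residual=10)

print(f"critical F = {critical:.2f}")
print(f"reject null hypothesis: {f_ratio > critical}")
print(f"approximate significance (p) = {p_value:.2g}")
```

In practice a statistics library would normally supply these values; the hand-rolled integration is used here only to keep the example free of external dependencies.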
  • 238.
  • 242.
    In Summary
    • The whole point of this demonstration was to:
      (1) explain that linear regression is used to predict the value of one variable (ice cream sales) based on another variable (temperature)
      (2) show that the total variance in Y can be partitioned into regression (prediction power) and residual (error)
      (3) show how this can be used to test whether the prediction is better than chance.
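Point (2) of the summary, the partition of the total into regression and residual parts, can be checked on any data set fitted by least squares. The sketch below is illustrative only: the function name `partition_sums_of_squares` is hypothetical, and the `temps` and `sales` numbers are made up for the example (they are not the data from this demonstration).

```python
import math


def partition_sums_of_squares(x, y):
    """Fit y = intercept + slope * x by least squares and return
    (ss_total, ss_regression, ss_residual)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    predicted = [intercept + slope * xi for xi in x]

    ss_total = sum((yi - mean_y) ** 2 for yi in y)              # actual vs mean
    ss_regression = sum((pi - mean_y) ** 2 for pi in predicted)  # predicted vs mean
    ss_residual = sum((yi - pi) ** 2 for yi, pi in zip(y, predicted))  # actual vs predicted
    return ss_total, ss_regression, ss_residual


# Made-up numbers for illustration only (not the slide data).
temps = [50, 60, 70, 80, 90]
sales = [200, 340, 390, 520, 610]

ss_total, ss_regression, ss_residual = partition_sums_of_squares(temps, sales)
print(ss_total, ss_regression + ss_residual)
print(math.isclose(ss_total, ss_regression + ss_residual))
```

The same identity is what makes the table in this demonstration add up: 337,830 + 35,014 = 372,844.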