Single Linear Regression 
Conceptual Explanation
• Welcome to this explanation of Single Linear 
Regression.
• Single linear regression is an extension of 
correlation.
Correlation extends to Single Linear Regression
• Correlation is designed to render a single 
coefficient that represents the degree of coherence 
between two variables
As one variable increases, the other increases: +.99
This coefficient represents an almost perfect positive 
correlation or relationship between these two variables.
Ave Daily Temp: 50°, 60°, 70°, 80°, 90°
As one variable decreases, the other increases: -.99
This is almost a perfect negative correlation or relationship 
between these two variables.
• Single linear regression uses that information to 
predict the value of one variable based on the 
given value of the other variable.
• For example: 
If the following data set were real, what would you 
predict ice cream sales would be when the 
temperature reaches 100°? 
Ave Daily Temp    Ave Daily Ice Cream Sales
100°              ?
90°               560
80°               480
70°               350
60°               320
50°               230
• Single linear regression uses that information to 
predict the value of one variable (ice cream) based 
on the given value of the other variable 
(temperature).
Ave Daily Temp    Ave Daily Ice Cream Sales
100°              630?
90°               560
80°               480
70°               350
60°               320
50°               230
• Rather than simply examining the relationship between 
the variables (as is the case with the Pearson Product 
Moment Correlation), one variable will be used as the 
predictor (temperature) and the other variable will be 
used as the outcome, or predicted variable (ice cream sales).
• Linear Regression makes it possible to estimate a value 
like 630
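A minimal sketch of how an estimate like 630 can be obtained, assuming an ordinary least-squares line is fitted to the five known (temperature, sales) pairs from the table above. The function name `fit_line` and the variable names are illustrative and do not come from the original slides.

```python
def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = co-variation of x and y divided by the variation in x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# The five known (temperature, sales) pairs from the table above.
temps = [50, 60, 70, 80, 90]
sales = [230, 320, 350, 480, 560]

intercept, slope = fit_line(temps, sales)
estimate_at_100 = intercept + slope * 100
print(f"estimated sales at 100 degrees: {estimate_at_100:.0f}")
```

On these five points the fitted line gives an estimate of about 634, which is in line with the rough value of 630 read off the table.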
• In some cases which variable is considered 
predictor or outcome is arbitrary.
• Like measures of depression and anxiety: 
Composite Depression Score    Composite Anxiety Score
33                            103
26                            100
22                            92
14                            74
12                            52
6                             26
• It’s not clear which influences which. Most likely 
depression and anxiety mutually influence one 
another.
• In some cases, either by theory or by the nature of 
the research design, one variable will be rationally 
defined as the predictor and the other as the 
outcome.
Ave Daily Exposure to Sunlight    Levels of Vitamin E after two months
3.3 hrs                           10.3 units
2.6 hrs                           8.1 units
2.2 hrs                           7.3 units
1.4 hrs                           7.0 units
1.2 hrs                           6.8 units
0.6 hrs                           5.7 units
In this example, exposure to sunlight may impact levels of 
Vitamin E. But levels of Vitamin E would not impact the 
amount of sunlight one gets.
• An easy way to conceptualize single linear 
regression is to create a scatterplot in Cartesian 
space.
Let’s plot the following data set: 
Composite Depression Score    Composite Anxiety Score
33                            103
26                            100
22                            92
14                            74
12                            52
6                             26
• First, we assign the predictor variable along the X 
axis, which in this case we’ll arbitrarily say is 
depression.
• ... and the outcome variable along the Y axis, which we’ll 
arbitrarily say is anxiety. 
[Figure: empty plot titled "Relationship between Depression & Anxiety". X axis: Depression, 0 to 40. Y axis: Anxiety, 0 to 120.]
• Now, let’s identify or plot each point or dot
Depression    Anxiety    Point
33            103        (33, 103)
26            100        (26, 100)
22            92         (22, 92)
14            74         (14, 74)
12            52         (12, 52)
6             26         (6, 26)
[Figure: scatterplot titled "Relationship between Depression & Anxiety" with the six points above added one at a time. X axis: Depression, 0 to 40. Y axis: Anxiety, 0 to 120.]
• Visually, one can see in the plotted space whether 
there is a tendency for the variables to be related 
and in what direction they are related.
[Figure: scatterplot titled "Relationship between Depression & Anxiety". X axis: Depression, 0 to 40. Y axis: Anxiety, 0 to 120.]
In this case there is a strong tendency to relate and the 
relationship is positive.
• With this data set the tendency for the variables to 
relate is strong and the direction is negative:
Depression    Anxiety
6             103
12            100
14            92
22            74
26            52
33            26
[Figure: scatterplot of the data above, titled "Relationship between Depression & Anxiety". X axis: Depression, 0 to 40. Y axis: Anxiety, 0 to 120.]
Strong and Negative
• When no relationship exists the scatter plot tends 
to look like a big circle.
Depression    Anxiety
22            103
33            100
12            92
6             74
14            52
26            26
[Figure: scatterplot of the data above, titled "Relationship between Depression & Anxiety". X axis: Depression, 0 to 40. Y axis: Anxiety, 0 to 120.]
Depression    Anxiety
22            103
6             100
33            92
26            74
14            52
12            26
[Figure: scatterplot of the data above, titled "Relationship between Depression & Anxiety".]
Weak and Positive
Depression    Anxiety
6             103
14            100
33            74
26            92
12            52
22            26
[Figure: scatterplot of the data above, titled "Relationship between Depression & Anxiety".]
Weak and Negative
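The labels attached to these scatterplots (strong, weak, positive, negative, none) can be checked numerically. The sketch below computes the Pearson correlation coefficient for each depression and anxiety pairing used in this deck. The helper `pearson_r` and the dictionary keys are illustrative names, and the label "no relationship" for the first pairing in this part of the deck is an assumption based on the bullet text.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# (depression scores, anxiety scores) for each pairing shown in the deck.
pairings = {
    "strong positive": ([33, 26, 22, 14, 12, 6], [103, 100, 92, 74, 52, 26]),
    "strong negative": ([6, 12, 14, 22, 26, 33], [103, 100, 92, 74, 52, 26]),
    "no relationship": ([22, 33, 12, 6, 14, 26], [103, 100, 92, 74, 52, 26]),
    "weak positive": ([22, 6, 33, 26, 14, 12], [103, 100, 92, 74, 52, 26]),
    "weak negative": ([6, 14, 33, 26, 12, 22], [103, 100, 74, 92, 52, 26]),
}

coefficients = {label: pearson_r(xs, ys) for label, (xs, ys) in pairings.items()}
for label, r in coefficients.items():
    print(f"{label}: r = {r:+.2f}")
```

The sign of each coefficient gives the direction of the tilt, and its distance from zero gives the strength of the tendency to relate.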
• You might have noticed that as the variables are related 
either positively or negatively, the plot looks more like an 
oval tilted one way or the other.
[Figure: two scatterplots titled "Relationship between Depression & Anxiety", one labeled "Weak and Negative" and the other "Weak and Positive". X axis: Depression, 0 to 40. Y axis: Anxiety, 0 to 120.]
• As mentioned before, Linear Regression is used to predict 
one variable (ice cream sales) from another related variable 
(temperature). 
• The stronger the relationship (e.g., +.99 or -.99) the more 
accurate the prediction. 
• The weaker the relationship (e.g., +.14 or -.03) the less 
accurate the prediction. 
• One of the ways to represent those relationships is of 
course with the coefficients (e.g., +.99, +.14, -.03, -.99). 
• Another way to represent it is by graphing the relationship.
• Recall that a line in Cartesian space is defined by its 
slope and its Y intercept (the value of Y when X 
equals 0).
Y = intercept + (slope ∙ X)
[Figure: a straight line through the origin, plotted on X and Y axes that each run from 0 to 6.]
• In this case the slope would be 1. You may 
remember that this value is derived by taking what 
is called the “rise” over the “run”.
[Figure: the same line on axes from 0 to 6, annotated with a rise of 1 and a run of 1.]
• So the equation for this line would look like this: 
$y = 0 + \frac{1}{1}x$
The 0 is the intercept: this is where the line crosses the Y axis. 
The $\frac{1}{1}$ is the slope, which is the rise over the run.
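The rise-over-run idea can be sketched in a few lines. The two points (0, 0) and (1, 1) are assumed from the figure description (a line through the origin with a rise of 1 and a run of 1), and the function names are illustrative.

```python
def slope_between(p1, p2):
    """Slope of the line through two points: rise divided by run."""
    (x1, y1), (x2, y2) = p1, p2
    rise = y2 - y1
    run = x2 - x1
    return rise / run


def line_value(intercept, slope, x):
    """Y value on the line Y = intercept + (slope * X)."""
    return intercept + slope * x


slope = slope_between((0, 0), (1, 1))  # rise 1 over run 1
intercept = 0                          # the line crosses the Y axis at 0
y_at_4 = line_value(intercept, slope, 4)
print(slope, y_at_4)
```

With a slope of 1 and an intercept of 0, every Y value on this line equals its X value.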
• A line represents the functional relationship 
between variable X and variable Y. Therefore, that 
line can be used to predict a Y value from any given 
X value.
Month    Ave Monthly Temperature    Ave Monthly Ice Cream Sales
Feb      50°                        239
Mar      60°                        320
Apr      70°                        400
May      80°                        480
Jun      90°                        560
• In this case the two variables (temperature and ice 
cream sales) have an almost perfectly linear relationship. This 
is rarely seen among variables such as these in 
the real world, but for illustrative purposes we have 
created a near-perfect relationship.
[Figure: scatterplot of Average Monthly Ice Cream Sales (Y axis, 0 to 700) against Ave Monthly Temperature (X axis, 0 to 120).]
• Now let’s say we have data for the average temperature 
during the month of July, but we don’t have the data for 
the average ice cream sales for July. 
Month    Ave Monthly Temperature    Ave Monthly Ice Cream Sales
Feb      50°                        239
Mar      60°                        320
Apr      70°                        400
May      80°                        480
Jun      90°                        560
Jul      100°                       ?
• Using single linear regression we can predict the average ice 
cream sales for July. Here is the formula we will use for the 
prediction: 
$y = \text{y-intercept} + \text{slope}(x)$
• There are many ways to write this equation. Here is one 
way, where $b$ is the y intercept and $m$ is the slope: 
$y = b + m(x)$
• Using this data set we can create a formula for a straight line 
that represents that relationship:
[Figure: the Feb to Jun data plotted with a straight line through the points. X axis: Ave Monthly Temperature, 0 to 120. Y axis: Average Monthly Ice Cream Sales, 0 to 700.]
$\hat{y} = -162 + 8(x)$
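The intercept of -162 and slope of 8 can be reproduced from the Feb to Jun data. This is a sketch that assumes the line was obtained by ordinary least squares, which the slides do not state explicitly at this point; the variable names are illustrative.

```python
# Feb to Jun data from the table above.
temps = [50, 60, 70, 80, 90]
sales = [239, 320, 400, 480, 560]

n = len(temps)
mean_temp = sum(temps) / n
mean_sales = sum(sales) / n

# Least-squares slope and intercept.
sxy = sum((x - mean_temp) * (y - mean_sales) for x, y in zip(temps, sales))
sxx = sum((x - mean_temp) ** 2 for x in temps)
slope = sxy / sxx
intercept = mean_sales - slope * mean_temp

print(f"y-hat = {intercept:.1f} + {slope:.2f}(x)")
```

Under that assumption the unrounded coefficients are about -161.6 and 8.02, which round to the -162 and 8 used in the equation above.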
• With this equation we can now plug in the average 
temperature for July (100°) and see what the predicted 
average ice cream sales would be: 
$\hat{y} = -162 + 8(100)$
$\hat{y} = -162 + 800$
$\hat{y} = 638$
Month    Ave Monthly Temperature    Ave Monthly Ice Cream Sales
Feb      50°                        239
Mar      60°                        320
Apr      70°                        400
May      80°                        480
Jun      90°                        560
Jul      100°                       638
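The same substitution can be written as a small function. This is only a sketch of the arithmetic above; `predict_sales` is an illustrative name, and the default intercept and slope are the rounded values from the slide's equation.

```python
def predict_sales(temperature, intercept=-162, slope=8):
    """Predicted average monthly ice cream sales for a given temperature."""
    return intercept + slope * temperature


july_prediction = predict_sales(100)
print(july_prediction)
```

Calling `predict_sales(50)` gives 238, one unit away from the observed February value of 239, which shows how closely the line tracks the known months.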
• So, based on our single linear regression analysis, we would 
predict that the average monthly ice cream sales in July 
will be 638. 
• This is a simple demonstration of how regression works. 
• In reality, however, most variables will not correlate as 
perfectly as these did:
• Most will look like this:
[Figure: scatterplot in which the points are scattered around a straight line.]
• This line is called the best fitting line because it minimizes 
the total squared vertical distance between the line and the points. You will 
notice again that we have a linear equation for that line: 
$\hat{y} = -50.93 + 7.21(x)$
• This equation is calculated by using the standard 
deviations and means of the two variables, together with their 
correlation. For brevity’s sake we will not go into this here.
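For readers who want to see that calculation, here is a hedged sketch using the twelve monthly values from the data table that appears later in this deck. It relies on the standard identities slope = r × (SD of y / SD of x) and intercept = mean of y − slope × mean of x, which the slides do not spell out. `pearson_r` is an illustrative helper name.

```python
from math import sqrt
from statistics import mean, stdev

# Twelve monthly (temperature, sales) values, Jan through Dec.
temps = [40, 50, 60, 70, 80, 90, 100, 90, 80, 60, 40, 20]
sales = [300, 320, 370, 480, 560, 640, 720, 600, 400, 300, 200, 122]


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists."""
    mean_x, mean_y = mean(xs), mean(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


r = pearson_r(temps, sales)
slope = r * stdev(sales) / stdev(temps)
intercept = mean(sales) - slope * mean(temps)

print(f"y-hat = {intercept:.2f} + {slope:.2f}(x)")
```

Rounded to two decimals, the result matches the equation shown on the slide.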
• Of the infinite number of lines that could be fitted through a 
scatterplot, the one that best represents the functional 
relationship between X and Y is the line that results in the 
least cumulative squared error between the predicted values 
of Y and the observed values of Y for each given X.
[Figure: scatterplot with a fitted line. The line is the predicted values of Y calculated from the equation $\hat{y} = b + mx$. The dots represent the actual data.]
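The least-squared-error idea can be illustrated by comparing the fitted line with a few other candidate lines. This sketch uses the twelve monthly values from the data table that follows and the rounded coefficients from the slide. The alternative lines are arbitrary choices made for illustration only.

```python
# Twelve monthly (temperature, sales) values, Jan through Dec.
temps = [40, 50, 60, 70, 80, 90, 100, 90, 80, 60, 40, 20]
sales = [300, 320, 370, 480, 560, 640, 720, 600, 400, 300, 200, 122]


def sum_squared_error(intercept, slope, xs, ys):
    """Cumulative squared error between observed and predicted Y values."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


# Error of the best fitting line reported on the slide.
best_sse = sum_squared_error(-50.93, 7.21, temps, sales)

# Arbitrary alternative lines (illustrative only).
alternatives = {
    "steeper slope": sum_squared_error(-50.93, 8.0, temps, sales),
    "flatter slope": sum_squared_error(-50.93, 6.5, temps, sales),
    "higher intercept": sum_squared_error(0.0, 7.21, temps, sales),
    "lower intercept": sum_squared_error(-100.0, 7.21, temps, sales),
}

print(f"best fitting line: {best_sse:.2f}")
for label, sse in alternatives.items():
    print(f"{label}: {sse:.2f}")
```

Each alternative line produces a larger cumulative squared error than the best fitting line.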
• We don’t have to actually plot the coordinates and lines. We 
can operate solely on the equations to generate predicted 
values and errors in prediction. In this way we can 
determine if temperature is a statistically significant 
predictor of ice cream sales.
• Here are the actual data behind the plot: 
Month    (X) Ave Monthly Temp    (y) Actual Ave Monthly Ice Cream Sales
Jan      40                      300
Feb      50                      320
Mar      60                      370
Apr      70                      480
May      80                      560
Jun      90                      640
Jul      100                     720
Aug      90                      600
Sep      80                      400
Oct      60                      300
Nov      40                      200
Dec      20                      122
• We can now calculate the predicted Y using the 
equation: 
$\hat{y} = -50.93 + 7.21(x)$
• This is the equation for the best fitting line 
between these two variables.
• Substituting each month’s temperature (x) into the equation gives the predicted Y (ŷ):
Month   (X) Ave Monthly Temp   (y) Actual Sales   ŷ = -50.93 + 7.21(x)    (ŷ) Predicted Sales
Jan     40                     300                -50.93 + 7.21(40)  =    237.47
Feb     50                     320                -50.93 + 7.21(50)  =    309.57
Mar     60                     370                -50.93 + 7.21(60)  =    381.67
Apr     70                     480                -50.93 + 7.21(70)  =    453.77
May     80                     560                -50.93 + 7.21(80)  =    525.87
Jun     90                     640                -50.93 + 7.21(90)  =    597.97
Jul     100                    720                -50.93 + 7.21(100) =    670.07
Aug     90                     600                -50.93 + 7.21(90)  =    597.97
Sep     80                     400                -50.93 + 7.21(80)  =    525.87
Oct     60                     300                -50.93 + 7.21(60)  =    381.67
Nov     40                     200                -50.93 + 7.21(40)  =    237.47
Dec     20                     122                -50.93 + 7.21(20)  =    93.27
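The predicted values can be reproduced by substituting each month’s temperature into the equation. The following is a minimal Python sketch, assuming the intercept and slope quoted in the slides; the names `temps`, `predict`, and `predicted` are illustrative and not part of the original deck.

```python
# Average monthly temperature (x), Jan through Dec, as listed in the data table.
temps = [40, 50, 60, 70, 80, 90, 100, 90, 80, 60, 40, 20]

# Coefficients of the fitted line quoted in the slides.
INTERCEPT = -50.93
SLOPE = 7.21


def predict(x):
    """Return the predicted ice cream sales for temperature x."""
    return INTERCEPT + SLOPE * x


# Round to two decimals to match the precision shown on the slides.
predicted = [round(predict(x), 2) for x in temps]
print(predicted)
```

Note that the value placed inside the parentheses is the temperature (x), not the actual sales figure (y).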
• With this information we can now determine if x (temperature) is a 
statistically significant predictor of “y” (ice cream sales).
• To begin we need to determine the total sum of squares, just as we would with analysis of variance.
• This is done by subtracting the mean ice cream sales for the whole year from each actual Y (ice cream sales) value.
• The mean is calculated by adding up the values and dividing the sum by how many values there are.
• (300+320+370+480+560+640+720+600+400+300+200+122) / 12 = 5,012 / 12 ≈ 417.67, which the following slides round to 417 average ice cream sales
• We then subtract the mean from each y value:
(y) Actual Ave Monthly Ice Cream Sales   Mean   Difference
300   -   417   =   -117
320   -   417   =   -97
370   -   417   =   -47
480   -   417   =   63
560   -   417   =   143
640   -   417   =   223
720   -   417   =   303
600   -   417   =   183
400   -   417   =   -17
300   -   417   =   -117
200   -   417   =   -217
122   -   417   =   -295
• Note: if we did not know the functional relationship between X and Y, our best prediction of any single Y value would be the mean of Y.
• Because we are calculating the total sum of squares, we square each difference and then sum the squares. (Dividing that sum by the number of scores would give the variance of the scores.)
(y) Actual Ave Monthly Ice Cream Sales   Mean   Difference   Squared
300   -   417   =   -117     13,689
320   -   417   =   -97      9,409
370   -   417   =   -47      2,209
480   -   417   =   63       3,969
560   -   417   =   143      20,449
640   -   417   =   223      49,729
720   -   417   =   303      91,809
600   -   417   =   183      33,489
400   -   417   =   -17      289
300   -   417   =   -117     13,689
200   -   417   =   -217     47,089
122   -   417   =   -295     87,025
SUM of squares = 372,844
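The total sum of squares can be checked with a few lines of code. This is a minimal Python sketch that, like the slides, assumes the mean is rounded to 417; the variable names are illustrative.

```python
# Actual average monthly ice cream sales (y), Jan through Dec.
sales = [300, 320, 370, 480, 560, 640, 720, 600, 400, 300, 200, 122]

# Unrounded mean: 5012 / 12, about 417.67.
mean_exact = sum(sales) / len(sales)

# The slides round the mean down to 417 before subtracting (an assumption kept here).
MEAN_ROUNDED = 417

# Difference between each actual value and the mean, then its square.
deviations = [y - MEAN_ROUNDED for y in sales]
squares = [d * d for d in deviations]

# Total sum of squares: the sum of the squared differences.
ss_total = sum(squares)

print(round(mean_exact, 2))
print(deviations)
print(ss_total)
```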
• Now we find the regression (good) and residual (bad) sums of squares. For better predictive power we want the regression sum of squares to be large and the residual (error) sum of squares to be small.
• Let’s see whether the residual or the regression is greater.
• We know that the total sum of squares is 372,844.
• Now we will calculate the residual (error) and regression sums of squares, which will add up to 372,844.
                   Sum of Squares   df   Mean Square   F-ratio   Significance
Regression         ?
Residual (error)   ?
Total              372,844
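In symbols, the statement that the regression and residual sums of squares add up to the total can be written as below. The notation is an assumption introduced here, not taken from the slides: y_i is an actual monthly value, ŷ_i is its predicted value, and ȳ is the mean of the actual values. The identity holds exactly for a least-squares line computed with unrounded coefficients and an unrounded mean, so hand calculations with rounded numbers may be off by a small amount.

```latex
\underbrace{\sum_{i=1}^{12} (y_i - \bar{y})^2}_{\text{Total}}
  = \underbrace{\sum_{i=1}^{12} (\hat{y}_i - \bar{y})^2}_{\text{Regression}}
  + \underbrace{\sum_{i=1}^{12} (y_i - \hat{y}_i)^2}_{\text{Residual (error)}}
```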
• Before we calculate residual and regression, let’s see visually how we calculated the total sum of squares (372,844).
• Once again we subtract the mean of the actual Y values from each actual Y value.
• The first data set is the actual Y values. We subtract the mean (417) from them; the mean would be our best prediction if we did not know the relationship between X (temperature) and Y (ice cream sales).
[Figure: scatter plot of average ice cream sales; y-axis 0 to 800, x-axis 0 to 120]
• Here is the graphic depiction of subtracting the mean (417) from each data point:
[Figure: scatter plot of average ice cream sales with the mean marked at 417; y-axis 0 to 800, x-axis 0 to 120]
122 - 417 = -295
200 - 417 = -217
720 - 417 = +303
• Now we have the difference between each actual value of Y (ice cream sales) and the mean of the values of Y (417):
(y) Actual Ave Monthly Ice Cream Sales   Mean   Difference
300   -   417   =   -117
320   -   417   =   -97
370   -   417   =   -47
480   -   417   =   63
560   -   417   =   143
640   -   417   =   223
720   -   417   =   303
600   -   417   =   183
400   -   417   =   -17
300   -   417   =   -117
200   -   417   =   -217
122   -   417   =   -295
• As we showed previously, we have to square these differences, because otherwise the positive and negative differences cancel when we sum them and the total comes to zero.
Difference   Squared
-117         13,689
-97          9,409
-47          2,209
63           3,969
143          20,449
223          49,729
303          91,809
183          33,489
-17          289
-117         13,689
-217         47,089
-295         87,025
SUM of differences ≈ 0 (exactly 0 when the unrounded mean, 417.67, is used)
SUM of squares = 372,844
• We are doing all this once again to show a visual depiction of what the total sum of squares is:
                   Sum of Squares   df   Mean Square   F-ratio   Significance
Total              372,844
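The reason for squaring can be seen directly: deviations from the true mean always sum to zero. The sketch below is an illustration in Python using exact fractions so that no rounding error creeps in; the variable names are assumptions. It also shows that with the rounded mean of 417 the differences sum to a small non-zero number rather than exactly zero.

```python
from fractions import Fraction

# Actual average monthly ice cream sales (y), Jan through Dec.
sales = [300, 320, 370, 480, 560, 640, 720, 600, 400, 300, 200, 122]

# Exact mean as a fraction: 5012/12, about 417.67.
mean_exact = Fraction(sum(sales), len(sales))

# Deviations from the exact mean and from the rounded mean used on the slides.
dev_exact = [y - mean_exact for y in sales]
dev_rounded = [y - 417 for y in sales]

# Unsquared deviations cancel out around the exact mean.
print(sum(dev_exact))      # 0
print(sum(dev_rounded))    # 8

# Squared deviations do not cancel, so their sum measures the spread.
ss_exact = sum(d * d for d in dev_exact)
ss_rounded = sum(d * d for d in dev_rounded)
print(float(ss_exact))     # about 372838.67
print(ss_rounded)          # 372844
```

The two sums of squares differ by only a few units, so rounding the mean to 417 has little effect on the analysis.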
• Now that we’ve seen a visual depiction of how we calculated the total sum of squares, we compare the sum of squares associated with error (residual) and the sum of squares associated with regression.
                   Sum of Squares   df   Mean Square   F-ratio   Significance
Regression
Residual
Total              372,844
• Let’s calculate the error or residual sum of squares now.
• The error or residual sum of squares is computed by subtracting each predicted Y value from the corresponding actual Y value.
• Here are the actual Y values (average ice cream sales): 300, 320, 370, 480, 560, 640, 720, 600, 400, 300, 200, 122
[Figure: scatter plot of the actual Y values (average ice cream sales); y-axis 0 to 800, x-axis 0 to 120]
• Here are the predicted values using the linear regression formula:
(X) Ave Monthly Temp   (y) Actual Sales   ŷ = -50.93 + 7.21(x)    (ŷ) Predicted Sales
40                     300                -50.93 + 7.21(40)  =    237.47
50                     320                -50.93 + 7.21(50)  =    309.57
60                     370                -50.93 + 7.21(60)  =    381.67
70                     480                -50.93 + 7.21(70)  =    453.77
80                     560                -50.93 + 7.21(80)  =    525.87
90                     640                -50.93 + 7.21(90)  =    597.97
100                    720                -50.93 + 7.21(100) =    670.07
90                     600                -50.93 + 7.21(90)  =    597.97
80                     400                -50.93 + 7.21(80)  =    525.87
60                     300                -50.93 + 7.21(60)  =    381.67
40                     200                -50.93 + 7.21(40)  =    237.47
20                     122                -50.93 + 7.21(20)  =    93.27
[Figure: scatter plot of average ice cream sales showing the actual and predicted values; y-axis 0 to 800, x-axis 0 to 120]
• From these points and the linear regression formula a line can be drawn.
[Figure: scatter plot of average ice cream sales with the regression line drawn through the predicted values; y-axis 0 to 800, x-axis 0 to 120]
• The difference between each actual value (orange) and the predicted value (green line) is what is called error or residual. The closer these two values are to each other, the smaller the error. The farther these two values are from each other, the larger the error and the weaker the predictive power of the regression line.
[Figure: scatter plot of average ice cream sales with the regression line; the vertical gaps between the points and the line are labeled "Difference"]
• Let’s subtract the green-line predicted values from the orange actual values:
(y) Actual Sales   (ŷ) Predicted Sales   Difference
300   -   237.47   =   62.53
320   -   309.57   =   10.43
370   -   381.67   =   -11.67
480   -   453.77   =   26.23
560   -   525.87   =   34.13
640   -   597.97   =   42.03
720   -   670.07   =   49.93
600   -   597.97   =   2.03
400   -   525.87   =   -125.87
300   -   381.67   =   -81.67
200   -   237.47   =   -37.47
122   -   93.27    =   28.73
[Figure: scatter plot of average ice cream sales with the regression line, highlighting individual differences]
For example, 122 - 93.27 = +28.73 and 400 - 525.87 = -125.87.
• And so on…
• We then square those differences (deviations) and sum them up:
(y) Actual Sales   (ŷ) Predicted Sales   Difference   Squared
300   -   237.47   =   62.53      3,910.00
320   -   309.57   =   10.43      108.78
370   -   381.67   =   -11.67     136.19
480   -   453.77   =   26.23      688.01
560   -   525.87   =   34.13      1,164.86
640   -   597.97   =   42.03      1,766.52
720   -   670.07   =   49.93      2,493.00
600   -   597.97   =   2.03       4.12
400   -   525.87   =   -125.87    15,843.26
300   -   381.67   =   -81.67     6,669.99
200   -   237.47   =   -37.47     1,404.00
122   -   93.27    =   28.73      825.41
SUM of squares = 35,014
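The residual sum of squares can be reproduced as follows. This is a minimal Python sketch under the assumption that the predictions come from the quoted equation ŷ = -50.93 + 7.21(x); the variable names are illustrative.

```python
# Temperature (x) and actual sales (y), Jan through Dec.
temps = [40, 50, 60, 70, 80, 90, 100, 90, 80, 60, 40, 20]
sales = [300, 320, 370, 480, 560, 640, 720, 600, 400, 300, 200, 122]

# Coefficients of the fitted line quoted in the slides.
INTERCEPT = -50.93
SLOPE = 7.21

# Predicted sales for each month.
predicted = [INTERCEPT + SLOPE * x for x in temps]

# Residual (error) = actual value minus predicted value.
residuals = [y - p for y, p in zip(sales, predicted)]

# Residual sum of squares: the sum of the squared residuals.
ss_residual = sum(r * r for r in residuals)

print([round(r, 2) for r in residuals])
print(round(ss_residual))
```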
• We will now calculate the regression sum of squares.
                   Sum of Squares   df   Mean Square   F-ratio   Significance
Regression
Residual           35,014
Total              372,844
• Our hope is that this value will be much bigger than the residual (35,014).
• The regression sum of squares is calculated by subtracting the mean from each predicted value.
• Let’s see what this looks like visually. The green line is the predicted values for Y, or the regression line.
[Figure: scatter plot of average ice cream sales with the green regression line and a blue horizontal line at the mean; y-axis 0 to 800, x-axis 0 to 120]
The blue line is the mean (417), which is the best predictor absent anything else.
• You can probably already tell that it will be bigger, because a simple way to calculate it is to subtract the residual (35,014) from the total (372,844).
• However, we will calculate it the long way so you can see what is happening.
[Figure: scatter plot of average ice cream sales with the regression line and the mean line]
• We subtract the mean of the actual Y values from each predicted value:
(ŷ) Predicted Ave Monthly Ice Cream Sales   Mean Monthly Ice Cream Sales   Difference
237.47   -   417.7   =   -180.2
309.57   -   417.7   =   -108.1
381.67   -   417.7   =   -36.0
453.77   -   417.7   =   36.1
525.87   -   417.7   =   108.2
597.97   -   417.7   =   180.3
670.07   -   417.7   =   252.4
597.97   -   417.7   =   180.3
525.87   -   417.7   =   108.2
381.67   -   417.7   =   -36.0
237.47   -   417.7   =   -180.2
93.27    -   417.7   =   -324.4
[Figure: scatter plot of average ice cream sales with the regression line and the mean line, highlighting individual differences]
For example, 93.27 - 417.7 ≈ -324.4 and 670.07 - 417.7 ≈ +252.4.
• Then we square the differences (or deviations) and sum them up:
(ŷ) Predicted Ave Monthly Ice Cream Sales   Mean Monthly Ice Cream Sales   Difference   Squared
237.47   -   417.7   =   -180.2     32,470.8
309.57   -   417.7   =   -108.1     11,684.9
381.67   -   417.7   =   -36.0      1,295.76
453.77   -   417.7   =   36.1       1,303.45
525.87   -   417.7   =   108.2      11,708
597.97   -   417.7   =   180.3      32,509.3
670.07   -   417.7   =   252.4      63,707.4
597.97   -   417.7   =   180.3      32,509.3
525.87   -   417.7   =   108.2      11,708
381.67   -   417.7   =   -36.0      1,295.76
237.47   -   417.7   =   -180.2     32,470.8
93.27    -   417.7   =   -324.4     105,233
SUM of squares = 337,830
Note: adding the rounded squares listed here gives roughly 337,896. The figure carried forward, 337,830, is the total (372,844) minus the residual (35,014); the small gap comes from rounding in the intermediate values.
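Both routes to the regression sum of squares can be compared in code. The sketch below is an illustration in Python, assuming the quoted equation and the rounded mean of 417.7 used on these slides; the variable names are not from the deck. With these rounded inputs the long way gives roughly 337,897, while the shortcut (total minus residual) gives the 337,830 that appears in the summary table.

```python
# Temperature (x), Jan through Dec.
temps = [40, 50, 60, 70, 80, 90, 100, 90, 80, 60, 40, 20]

# Coefficients of the fitted line and the rounded mean used on the slides.
INTERCEPT = -50.93
SLOPE = 7.21
MEAN_SALES = 417.7

# Long way: square each (predicted - mean) difference and add them up.
predicted = [INTERCEPT + SLOPE * x for x in temps]
deviations = [p - MEAN_SALES for p in predicted]
ss_regression_long = sum(d * d for d in deviations)

# Shortcut: total sum of squares minus residual sum of squares.
SS_TOTAL = 372_844
SS_RESIDUAL = 35_014
ss_regression_short = SS_TOTAL - SS_RESIDUAL

print(round(ss_regression_long))
print(ss_regression_short)
```

The two results differ by about 0.02 percent, which does not change the conclusion that the regression sum of squares is far larger than the residual.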
• The sum of the squared differences is the regression sum of squares:
                   Sum of Squares   df   Mean Square   F-ratio   Significance
Regression         337,830
Residual           35,014
Total              372,844
• Now we have all of the information we need to test for significance.
• The degrees of freedom (df) for the regression are the number of parameters being estimated, which in this case are the Y intercept and the slope of the equation, minus 1.
• 2 parameters - 1 = 1
                   Sum of Squares   df   Mean Square   F-ratio   Significance
Regression         337,830          1
Residual           35,014
Total              372,844
• The degrees of freedom for the residual are the number of cases (12) minus the number of parameters (2)
• 12 months - 2 parameters (slope and Y intercept) = 10

             Sum of Squares   df   Mean Square   F-ratio   Significance
Regression   337,830          1
Residual     35,014           10
Total        372,844
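The two degrees-of-freedom rules can be written as tiny helper functions. This is a sketch under the slide's setup (12 cases, 2 estimated parameters); the function names are hypothetical, and the total df line is a standard identity added here rather than a value shown on the slides.

```python
def regression_df(n_parameters: int) -> int:
    """Regression df: number of estimated parameters minus 1."""
    return n_parameters - 1


def residual_df(n_cases: int, n_parameters: int) -> int:
    """Residual df: number of cases minus number of estimated parameters."""
    return n_cases - n_parameters


n_cases = 12       # months of data
n_parameters = 2   # slope and Y intercept

df_regression = regression_df(n_parameters)
df_residual = residual_df(n_cases, n_parameters)
df_total = df_regression + df_residual  # equals n_cases - 1

print(df_regression, df_residual, df_total)
```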
• We now have the information we need to calculate the Mean Square values. They are calculated by dividing the sums of squares by the degrees of freedom.

             Sum of Squares   df   Mean Square   F-ratio   Significance
Regression   337,830          1    = 337,830
Residual     35,014           10   = 3,501
Total        372,844
• The F-ratio is computed by dividing the Regression Mean Square by the Residual Mean Square
• 337,830 / 3,501 = 96.5

             Sum of Squares   df   Mean Square   F-ratio   Significance
Regression   337,830          1    337,830       96.5
Residual     35,014           10   3,501
Total        372,844
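The Mean Square and F-ratio arithmetic can be reproduced directly from the table. The snippet below is a minimal sketch with hypothetical variable names; it uses the unrounded residual mean square (3,501.4), which still rounds to the slide's F-ratio of 96.5.

```python
# Sums of squares and degrees of freedom from the table.
ss_regression = 337_830
ss_residual = 35_014
df_regression = 1
df_residual = 10

# Mean Square = Sum of Squares / degrees of freedom.
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual

# F-ratio = Regression Mean Square / Residual Mean Square.
f_ratio = ms_regression / ms_residual

print(f"{ms_regression:,.0f} {ms_residual:,.0f} {f_ratio:.1f}")
```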
• With this information we can turn to the F-distribution table to determine the significance value.

             Sum of Squares   df   Mean Square   F-ratio   Significance
Regression   337,830          1    337,830       96.5
Residual     35,014           10   3,501
Total        372,844

• The regression degrees of freedom (1) select the column of the F-distribution table
• The residual degrees of freedom (10) select the row of the F-distribution table
• Put them together and we find the critical F value at the .05 alpha level to be 4.96.
• Because the F-ratio (96.5) exceeds the critical F value (4.96), we reject the null hypothesis and conclude that temperature is a statistically significant predictor of ice cream sales.
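The table lookup and the decision rule can be cross-checked numerically without a printed table. The sketch below is not the method used in the slides: it assumes the standard identity that an F variable with 1 numerator df is the square of a t variable with the same residual df, integrates the t density with Simpson's rule, and finds the critical value by bisection. All function names are hypothetical.

```python
import math


def t_pdf(t: float, nu: int) -> float:
    """Density of Student's t distribution with nu degrees of freedom."""
    coef = math.gamma((nu + 1) / 2) / (math.sqrt(nu * math.pi) * math.gamma(nu / 2))
    return coef * (1 + t * t / nu) ** (-(nu + 1) / 2)


def f_cdf_one_numerator_df(f: float, df_resid: int, steps: int = 2000) -> float:
    """P(F <= f) for F(1, df_resid), computed as P(|T| <= sqrt(f))."""
    upper = math.sqrt(f)
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df_resid) + t_pdf(upper, df_resid)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df_resid)
    return 2 * total * h / 3  # factor 2 covers both tails of t


def f_critical_one_numerator_df(alpha: float, df_resid: int) -> float:
    """Bisection search for the F value with upper-tail area alpha."""
    lo, hi = 0.0, 1000.0
    for _ in range(100):
        mid = (lo + hi) / 2
        if f_cdf_one_numerator_df(mid, df_resid) < 1 - alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


critical = f_critical_one_numerator_df(0.05, 10)
f_ratio = 96.5                    # from the table above
reject_null = f_ratio > critical  # decision rule: reject when F exceeds critical F

print(round(critical, 2), reject_null)
```

The computed critical value should land on the table's 4.96, so the observed F-ratio of 96.5 falls well inside the rejection region.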
In Summary
• The whole point of this demonstration was to
(1) explain that linear regression is used to predict the value of one variable (ice cream sales) based on another variable (temperature)
(2) show that the total variation in Y can be partitioned into regression (prediction power) and residual (error)
(3) show how this partition can be used to test whether the prediction is better than chance.


Single linear regression

  • 1. Single Linear Regression Conceptual Explanation
  • 2. • Welcome to this explanation of Single Linear Regression.
  • 3. • Welcome to this explanation of Single Linear Regression. • Single linear regression is an extension of correlation.
  • 4. • Welcome to this explanation of Single Linear Regression. • Single linear regression is an extension of correlation. Correlation extends to Single Linear Regression
  • 9. • Correlation is designed to render a single coefficient that represents the degree of coherence between two variables. As one variable increases, the other increases: +.99. This coefficient represents an almost perfect positive correlation or relationship between these two variables.
  • 13. • Correlation is designed to render a single coefficient that represents the degree of coherence between two variables. Ave Daily Temp: 50°, 60°, 70°, 80°, 90°. As one variable decreases, the other increases: -.99. Almost a perfect negative correlation or relationship between these two variables.
  • 16. • Single linear regression uses that information to predict the value of one variable based on the given value of the other variable.
  • 18. • For example: If the following data set were real, what would you predict ice cream sales would be when the temperature reaches 100°? Ave Daily Temp: 100°, 90°, 80°, 70°, 60°, 50°. Ave Daily Ice Cream Sales: ?, 560, 480, 350, 320, 230.
  • 20. • Single linear regression uses that information to predict the value of one variable (ice cream sales) based on the given value of the other variable (temperature).
  • 22. If the following data set were real, what would you predict ice cream sales would be when the temperature reaches 100°? Ave Daily Temp: 100°, 90°, 80°, 70°, 60°, 50°. Ave Daily Ice Cream Sales: 630?, 560, 480, 350, 320, 230. • Rather than simply examining the relationship between the variables (as is the case with the Pearson Product Moment Correlation), one variable will be used as the predictor (temperature) and the other variable will be used as the outcome, or predicted variable (ice cream sales). • Linear regression makes it possible to estimate a value like 630.
  • 26. • In some cases, which variable is considered the predictor and which the outcome is arbitrary. • Like measures of depression and anxiety. Composite Depression Score: 33, 26, 22, 14, 12, 6. Composite Anxiety Score: 103, 100, 92, 74, 52, 26. • It’s not clear which influences which. Most likely depression and anxiety mutually influence one another.
  • 30. • In some cases, either by theory or by the nature of the research design, one variable will be rationally defined as the predictor and the other as the outcome. Ave Daily Exposure to Sunlight: 3.3 hrs, 2.6 hrs, 2.2 hrs, 1.4 hrs, 1.2 hrs, 0.6 hrs. Levels of Vitamin E after two months: 10.3 units, 8.1 units, 7.3 units, 7.0 units, 6.8 units, 5.7 units. In this example, exposure to sunlight may impact levels of Vitamin E, but levels of Vitamin E would not impact the amount of sunlight one gets.
  • 33. • An easy way to conceptualize single linear regression is to create a scatterplot in Cartesian space. Let’s plot the following data set: Composite Depression Score: 33, 26, 22, 14, 12, 6. Composite Anxiety Score: 103, 100, 92, 74, 52, 26.
  • 35. • First, we assign the predictor variable along the X axis, which in this case we’ll arbitrarily say is depression. [Empty plot: "Relationship between Depression & Anxiety"; X axis: Depression (0 to 40); Y axis: Anxiety (0 to 120)]
  • 37. • ... and the outcome variable along the Y axis, which we’ll arbitrarily say is anxiety.
  • 51. • Now, let’s identify or plot each point or dot. Depression: 33, 26, 22, 14, 12, 6. Anxiety: 103, 100, 92, 74, 52, 26. [Scatterplot: "Relationship between Depression & Anxiety", with the points added one at a time: (33, 103), (26, 100), (22, 92), (14, 74), (12, 52), (6, 26)]
  • 54. • Visually, one can see in the plotted space whether there is a tendency for the variables to be related and in what direction they are related. [Scatterplot: "Relationship between Depression & Anxiety"; X axis: Depression; Y axis: Anxiety] In this case there is a strong tendency to relate and the relationship is positive.
  • 58. • With this data set the tendency for the variables to relate is strong and the direction is negative: Depression: 6, 12, 14, 22, 26, 33. Anxiety: 103, 100, 92, 74, 52, 26. [Scatterplot: "Relationship between Depression & Anxiety"; X axis: Depression; Y axis: Anxiety] Strong and Negative.
  • 62. • When no relationship exists the scatter plot tends to look like a big circle. Depression: 22, 33, 12, 6, 14, 26. Anxiety: 103, 100, 92, 74, 52, 26. [Scatterplot: "Relationship between Depression & Anxiety"; X axis: Depression; Y axis: Anxiety]
  • 65. • When no relationship exists the scatter plot tends to look like a big circle. Depression: 22, 6, 33, 26, 14, 12. Anxiety: 103, 100, 92, 74, 52, 26. [Scatterplot: "Relationship between Depression & Anxiety"] Weak and Positive.
  • 68. • When no relationship exists the scatter plot tends to look like a big circle. Depression: 6, 14, 33, 26, 12, 22. Anxiety: 103, 100, 74, 92, 52, 26. [Scatterplot: "Relationship between Depression & Anxiety"] Weak and Negative.
  • 71. • You might have noticed that as the variables are related either positively or negatively, the plot looks more like an oval tilted one way or the other. [Two scatterplots of "Relationship between Depression & Anxiety", side by side: Weak and Negative; Weak and Positive]
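The direction and strength that the scatterplots show visually can also be checked numerically. The sketch below computes the Pearson correlation coefficient for the five depression/anxiety pairings listed on the slides. The function name `pearson_r` and the dictionary labels are illustrative choices, not part of the original deck, and the slides themselves do not report a coefficient for these data sets.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# (depression, anxiety) pairings copied from the slides; labels are illustrative.
datasets = {
    "strong positive": ([33, 26, 22, 14, 12, 6], [103, 100, 92, 74, 52, 26]),
    "strong negative": ([6, 12, 14, 22, 26, 33], [103, 100, 92, 74, 52, 26]),
    "no relationship": ([22, 33, 12, 6, 14, 26], [103, 100, 92, 74, 52, 26]),
    "weak positive": ([22, 6, 33, 26, 14, 12], [103, 100, 92, 74, 52, 26]),
    "weak negative": ([6, 14, 33, 26, 12, 22], [103, 100, 74, 92, 52, 26]),
}

results = {label: pearson_r(xs, ys) for label, (xs, ys) in datasets.items()}
for label, r in results.items():
    print(f"{label}: r = {r:+.2f}")
```

The sign of each coefficient gives the direction of the tilt, and its distance from zero gives the strength of the relationship.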
  • 77. • As mentioned before, Linear Regression is used to predict one variable (ice cream sales) from another related variable (temperature). • The stronger the relationship (e.g., +.99 or -.99) the more accurate the prediction. • The weaker the relationship (e.g., +.14 or -.03) the less accurate the prediction. • One of the ways to represent those relationships is of course with the coefficients (e.g., +.99, +.14, -.03, -.99). • Another way to represent it is by graphing the relationship.
  • 80. • Recall that a line in Cartesian space is defined by its slope and its Y intercept (the value of Y when X equals 0). [Y = intercept + (slope ∙ X)] [Plot: a straight line on X and Y axes that each run from 0 to 6]
  • 84. • In this case the slope would be 1. You may remember that this value is derived by taking what is called the “rise” over the “run”. [Plot: the same line, annotated with a rise of 1 and a run of 1] • So the equation for this line so far would look like this: $y = 0 + \frac{1}{1}x$
  • 86. $y = 0 + \frac{1}{1}x$. The 0 is where the line crosses the Y axis.
  • 87. $y = 0 + \frac{1}{1}x$. The $\frac{1}{1}$ is the slope, which is the rise over the run.
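The rise-over-run idea can be written as a few lines of code. This is a minimal sketch: the helper name `line_through` and the two sample points (1, 1) and (2, 2) are assumptions chosen to match a line with slope 1 through the origin, not values stated on the slides.

```python
def line_through(p1, p2):
    """Return (intercept, slope) of the straight line through two points."""
    (x1, y1), (x2, y2) = p1, p2
    rise = y2 - y1            # vertical change
    run = x2 - x1             # horizontal change
    slope = rise / run        # rise over run
    intercept = y1 - slope * x1   # value of y when x equals 0
    return intercept, slope

# Two hypothetical points read off the plotted line.
intercept, slope = line_through((1, 1), (2, 2))
print(f"y = {intercept} + {slope}x")
```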
  • 89. • A line represents the functional relationship between variable X and variable Y; therefore, that line can be used to predict a Y value from any given X value. Month: Feb, Mar, Apr, May, Jun. Ave Monthly Temperature: 50°, 60°, 70°, 80°, 90°. Ave Monthly Ice Cream Sales: 239, 320, 400, 480, 560.
  • 91. • In this case the two variables (temperature and ice cream sales) have a perfect linear relationship. This is rarely ever seen among variables such as these in the real world, but for illustrative purposes we have created a perfect relationship. [Plot: X axis: Ave Monthly Temperature (0 to 120); Y axis: Average Monthly Ice Cream Sales (0 to 700)]
  • 97. • Now let’s say we have data for the average temperature during the month of July, but we don’t have the data for the average ice cream sales for July. Month: Feb, Mar, Apr, May, Jun, Jul. Ave Monthly Temperature: 50°, 60°, 70°, 80°, 90°, 100°. Ave Monthly Ice Cream Sales: 239, 320, 400, 480, 560, ?. • Using single linear regression we can predict the average ice cream sales for July. Here is the formula we will use for the prediction: $y = \text{y-intercept} + \text{slope}(x)$ • There are many ways to write this equation. Here is one way: $y = b + m(x)$
  • 100. • Using this data set we can create a formula for a straight line that represents that relationship: Month: Feb, Mar, Apr, May, Jun. Ave Monthly Temperature: 50°, 60°, 70°, 80°, 90°. Ave Monthly Ice Cream Sales: 239, 320, 400, 480, 560. [Plot with fitted line $y = -162 + 8(x)$; X axis: Ave Monthly Temperature; Y axis: Average Monthly Ice Cream Sales]
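The slides do not say how the line was obtained from the five data points. One plausible route, assumed here, is an ordinary least-squares fit; the sketch below shows that such a fit gives a slope and intercept that round to the 8 and -162 on the slide. The function name `fit_line` is illustrative.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

temps = [50, 60, 70, 80, 90]        # Feb to Jun average monthly temperature
sales = [239, 320, 400, 480, 560]   # Feb to Jun average monthly ice cream sales

intercept, slope = fit_line(temps, sales)
print(f"unrounded: y = {intercept:.2f} + {slope:.2f}(x)")
print(f"rounded:   y = {round(intercept)} + {round(slope)}(x)")
```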
  • 102. • With this equation we can now plug in the average temperature for July (100°) and see what the predicted average ice cream sales would be. Month: Feb, Mar, Apr, May, Jun, Jul. Ave Monthly Temperature: 50°, 60°, 70°, 80°, 90°, 100°. Ave Monthly Ice Cream Sales: 239, 320, 400, 480, 560, $\hat{y}$.
  • 104. $\hat{y} = -162 + 8(100)$
  • 105. $\hat{y} = -162 + 800$
  • 107. $\hat{y} = 638$, so the July entry for Ave Monthly Ice Cream Sales becomes 638. [Plot: X axis: Ave Monthly Temperature; Y axis: Average Monthly Ice Cream Sales]
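The substitution above amounts to a one-line function. This sketch uses the slide's rounded coefficients as defaults; the function name `predict_sales` is illustrative.

```python
def predict_sales(temperature, intercept=-162, slope=8):
    """Predicted average ice cream sales from the line y-hat = intercept + slope * x."""
    return intercept + slope * temperature

july_sales = predict_sales(100)
print(july_sales)
```

Note that 100° lies outside the observed range of 50° to 90°, so this prediction is an extrapolation of the fitted line.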
  • 110. • So, based on our single linear regression analysis, we would predict that the average monthly ice cream sales in July will be 638 ($\hat{y} = 638$). • This is a simple demonstration of how regression works. • In reality, however, most variables will not correlate as perfectly as these did:
  • 114. • Most will look like this: [Scatterplot with a fitted line] • This line is called the best fitting line because it minimizes the distance between the line and all of the points. You will notice again that we have a linear equation for that line: $y = -50.93 + 7.21(x)$
  • 115. • Most will look like this: • This equation is calculated by using the standard deviations and means of the two variables (together with their correlation). For brevity’s sake we will not go into this here. $y = -50.93 + 7.21(x)$
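For readers who want the calculation that the slide skips, the standard textbook form is slope = r × (SD of y / SD of x) and intercept = mean of y − slope × mean of x. The slides do not show this formula, so treat the sketch below as an assumption about the method. It uses the twelve monthly values that appear later in the deck and reproduces the slide's coefficients after rounding.

```python
from statistics import mean, stdev

temps = [40, 50, 60, 70, 80, 90, 100, 90, 80, 60, 40, 20]            # Jan to Dec
sales = [300, 320, 370, 480, 560, 640, 720, 600, 400, 300, 200, 122]  # Jan to Dec

n = len(temps)
mean_x, mean_y = mean(temps), mean(sales)
sd_x, sd_y = stdev(temps), stdev(sales)   # sample standard deviations

# Sample covariance divided by both SDs gives the Pearson correlation.
cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(temps, sales)) / (n - 1)
r = cov_xy / (sd_x * sd_y)

slope = r * sd_y / sd_x
intercept = mean_y - slope * mean_x
print(f"y = {intercept:.2f} + {slope:.2f}(x)")
```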
  • 119. • Of the infinite number of possible lines that could be fitted through a scatterplot, the one that best represents the functional relationship between X and Y is the line that yields the smallest cumulative squared error between the predicted values of Y and the true observed values of Y for each given X. [Plot annotations] The line shows the predicted values of Y calculated from the equation $y = b + mx$. The dots represent the actual data.
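The least-squares criterion can be illustrated by comparing the cumulative squared error of the slide's line with that of a few other lines. The alternative lines below are arbitrary, hypothetical choices made only for comparison, and the data are the twelve monthly values that appear later in the deck.

```python
temps = [40, 50, 60, 70, 80, 90, 100, 90, 80, 60, 40, 20]
sales = [300, 320, 370, 480, 560, 640, 720, 600, 400, 300, 200, 122]

def sum_squared_error(intercept, slope):
    """Cumulative squared gap between observed y and the line's predicted y."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(temps, sales))

best = sum_squared_error(-50.93, 7.21)   # the line given on the slides

# Hypothetical alternative lines as (intercept, slope) pairs.
alternatives = [(-50.93, 6.5), (-50.93, 8.0), (0.0, 7.21), (-100.0, 7.21)]
for b, m in alternatives:
    print(f"y = {b} + {m}(x): error = {sum_squared_error(b, m):.0f} "
          f"(slide's line: {best:.0f})")
```

Each alternative produces a larger cumulative squared error than the slide's line, which is what "best fitting" means here.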
  • 120. • We don’t have to actually plot the coordinates and lines. We can operate solely on the equations to generate predicted values and errors in prediction. In this way we can determine if temperature is a statistically significant predictor of ice cream sales.
  • 126. • So here are the actual data we plotted:
      Month   (X) Ave Monthly Temp   (y) Actual Ave Monthly Ice Cream Sales
      Jan     40                     300
      Feb     50                     320
      Mar     60                     370
      Apr     70                     480
      May     80                     560
      Jun     90                     640
      Jul     100                    720
      Aug     90                     600
      Sep     80                     400
      Oct     60                     300
      Nov     40                     200
      Dec     20                     122
    • We can now plot the predicted Y using the equation: $y = -50.93 + 7.21(x)$ • Which is the equation for the best fitting line between these two variables.
  • 132. • We can now plot the predicted Y using the equation $y = -50.93 + 7.21(x)$, substituting each month's temperature for x:
      Month   (X) Ave Monthly Temp   (y) Actual Sales   Calculation             ($\hat{y}$) Predicted Sales
      Jan     40                     300                -50.93 + 7.21(40)  =    237.47
      Feb     50                     320                -50.93 + 7.21(50)  =    309.57
      Mar     60                     370                -50.93 + 7.21(60)  =    381.67
      Apr     70                     480                -50.93 + 7.21(70)  =    453.77
      May     80                     560                -50.93 + 7.21(80)  =    525.87
      Jun     90                     640                -50.93 + 7.21(90)  =    597.97
      Jul     100                    720                -50.93 + 7.21(100) =    670.07
      Aug     90                     600                -50.93 + 7.21(90)  =    597.97
      Sep     80                     400                -50.93 + 7.21(80)  =    525.87
      Oct     60                     300                -50.93 + 7.21(60)  =    381.67
      Nov     40                     200                -50.93 + 7.21(40)  =    237.47
      Dec     20                     122                -50.93 + 7.21(20)  =    93.27
    • With this information we can now determine if “x” (temperature) is a statistically significant predictor of “y” (ice cream sales).
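The predicted column can be regenerated directly from the equation. This sketch plugs each month's temperature (the X value, not the sales figure) into the slide's rounded coefficients; the variable names are illustrative.

```python
months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]
temps = [40, 50, 60, 70, 80, 90, 100, 90, 80, 60, 40, 20]

INTERCEPT, SLOPE = -50.93, 7.21   # rounded coefficients from the slides

# Round to two decimals to match the precision shown on the slides.
predicted = [round(INTERCEPT + SLOPE * x, 2) for x in temps]

for month, x, y_hat in zip(months, temps, predicted):
    print(f"{month}: {INTERCEPT} + {SLOPE}({x}) = {y_hat}")
```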
  • 136. • To begin, we need to determine the total sum of squares, just as we would in an analysis of variance. • This is done by subtracting the mean ice cream sales for the whole year from each actual Y (ice cream sales) value. • The mean is calculated by adding up the values and dividing by how many there are. • (300+320+370+480+560+640+720+600+400+300+200+122) / 12 = 5012 / 12 ≈ 417.67, which the next slides round to 417 average ice cream sales.
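    As a quick check of the mean, here is a minimal sketch (the variable names `sales` and `mean_sales` are illustrative only):

```python
# Actual average monthly ice cream sales (y), January through December.
sales = [300, 320, 370, 480, 560, 640, 720, 600, 400, 300, 200, 122]

# Mean = sum of the values divided by how many there are.
mean_sales = sum(sales) / len(sales)
print(sum(sales), round(mean_sales, 2))
```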
  • 139. • We then subtract the mean from each y value:
      (y) Actual Ave Monthly Ice Cream Sales     Mean     Difference
      300                                     -  417   =  -117
      320                                     -  417   =   -97
      370                                     -  417   =   -47
      480                                     -  417   =    63
      560                                     -  417   =   143
      640                                     -  417   =   223
      720                                     -  417   =   303
      600                                     -  417   =   183
      400                                     -  417   =   -17
      300                                     -  417   =  -117
      200                                     -  417   =  -217
      122                                     -  417   =  -295
    • Note: if we did not know the functional relationship between X and Y, our best prediction of any one month's Y value would be the mean of Y.
  • 143. • Because we are calculating the total sum of squares, we square each difference and then sum the results. (Dividing that sum by its degrees of freedom would give the variance of the scores; the total sum of squares itself is the sum, not an average.)
      Difference    Squared
      -117          13689
       -97           9409
       -47           2209
        63           3969
       143          20449
       223          49729
       303          91809
       183          33489
       -17            289
      -117          13689
      -217          47089
      -295          87025
      SUM          372,844
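    The total sum of squares can be reproduced with a short sketch. It assumes, as the slides do, that the mean is rounded to 417; the names `MEAN_ROUNDED`, `deviations` and `ss_total` are illustrative.

```python
# Actual average monthly ice cream sales (y), January through December.
sales = [300, 320, 370, 480, 560, 640, 720, 600, 400, 300, 200, 122]

MEAN_ROUNDED = 417  # the slides round the mean (about 417.67) to 417

# Deviation of each actual value from the mean, then square and sum.
deviations = [y - MEAN_ROUNDED for y in sales]
squared = [d * d for d in deviations]
ss_total = sum(squared)

print(deviations)
print(ss_total)  # 372844
```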
  • 149. • Now we find regression (good) and residual (bad). For better predictive power we want the regression sum of squares to be large and the residual, or error, sum of squares to be small. • Let's see whether the residual or the regression is greater. • We know that the total sum of squares is 372,844. • Now we will calculate the residual (error) and the regression sums of squares, which will add up to 372,844.
                          Sum of Squares   df   Mean Square   F-ratio   Significance
      Regression          ?
      Residual (error)    ?
      Total               372,844
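    Written out as a formula, the partition described here is the standard sum-of-squares decomposition. The symbols below (y_i for an actual value, ŷ_i for its predicted value, ȳ for the mean of the actual values) are notation introduced for this sketch. The identity holds exactly for a least-squares line with unrounded coefficients, so small discrepancies can appear when rounded coefficients are used.

```latex
\underbrace{\sum_{i=1}^{n} (y_i - \bar{y})^2}_{SS_{\text{total}}}
  = \underbrace{\sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2}_{SS_{\text{regression}}}
  + \underbrace{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}_{SS_{\text{residual}}}
```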
  • 156. • Before we calculate residual and regression, let's see visually how we calculated the total sum of squares (372,844). • Once again we subtract the mean of the actual Y values (417) from each actual Y value.
      [Figure: scatter plot of the actual average ice cream sales values; vertical axis 0 to 800, horizontal axis 0 to 120]
  • 158. • The first data set is the actual Y values. We subtract the mean (417) from each of them; the mean would be our best prediction if we did not know the relationship between X (temperature) and Y (ice cream sales).
  • 166. • Here is the graphic depiction of subtracting the mean (417) from each data point:
      [Figure: scatter plot of average ice cream sales with the mean (417) marked, highlighting one data point's distance from the mean at a time]
      122 - 417 = -295
      200 - 417 = -217
      720 - 417 = +303
  • 168. • Now we have the difference between each actual value of Y (ice cream sales) and the mean of the values of Y (417):
      Difference:  -117  -97  -47  63  143  223  303  183  -17  -117  -217  -295
  • 172. • As we showed previously, we have to square these differences, because otherwise they cancel out when summed: deviations from the mean sum to zero. (Here they sum to 8 rather than exactly 0 only because the mean was rounded from 417.67 to 417.)
      Difference:  -117   -97   -47    63    143    223    303    183    -17   -117   -217   -295     SUM ≈ 0
      Squared:     13689  9409  2209  3969  20449  49729  91809  33489   289  13689  47089  87025     SUM = 372,844
    • We are doing all this once again to show a visual depiction of what the total sum of squares is:
                  Sum of Squares   df   Mean Square   F-ratio   Significance
      Total       372,844
  • 175. • Now that we've seen a visual depiction of how we calculated the total sum of squares, we compare the sum of squares associated with error (residual) and the sum of squares associated with regression.
                    Sum of Squares   df   Mean Square   F-ratio   Significance
      Regression
      Residual
      Total         372,844
    • Let's calculate the error, or residual, sum of squares now.
  • 179. • The error, or residual, sum of squares is computed by subtracting each predicted Y value from the corresponding actual Y value, then squaring and summing the differences. • Here are the actual Y values, the average monthly ice cream sales:
      [Figure: scatter plot of the actual Y values (average ice cream sales)]
      (y) Actual Ave Monthly Ice Cream Sales:  300  320  370  480  560  640  720  600  400  300  200  122
  • 183. • Here are the predicted values from the linear regression formula ŷ = -50.93 + 7.21(x), where x is each month's average temperature:
      (y) Actual Ave Monthly Ice Cream Sales   (ŷ) Predicted Ave Monthly Ice Cream Sales
      300                                      -50.93 + 7.21(40)  = 237.47
      320                                      -50.93 + 7.21(50)  = 309.57
      370                                      -50.93 + 7.21(60)  = 381.67
      480                                      -50.93 + 7.21(70)  = 453.77
      560                                      -50.93 + 7.21(80)  = 525.87
      640                                      -50.93 + 7.21(90)  = 597.97
      720                                      -50.93 + 7.21(100) = 670.07
      600                                      -50.93 + 7.21(90)  = 597.97
      400                                      -50.93 + 7.21(80)  = 525.87
      300                                      -50.93 + 7.21(60)  = 381.67
      200                                      -50.93 + 7.21(40)  = 237.47
      122                                      -50.93 + 7.21(20)  = 93.27
      [Figure: the predicted values plotted on the same axes as the actual average ice cream sales]
  • 186. • From these points and the linear regression formula, a line can be drawn:
      [Figure: regression line drawn through the predicted values on the plot of average ice cream sales]
  • 188. • The difference between each actual value (orange) and the predicted value (green line) is called the error, or residual. The closer these two values are to each other, the smaller the error. The farther apart they are, the larger the error and the weaker the predictive power of the regression line.
      [Figure: plot of average ice cream sales with the differences between the actual points and the regression line marked]
  • 192. • Let's subtract the green-line predicted values from the orange actual values:
      (y) Actual     (ŷ) Predicted     Difference
      300         -  237.47         =    62.53
      320         -  309.57         =    10.43
      370         -  381.67         =   -11.67
      480         -  453.77         =    26.23
      560         -  525.87         =    34.13
      640         -  597.97         =    42.03
      720         -  670.07         =    49.93
      600         -  597.97         =     2.03
      400         -  525.87         =  -125.87
      300         -  381.67         =   -81.67
      200         -  237.47         =   -37.47
      122         -   93.27         =    28.73
    • For example, 122 - 93.27 = +28.73 and 400 - 525.87 = -125.87. • And so on…
  • 195. • We then square those differences (deviations) and sum them up:
      Difference     Squared
        62.53         3910.00
        10.43          108.78
       -11.67          136.19
        26.23          688.01
        34.13         1164.86
        42.03         1766.52
        49.93         2493.00
         2.03            4.12
      -125.87        15843.26
       -81.67         6669.99
       -37.47         1404.00
        28.73          825.41
      Sum up =       35,014
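    The residual sum of squares can be checked with a minimal sketch. It takes the predicted values as given in the slides (rounded to two decimals); the names `residuals` and `ss_residual` are illustrative.

```python
# Actual (y) and predicted (y-hat) average monthly ice cream sales.
sales = [300, 320, 370, 480, 560, 640, 720, 600, 400, 300, 200, 122]
predicted = [237.47, 309.57, 381.67, 453.77, 525.87, 597.97,
             670.07, 597.97, 525.87, 381.67, 237.47, 93.27]

# Residual = actual minus predicted; square each one and sum.
residuals = [y - y_hat for y, y_hat in zip(sales, predicted)]
ss_residual = sum(r * r for r in residuals)

print(round(ss_residual))  # 35014
```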
  • 198. • We will now calculate the regression sum of squares. • Our hope is that this value will be much bigger than the residual (35,014).
                    Sum of Squares   df   Mean Square   F-ratio   Significance
      Regression
      Residual      35,014
      Total         372,844
  • 203. • The regression sum of squares is calculated by subtracting the mean from each predicted value, then squaring and summing the differences. • Let's see what this looks like visually. The green line is the predicted values for Y, or the regression line. The blue line is the mean (417), which is the best predictor absent anything else.
      [Figure: plot of average ice cream sales showing the green regression line and the blue line at the mean (417)]
  • 205. • You can probably already tell that it will be bigger, because a simple way to calculate it is to subtract the residual (35,014) from the total (372,844), which gives 337,830. • However, we will calculate it the long way so you can see what is happening.
  • 209. • We subtract the mean of the actual Y values (417.7) from each predicted value:
      (ŷ) Predicted Ave Monthly Ice Cream Sales     Mean Monthly Ice Cream Sales     Difference
      237.47                                     -  417.7                        =  -180.2
      309.57                                     -  417.7                        =  -108.1
      381.67                                     -  417.7                        =   -36.0
      453.77                                     -  417.7                        =    36.1
      525.87                                     -  417.7                        =   108.2
      597.97                                     -  417.7                        =   180.3
      670.07                                     -  417.7                        =   252.4
      597.97                                     -  417.7                        =   180.3
      525.87                                     -  417.7                        =   108.2
      381.67                                     -  417.7                        =   -36.0
      237.47                                     -  417.7                        =  -180.2
       93.27                                     -  417.7                        =  -324.4
    • For example, 93.27 - 417.7 = -324.4 and 670.07 - 417.7 = +252.4.
  • 213. • Then we square the differences (or deviations) and sum them up:
      Difference     Squared
      -180.2          32470.8
      -108.1          11684.9
       -36.0           1295.76
        36.1           1303.45
       108.2          11708
       180.3          32509.3
       252.4          63707.4
       180.3          32509.3
       108.2          11708
       -36.0           1295.76
      -180.2          32470.8
      -324.4         105233
      Sum up ≈       337,830
    (The squared values listed here total about 337,896. The analysis uses 337,830, the total minus the residual; the small gap is consistent with rounding in the slope and intercept.)
                    Sum of Squares   df   Mean Square   F-ratio   Significance
      Regression    337,830
      Residual      35,014
      Total         372,844
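    Both routes to the regression sum of squares can be compared in a short sketch. It assumes the rounded predicted values and the mean of 417.7 used in the slides; the variable names are illustrative.

```python
# Predicted (y-hat) values and the mean of the actual sales, as in the slides.
predicted = [237.47, 309.57, 381.67, 453.77, 525.87, 597.97,
             670.07, 597.97, 525.87, 381.67, 237.47, 93.27]
MEAN_SALES = 417.7  # mean of actual sales, rounded to one decimal

# Short way: total sum of squares minus residual sum of squares.
ss_total = 372844
ss_residual = 35014
ss_regression_short = ss_total - ss_residual

# Long way: squared deviation of each predicted value from the mean, summed.
ss_regression_long = sum((y_hat - MEAN_SALES) ** 2 for y_hat in predicted)

print(ss_regression_short)        # 337830
print(round(ss_regression_long))  # close to, but not exactly, the short-way value
```

    The two results agree to within a few hundredths of a percent. They would match exactly only with an unrounded slope, intercept and mean.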
  • 215. • Now we have all of the information we need to test for significance.
  • 217. • The degrees of freedom (df) for the regression are the number of parameters being estimated, which in this case are the Y intercept and the slope, minus one. • 2 parameters - 1 = 1
                    Sum of Squares   df   Mean Square   F-ratio   Significance
      Regression    337,830          1
      Residual      35,014
      Total         372,844
  • 220. • The degrees of freedom for the residual are the number of cases (12) minus the number of parameters (2). • 12 months - 2 parameters (slope and y intercept) = 10
                    Sum of Squares   df   Mean Square   F-ratio   Significance
      Regression    337,830          1
      Residual      35,014           10
      Total         372,844
  • 222. • We now have the information we need to calculate the Mean Square values. They are calculated by dividing each sum of squares by its degrees of freedom.
                    Sum of Squares   df   Mean Square               F-ratio   Significance
      Regression    337,830          1    337,830 / 1  = 337,830
      Residual      35,014           10   35,014 / 10  = 3,501
      Total         372,844
  • 225. • The F-ratio is computed by dividing the Regression Mean Square by the Residual Mean Square. • 337,830 / 3,501 = 96.5
                    Sum of Squares   df   Mean Square   F-ratio   Significance
      Regression    337,830          1    337,830       96.5
      Residual      35,014           10   3,501
      Total         372,844
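    The degrees of freedom, mean squares and F-ratio follow mechanically from the sums of squares. A minimal sketch, with illustrative variable names:

```python
# Sums of squares from the table above.
ss_regression = 337830
ss_residual = 35014

n_cases = 12   # twelve months
n_params = 2   # slope and y intercept

# Degrees of freedom.
df_regression = n_params - 1        # 1
df_residual = n_cases - n_params    # 10

# Mean square = sum of squares / degrees of freedom.
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual

# F-ratio = regression mean square / residual mean square.
f_ratio = ms_regression / ms_residual
print(round(f_ratio, 1))  # 96.5
```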
  • 236. • With this information we can turn to the F-distribution table to determine the significance. • The regression degrees of freedom (1) are represented by the columns of the F table. • The residual degrees of freedom (10) are represented by the rows. • Put them together and we find the critical F value at the .05 alpha level to be 4.96.
      [Figure: F-distribution table with the column for 1 degree of freedom and the row for 10 degrees of freedom highlighted]
  • 237. • Because the F-ratio (96.5) exceeds the F-critical (4.96) we will reject the null hypothesis and indicate that temperature is a statistically significant predictor of ice cream sales
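    The decision rule amounts to a single comparison. In this sketch the critical value 4.96 is simply copied from the F table for 1 and 10 degrees of freedom at the .05 level, not computed; the function name `is_significant` is hypothetical.

```python
def is_significant(f_ratio, f_critical):
    """Return True when the observed F-ratio exceeds the critical F value."""
    return f_ratio > f_critical


F_RATIO = 96.5     # observed F-ratio from the table above
F_CRITICAL = 4.96  # critical F for df (1, 10) at alpha = .05, read from the F table

reject_null = is_significant(F_RATIO, F_CRITICAL)
print(reject_null)  # True
```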
  • 242. In Summary • The whole point of this demonstration was to (1) explain that linear regression is used to predict the value of one variable (ice cream sales) based on another variable (temperature), (2) show that the total variation in Y can be partitioned into regression (prediction power) and residual (error), and (3) show how this partition can be used to test whether the prediction is better than chance.