Regression (aka Least Squares)
Regression is the formal name given to using an existing set of data to project what is going to happen in the future. It is also called “least squares” because it fits the line that minimizes the sum of the squared vertical distances between the data points and the line.
This is an example of a good regression line that is fit between the various data points on the x-y axes.
Notice how the thick black line fits nicely between all the points.
Regression: fits a trend line to a series of historical data points and projects the line into the future for medium- to long-range forecasts.
The equation for the least squares line is as follows:

$$\hat{Y} = a + bX$$

$\hat{Y}$ (also known as “Y-hat”) equals the predicted position of Y (the dependent variable) on the Y axis.
X equals the position of X (the independent variable) on the X axis.
a is the intercept (the value of $\hat{Y}$ when X = 0) and b is the slope (the change in $\hat{Y}$ per unit of X); together they set the height of $\hat{Y}$ for a given X.
How to Solve a Regression Problem:
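1. Build a data table including all Xs, Ys, XY, X², and Y².
2. Sum each column in the data table, that is, X, Y, XY, X², and Y².
3. Insert the summed data into the following equation for b, where n is the number of data points, $\bar{X} = \sum X / n$, and $\bar{Y} = \sum Y / n$:
   $$b = \frac{\sum XY - n\bar{X}\bar{Y}}{\sum X^2 - n\bar{X}^2}$$
4. Insert the summed data into the following equation for a:
   $$a = \bar{Y} - b\bar{X}$$
5. Insert the solutions for a and b into the following equation:
   $$\hat{Y} = a + bX$$
6. Insert the value of X to solve for Y-hat.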
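The six steps above can be sketched in Python as follows. This is a minimal illustration rather than a definitive implementation: the function names `fit_least_squares` and `predict` and the four-point toy data set are assumptions made for the example, not part of the original material.

```python
def fit_least_squares(xs, ys):
    """Return (a, b) for the line Y-hat = a + b * X."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired observations")
    if len(set(xs)) == 1:
        raise ValueError("all X values are identical; the slope is undefined")

    # Steps 1-2: build the XY and X^2 columns and sum them.
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n

    # Step 3: slope b from the summed columns.
    b = (sum_xy - n * x_bar * y_bar) / (sum_x2 - n * x_bar ** 2)
    # Step 4: intercept a from the means and the slope.
    a = y_bar - b * x_bar
    return a, b


def predict(a, b, x):
    """Steps 5-6: insert a, b, and X into Y-hat = a + b * X."""
    return a + b * x


# Toy data (hypothetical) lying exactly on the line Y = 1 + 2X.
a, b = fit_least_squares([1, 2, 3, 4], [3, 5, 7, 9])
print(a, b, predict(a, b, 5))
```

Because the toy points lie exactly on a line, the fit recovers an intercept of 1 and a slope of 2, which makes it easy to check the arithmetic by hand.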
Step-by-Step for Regression
Step 1: Build the grid, inserting variables for X and Y.
n    Vehicle                 Cylinders (X)   City MPG (Y)
1    Audi A4 Avant           4               22
2    Audi A8                 8               17
3    BMW 325                 6               20
4    BMW 525                 6               20
5    BMW 645                 8               17
6    Dodge Charger           6               19
7    Dodge Magnum            6               21
8    Ford Explorer           6               15
9    Ford Mustang            6               19
10   Honda Civic             4               30
11   Honda Odyssey           6               19
12   Mazda 3                 4               28
13   Mercedes Benz E-Class   6               19
14   Nissan Titan            8               14
15   Nissan Xterra           6               16
16   Scion xB                4               30
17   Toyota Tacoma           4               21
Step 2: Using the Xs and Ys, calculate XY, X², and Y².
n    Vehicle                 Cylinders (X)   City MPG (Y)   XY    X²   Y²
1    Audi A4 Avant           4               22             88    16   484
2    Audi A8                 8               17             136   64   289
3    BMW 325                 6               20             120   36   400
4    BMW 525                 6               20             120   36   400
5    BMW 645                 8               17             136   64   289
6    Dodge Charger           6               19             114   36   361
7    Dodge Magnum            6               21             126   36   441
8    Ford Explorer           6               15             90    36   225
9    Ford Mustang            6               19             114   36   361
10   Honda Civic             4               30             120   16   900
11   Honda Odyssey           6               19             114   36   361
12   Mazda 3                 4               28             112   16   784
13   Mercedes Benz E-Class   6               19             114   36   361
14   Nissan Titan            8               14             112   64   196
15   Nissan Xterra           6               16             96    36   256
16   Scion xB                4               30             120   16   900
17   Toyota Tacoma           4               21             84    16   441
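The Step 2 calculation can be reproduced with a short Python sketch. The list names `cylinders` and `city_mpg` and the `rows` structure are assumptions chosen for illustration; the values are the X and Y columns from the Step 1 grid, in row order.

```python
# X (cylinders) and Y (city MPG) for the 17 vehicles, in row order.
cylinders = [4, 8, 6, 6, 8, 6, 6, 6, 6, 4, 6, 4, 6, 8, 6, 4, 4]
city_mpg = [22, 17, 20, 20, 17, 19, 21, 15, 19, 30, 19, 28, 19, 14, 16, 30, 21]

# Derive the XY, X^2, and Y^2 columns for every row.
rows = [
    {"X": x, "Y": y, "XY": x * y, "X2": x * x, "Y2": y * y}
    for x, y in zip(cylinders, city_mpg)
]

for n, row in enumerate(rows, start=1):
    print(n, row["X"], row["Y"], row["XY"], row["X2"], row["Y2"])
```

Each printed line should match the corresponding row of the Step 2 grid, which is a quick way to catch a multiplication slip in a hand-built table.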
Step 3: Find the sum of each column.
n    Vehicle                 Cylinders (X)   City MPG (Y)   XY    X²   Y²
1    Audi A4 Avant           4               22             88    16   484
2    Audi A8                 8               17             136   64   289
3    BMW 325                 6               20             120   36   400
4    BMW 525                 6               20             120   36   400
5    BMW 645                 8               17             136   64   289
6    Dodge Charger           6               19             114   36   361
7    Dodge Magnum            6               21             126   36   441
8    Ford Explorer           6               15             90    36   225
9    Ford Mustang            6               19             114   36   361
10   Honda Civic             4               30             120   16   900
11   Honda Odyssey           6               19             114   36   361
12   Mazda 3                 4               28             112   16   7 ...
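Summing the columns is straightforward to sketch in Python. The variable names below are assumptions for illustration, and the sums are computed here from the X and Y values in the Step 1 grid rather than quoted from the original table.

```python
# X (cylinders) and Y (city MPG) for the 17 vehicles, in row order.
cylinders = [4, 8, 6, 6, 8, 6, 6, 6, 6, 4, 6, 4, 6, 8, 6, 4, 4]
city_mpg = [22, 17, 20, 20, 17, 19, 21, 15, 19, 30, 19, 28, 19, 14, 16, 30, 21]

n = len(cylinders)
sum_x = sum(cylinders)
sum_y = sum(city_mpg)
sum_xy = sum(x * y for x, y in zip(cylinders, city_mpg))
sum_x2 = sum(x * x for x in cylinders)
sum_y2 = sum(y * y for y in city_mpg)

print(f"n = {n}")
print(f"sum X = {sum_x}, sum Y = {sum_y}")
print(f"sum XY = {sum_xy}, sum X^2 = {sum_x2}, sum Y^2 = {sum_y2}")
```

For these 17 rows the sketch gives ΣX = 98, ΣY = 347, ΣXY = 1916, ΣX² = 596, and ΣY² = 7449.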
Step 6: Solve for a, using the information in the data grid.

$$a = \bar{Y} - b\bar{X}$$

Here b is the slope obtained from the column sums:

$$b = \frac{\sum XY - n\bar{X}\bar{Y}}{\sum X^2 - n\bar{X}^2}$$

Step 7: Put a and b into the regression equation.

$$\hat{Y} = a + bX$$

Step 8: To solve for $\hat{Y}$, insert X.
If X = 6 (cylinders), then what does $\hat{Y}$ (MPG) equal?
For a 6-cylinder vehicle (X = 6), $\hat{Y}$ = 19.79 MPG.
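The final steps can be checked numerically with a small Python sketch. The column sums used here (n = 17, ΣX = 98, ΣY = 347, ΣXY = 1916, ΣX² = 596) are an assumption in the sense that they were obtained by adding up the Step 1 and Step 2 grid values; the variable names are likewise illustrative.

```python
# Column sums for the 17 vehicles (added up from the data grid; see lead-in).
n = 17
sum_x, sum_y = 98, 347
sum_xy, sum_x2 = 1916, 596

# Means of X and Y.
x_bar = sum_x / n
y_bar = sum_y / n

# Slope b, then intercept a.
b = (sum_xy - n * x_bar * y_bar) / (sum_x2 - n * x_bar ** 2)
a = y_bar - b * x_bar

# Predicted city MPG for a 6-cylinder vehicle.
y_hat = a + b * 6

print(f"b = {b:.4f}, a = {a:.4f}, Y-hat(6) = {y_hat:.2f}")
```

Carried at full precision, this gives b ≈ -2.716, a ≈ 36.068, and a prediction of about 19.77 MPG for X = 6. That is slightly below the 19.79 quoted above; a gap of this size is consistent with rounding intermediate values in a hand calculation, although the intermediate values themselves are not shown in the source.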