1. PRESENTATION ON REGRESSION ANALYSIS
PRESENTED BY
NAME: MD. ROKON MIA
ID: 1905031, REG. NO : 000012759
2nd YEAR 2nd SEMESTER
SESSION: 2019-20
DEPT. OF CSE, BRUR
PRESENTED TO
MARJIA SULTANA
Lecturer
Department of Computer Science & Engineering
Begum Rokeya University, Rangpur
2. What is Regression Analysis?
Regression analysis is a set of statistical methods for estimating the relationship between a dependent variable and one or more independent variables. It is also known as curve fitting.
In other words, it is "the process of estimating relationships between dependent and independent variables".
It can be used to assess the strength of the relationship between variables and to model the future relationship between them.
Suppose the values of y for different values of x are given. If we want to know the effect of x on y, we can write a functional relationship
$$y = f(x)$$
where x = independent variable
and y = dependent variable.
This relationship may be either linear or non-linear.
3. Types of Regression:
There are several types of regression analysis, such as
1. Linear Regression
2. Non-linear Regression
4. Determining regression line:
The regression line, or line of best fit, is an output of regression analysis that represents the relationship between two or more variables in a data set.
Suppose the relationship between the dependent variable $y$ and the independent variable $x$ is $y = f(x)$. Then for $x = x_i$ the observed value will be
$$y_i = f(x_i) + e_i \quad (\text{where } e_i = \text{error})$$
There are various approaches to determining the regression line, such as the following:
1. Minimise the sum of errors, i.e., minimise
$$\sum e_i = \sum \left( y_i - f(x_i) \right)$$
2. Minimise the sum of absolute values of errors, i.e.,
$$\sum |e_i| = \sum \left| y_i - f(x_i) \right|$$
3. Minimise the sum of squares of errors, i.e.,
$$\sum e_i^2 = \sum \left( y_i - f(x_i) \right)^2$$
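The difference between the three criteria can be seen on a small data set. The sketch below is only an illustration: it reuses the five points from Example 1 of this presentation and compares the line y = 2 + x with a deliberately poor horizontal line (the names `criteria`, `good_line` and `flat_line` are made up for this example).

```python
# Data taken from Example 1 of this presentation.
xs = [1, 2, 3, 4, 5]
ys = [3, 4, 5, 6, 7]

def criteria(f, xs, ys):
    """Return (sum of errors, sum of |errors|, sum of squared errors) for model f."""
    errors = [y - f(x) for x, y in zip(xs, ys)]
    return (sum(errors), sum(abs(e) for e in errors), sum(e * e for e in errors))

good_line = lambda x: 2 + x   # passes through every data point
flat_line = lambda x: 5       # hypothetical poor fit: horizontal line through the mean of y

good = criteria(good_line, xs, ys)
flat = criteria(flat_line, xs, ys)
print("good line:", good)   # (0, 0, 0)
print("flat line:", flat)   # (0, 6, 10)
```

Both lines have a sum of errors of zero because positive and negative errors cancel, so criterion 1 cannot tell them apart. Criteria 2 and 3 do separate them, and criterion 3 has the extra advantage of being differentiable.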
5. Least squares regression:
Definition:
The technique of minimising the sum of squared errors is known as least squares regression.
We use this technique to generate the equation of the fitted regression line.
In this technique we use a necessary condition: with $Q = \sum e_i^2$, for each parameter $p$ of the model,
$$\frac{\partial Q}{\partial p} = 0 \quad (\text{where } p \text{ is a parameter of } f, \text{ such as the intercept or the slope})$$
6. Linear Regression:
Definition: Linear regression is an approach to modelling the relationship between a scalar response and one or more explanatory variables.
The regression equation is estimated on the basis of the straight-line equation
$$y = mx + c$$
There are two types of linear regression:
1. Simple linear regression analysis (also called polynomial regression of degree 1).
2. Multiple linear regression analysis.
7. Uses of Linear Regression:
Linear regression is relatively simple and provides an easy-to-interpret mathematical formula that can generate predictions.
Linear regression can be applied to various areas in business and academic study. It is used in fields such as the following:
1. Biological
2. Environmental
3. Social science
8. Simple Linear Regression Analysis
The relationship between a dependent variable and a single independent variable is known as simple linear regression.
Let us consider the mathematical equation of a straight line
$$y = a + bx = f(x)$$
to describe the data, where $a$ is the intercept of the line and $b$ is its slope.
Now consider a point $(x_i, y_i)$. The vertical distance of the point from the line $f(x) = a + bx$ is the error $q_i$:
$$q_i = y_i - f(x_i) = y_i - a - bx_i$$
The sum of the squares of the errors is
$$Q = \sum_{i=1}^{n} q_i^2 = \sum_{i=1}^{n} (y_i - a - bx_i)^2$$
In the method of least squares, we choose $a$ and $b$ such that $Q$ is minimum. Since $Q$ depends on $a$ and $b$, a necessary condition for $Q$ to be minimum is
$$\frac{\partial Q}{\partial a} = 0 \quad \text{and} \quad \frac{\partial Q}{\partial b} = 0$$
9. Simple Linear Regression Analysis (Cont.)
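Then
$$\frac{\partial Q}{\partial a} = -2 \sum_{i=1}^{n} (y_i - a - bx_i) = 0$$
$$\frac{\partial Q}{\partial b} = -2 \sum_{i=1}^{n} x_i (y_i - a - bx_i) = 0$$
Thus
$$\sum y_i = na + b \sum x_i \qquad (1)$$
$$\sum x_i y_i = a \sum x_i + b \sum x_i^2 \qquad (2)$$
Now multiplying both sides of (1) by $\sum x_i$ and (2) by $n$, and subtracting, we get
$$\sum x_i \sum y_i - n \sum x_i y_i = b \left[ \left( \sum x_i \right)^2 - n \sum x_i^2 \right]$$
so that
$$b = \frac{n \sum x_i y_i - \sum x_i \sum y_i}{n \sum x_i^2 - \left( \sum x_i \right)^2}$$
$$a = \frac{\sum y_i}{n} - b \, \frac{\sum x_i}{n} = \bar{y} - b\bar{x}$$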
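The closed-form expressions for a and b translate directly into a few lines of code. The following is a minimal sketch, not a library routine: the function name `fit_line` and the test data are invented for illustration.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xx = sum(x * x for x in xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    denom = n * sum_xx - sum_x ** 2
    if denom == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    b = (n * sum_xy - sum_x * sum_y) / denom   # slope
    a = sum_y / n - b * sum_x / n              # intercept = mean(y) - b * mean(x)
    return a, b

# Illustrative data (made up for this sketch): points lying exactly on y = 1 + 2x.
a, b = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
print(a, b)  # 1.0 2.0
```

The guard on `denom` covers the case where every x value is the same, in which the denominator of the slope formula is zero and no unique line exists.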
10. Simple Linear Regression Analysis (Cont.)
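If $y = a - bx$, then the total squared error is
$$Q = \sum_{i=1}^{n} (y_i - a + bx_i)^2$$
Then
$$\frac{\partial Q}{\partial a} = -2 \sum_{i=1}^{n} (y_i - a + bx_i) = 0$$
$$\frac{\partial Q}{\partial b} = 2 \sum_{i=1}^{n} x_i (y_i - a + bx_i) = 0$$
Thus
$$\sum y_i = na - b \sum x_i \qquad (1)$$
$$\sum x_i y_i = a \sum x_i - b \sum x_i^2 \qquad (2)$$
Now multiplying both sides of (1) by $\sum x_i$ and (2) by $n$, and subtracting, we get
$$\sum x_i \sum y_i - n \sum x_i y_i = b \left[ n \sum x_i^2 - \left( \sum x_i \right)^2 \right]$$
so that
$$b = \frac{n \sum x_i y_i - \sum x_i \sum y_i}{\left( \sum x_i \right)^2 - n \sum x_i^2}$$
$$a = \frac{\sum y_i}{n} + b \, \frac{\sum x_i}{n} = \bar{y} + b\bar{x}$$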
11. Simple Linear Regression Analysis (Cont.)
Example 1:
Fit a straight line to the following set of data and find the value of $y$ for $x = 10$.
x    1    2    3    4    5
y    3    4    5    6    7
Solution:
The various summations are given as follows:
$x_i$            $y_i$            $x_i^2$            $x_i y_i$
1                3                1                  3
2                4                4                  8
3                5                9                  15
4                6                16                 24
5                7                25                 35
$\sum x_i = 15$  $\sum y_i = 25$  $\sum x_i^2 = 55$  $\sum x_i y_i = 85$
12. Simple Linear Regression Analysis (Cont.)
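Using the equations
$$b = \frac{n \sum x_i y_i - \sum x_i \sum y_i}{n \sum x_i^2 - \left( \sum x_i \right)^2} = \frac{5 \times 85 - 15 \times 25}{5 \times 55 - 15^2} = 1$$
$$a = \frac{\sum y_i}{n} - b \, \frac{\sum x_i}{n} = \frac{25}{5} - \frac{1 \times 15}{5} = 2$$
Therefore the linear equation is
$$y = 2 + x$$
If $x = 10$ then $y = 2 + 10 = 12$ (ans)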
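The hand calculation of Example 1 can be checked with a short script. This is only a sketch; it uses exact fractions so that no rounding is involved, and the variable names are chosen for this illustration.

```python
from fractions import Fraction

xs = [1, 2, 3, 4, 5]
ys = [3, 4, 5, 6, 7]
n = len(xs)

# Column sums, matching the table in Example 1.
sum_x = sum(xs)
sum_y = sum(ys)
sum_xx = sum(x * x for x in xs)
sum_xy = sum(x * y for x, y in zip(xs, ys))
print(sum_x, sum_y, sum_xx, sum_xy)  # 15 25 55 85

# Slope and intercept from the least-squares formulas.
b = Fraction(n * sum_xy - sum_x * sum_y, n * sum_xx - sum_x ** 2)
a = Fraction(sum_y, n) - b * Fraction(sum_x, n)
y_at_10 = a + b * 10
print(a, b, y_at_10)  # 2 1 12
```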
Example 2: Apartment rents. Data for one-bedroom apartment rent (x) and three-bedroom apartment rent (y), in dollars per month, are as follows:
One BR (x)      553    578    891    773    812    509
Three BR (y)    1078   916    1577   1234   1403   857
Find y when x = $700 per month.
13. Simple Linear Regression Analysis (Cont.)
Solution:
The various summations are given as follows:
One BR ($x_i$)     Three BR ($y_i$)   $x_i^2$                 $x_i y_i$
553                1078               305809                  596134
578                916                334084                  529448
891                1577               793881                  1405107
773                1234               597529                  953882
812                1403               659344                  1139236
509                857                259081                  436213
$\sum x_i = 4116$  $\sum y_i = 7065$  $\sum x_i^2 = 2949728$  $\sum x_i y_i = 5060020$
Using the formulas we get
$$b = \frac{6 \times 5060020 - 4116 \times 7065}{6 \times 2949728 - 4116^2} = \frac{1280580}{756912} \approx 1.691848$$
$$a = \frac{7065}{6} - \frac{1.691848 \times 4116}{6} \approx 16.892$$
Therefore the straight-line equation is $y = 16.892 + 1.691848x$
If $x = 700$ then $y = 16.892 + 1.691848 \times 700 \approx 1201.19$ (ans)
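With six-digit products, a script is a convenient check on the column sums and on the fitted line. The sketch below recomputes everything directly from the rows of the solution table; the variable names are chosen for this illustration.

```python
# Rents from the solution table: one-bedroom (x) and three-bedroom (y), dollars per month.
one_br = [553, 578, 891, 773, 812, 509]
three_br = [1078, 916, 1577, 1234, 1403, 857]
n = len(one_br)

# Column sums recomputed from the individual rows.
sum_x = sum(one_br)
sum_y = sum(three_br)
sum_xx = sum(x * x for x in one_br)
sum_xy = sum(x * y for x, y in zip(one_br, three_br))

# Least-squares slope and intercept for y = a + b*x.
b = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)
a = sum_y / n - b * sum_x / n
predicted = a + b * 700

print("sums:", sum_x, sum_y, sum_xx, sum_xy)
print(f"slope b = {b:.6f}, intercept a = {a:.3f}")
print(f"predicted three-bedroom rent at x = 700: {predicted:.2f}")
```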
15. Multiple Linear Regression Analysis:
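A regression in which a relationship exists between a dependent variable and more than one independent variable is known as multiple regression.
The multiple linear equation is expressed as
$$y = a_1 + a_2 x_1 + a_3 x_2 + \cdots + a_{m+1} x_m$$
Let us consider a linear function of two independent variables $(x, z)$ as follows:
$$y = a_1 + a_2 x + a_3 z$$
Applying least squares regression we have
$$n a_1 + \left( \sum x_i \right) a_2 + \left( \sum z_i \right) a_3 = \sum y_i$$
$$\left( \sum x_i \right) a_1 + \left( \sum x_i^2 \right) a_2 + \left( \sum x_i z_i \right) a_3 = \sum x_i y_i$$
$$\left( \sum z_i \right) a_1 + \left( \sum x_i z_i \right) a_2 + \left( \sum z_i^2 \right) a_3 = \sum z_i y_i$$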
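For completeness, the three equations above come from the same necessary condition used in simple linear regression. A sketch of the intermediate step, with $Q$ denoting the sum of squared errors as before:

```latex
\begin{align*}
Q &= \sum_{i=1}^{n} \left( y_i - a_1 - a_2 x_i - a_3 z_i \right)^2 \\
\frac{\partial Q}{\partial a_1} &= -2 \sum_{i=1}^{n} \left( y_i - a_1 - a_2 x_i - a_3 z_i \right) = 0 \\
\frac{\partial Q}{\partial a_2} &= -2 \sum_{i=1}^{n} x_i \left( y_i - a_1 - a_2 x_i - a_3 z_i \right) = 0 \\
\frac{\partial Q}{\partial a_3} &= -2 \sum_{i=1}^{n} z_i \left( y_i - a_1 - a_2 x_i - a_3 z_i \right) = 0
\end{align*}
```

Dividing each condition by $-2$ and moving the terms containing $a_1$, $a_2$ and $a_3$ to one side gives the three simultaneous equations.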
16. Multiple Linear Regression Analysis:
Example: Obtain a regression plane for the given data.
$x_1$    5     4     3     2     1
$x_2$    3     -2    -1    4     0
$y$      15    -8    -1    26    8
Solution:
The given data table uses two independent variables.
The various sums of powers and products are given in tabular form below:
$x_1$    $x_2$    $y$     $x_1^2$    $x_2^2$    $x_1 x_2$    $y x_1$    $y x_2$
5        3        15      25         9          15           75         45
4        -2       -8      16         4          -8           -32        16
3        -1       -1      9          1          -3           -3         1
2        4        26      4          16         8            52         104
1        0        8       1          0          0            8          0
$\sum x_1 = 15$, $\sum x_2 = 4$, $\sum y = 40$, $\sum x_1^2 = 55$, $\sum x_2^2 = 30$, $\sum x_1 x_2 = 12$, $\sum y x_1 = 100$, $\sum y x_2 = 166$
17. Multiple Linear Regression Analysis:
Substituting these values into the least squares equations, we get
$$5a_1 + 15a_2 + 4a_3 = 40$$
$$15a_1 + 55a_2 + 12a_3 = 100$$
$$4a_1 + 12a_2 + 30a_3 = 166$$
Solving gives $a_1 = 10$, $a_2 = -2$ and $a_3 = 5$.
The multiple linear equation is
$$y = 10 - 2x_1 + 5x_2$$
(Ans)
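The same steps (form the sums, build the simultaneous equations, solve them) can be sketched in code. The version below is an illustration only: `solve_linear_system` and `dot` are helper names invented here, and exact fractions are used so the result is not affected by rounding.

```python
from fractions import Fraction

def solve_linear_system(matrix, rhs):
    """Solve matrix * x = rhs by Gauss-Jordan elimination with exact fractions."""
    n = len(rhs)
    # Augmented matrix [matrix | rhs] with Fraction entries.
    aug = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(matrix, rhs)]
    for col in range(n):
        # Find a row with a non-zero entry in this column and move it into place.
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            raise ValueError("the system of equations is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        # Scale the pivot row to 1, then clear the column in every other row.
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]

def dot(u, v):
    """Sum of element-wise products of two equal-length lists."""
    return sum(p * q for p, q in zip(u, v))

# Data from the example.
x1 = [5, 4, 3, 2, 1]
x2 = [3, -2, -1, 4, 0]
y = [15, -8, -1, 26, 8]
n = len(y)

# Coefficient matrix and right-hand side of the least squares equations.
normal_matrix = [
    [n,       sum(x1),     sum(x2)],
    [sum(x1), dot(x1, x1), dot(x1, x2)],
    [sum(x2), dot(x1, x2), dot(x2, x2)],
]
normal_rhs = [sum(y), dot(x1, y), dot(x2, y)]

a1, a2, a3 = solve_linear_system(normal_matrix, normal_rhs)
residuals = [yi - (a1 + a2 * u + a3 * v) for u, v, yi in zip(x1, x2, y)]
print(a1, a2, a3)
print([str(r) for r in residuals])
```

Printing the residuals is a useful sanity check: for a fitted plane they should be small, and they are all zero when the data lie exactly on a plane.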
18. Non-linear Regression:
Definition: Nonlinear regression refers to a regression analysis where the regression model portrays a nonlinear relationship between the dependent and independent variables.
The regression equation is estimated on the basis of a non-linear equation such as
$$ax^2 + by^2 = c, \quad y = ax^b, \quad y = ae^{bx}, \quad y = a_1 + a_2 x + a_3 x^2 + \cdots + a_m x^{m-1}, \quad \text{etc.}$$
There are two types of non-linear regression:
1. Transcendental regression analysis.
2. Polynomial (degree more than 1) regression analysis.
19. Uses of non-linear regression analysis:
Non-linear regression has many uses, such as
1. Determining population.
2. Determining the number of undecayed atoms in radioactive decay.
3. Determining the level of toxins in the environment.
4. Determining the area of deforestation, etc.
20. Transcendental Regression Analysis
We know that if a function is not algebraic, it is called a transcendental function. Transcendental functions generally consist of trigonometric, exponential and logarithmic terms, etc.
Now consider a transcendental function
$$y = ae^{bx} \qquad (1)$$
Taking the natural logarithm ($\ln$) of both sides, we get
$$\ln y = \ln a + bx \qquad (2)$$
This equation is similar to the straight-line equation
$$Y = A + BX \qquad (3)$$
where $A = \ln a$, $X = x$, $Y = \ln y$ and $B = b$.
So according to simple linear regression we get
$$B = \frac{n \sum X_i Y_i - \sum X_i \sum Y_i}{n \sum X_i^2 - \left( \sum X_i \right)^2}$$
$$b = \frac{n \sum x_i \ln y_i - \sum x_i \sum \ln y_i}{n \sum x_i^2 - \left( \sum x_i \right)^2}$$
$$\ln a = \frac{\sum \ln y_i}{n} - b \, \frac{\sum x_i}{n}$$
21. Transcendental Regression Analysis
Example: Evaluate the number of atoms at $t = 30$ from the following data table.
Time ($t$)               0.4    0.8    1.2    1.6    2.0    2.4
Number of atoms ($N$)    100    80     70     65     62     59
Solution:
We know that the decay equation is $N = N_0 e^{-\lambda t}$.
Taking $\ln$ of both sides, we get
$$\ln N = \ln N_0 - \lambda t \qquad (1)$$
The required summations are given as follows:
Time ($t_i$)      Number of atoms ($N_i$)   $t_i^2$               $\ln N_i$                 $t_i \ln N_i$
0.4               100                       0.16                  4.60517                   1.84202
0.8               80                        0.64                  4.38202                   3.50562
1.2               70                        1.44                  4.24849                   5.09819
1.6               65                        2.56                  4.17438                   6.67901
2.0               62                        4.00                  4.12713                   8.25426
2.4               59                        5.76                  4.07753                   9.78608
$\sum t_i = 8.4$  $\sum N_i = 436$          $\sum t_i^2 = 14.56$  $\sum \ln N_i = 25.61472$  $\sum t_i \ln N_i = 35.16518$
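The linearised fit can be carried through numerically as follows. This is a sketch under the assumption that the straight-line formulas for $B$ and $A$ are applied with $X = t$ and $Y = \ln N$, so that $\lambda = -B$ and $N_0 = e^{A}$; the names `decay_constant`, `n0` and `predict_atoms` are chosen for this illustration.

```python
import math

times = [0.4, 0.8, 1.2, 1.6, 2.0, 2.4]
atoms = [100, 80, 70, 65, 62, 59]
n = len(times)

# Linearise: Y = ln N and X = t, so that Y = ln N0 - lambda * t.
log_atoms = [math.log(count) for count in atoms]

sum_t = sum(times)
sum_tt = sum(t * t for t in times)
sum_Y = sum(log_atoms)
sum_tY = sum(t * Y for t, Y in zip(times, log_atoms))

# Straight-line least squares on (t, ln N).
slope = (n * sum_tY - sum_t * sum_Y) / (n * sum_tt - sum_t ** 2)
intercept = sum_Y / n - slope * sum_t / n

decay_constant = -slope        # lambda
n0 = math.exp(intercept)       # N0

def predict_atoms(t):
    """Number of atoms predicted by the fitted model N = N0 * exp(-lambda * t)."""
    return n0 * math.exp(-decay_constant * t)

print(f"lambda = {decay_constant:.5f}, N0 = {n0:.2f}")
print(f"N(30) = {predict_atoms(30):.4f}")
```

Note that this approach minimises the squared error in $\ln N$ rather than in $N$ itself, so the result can differ slightly from a direct non-linear fit.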
26. Polynomial Regression Analysis:
Definition: Polynomial regression is a form of linear regression (the model is linear in its coefficients) in which the relationship between the independent variable x and the dependent variable y is modelled as an n-th degree polynomial.
Consider a polynomial of degree $m - 1$:
$$y = a_1 + a_2 x + a_3 x^2 + \cdots + a_m x^{m-1} = \sum_{i=1}^{m} a_i x^{i-1} = f(x)$$
If the data contain n sets of x and y values, then
$$Q = \sum_{i=1}^{n} \left[ y_i - f(x_i) \right]^2$$
Applying least squares regression we get
$$\sum_{i=1}^{n} y_i x_i^{j-1} = \sum_{i=1}^{n} x_i^{j-1} \left( a_1 + a_2 x_i + a_3 x_i^2 + \cdots + a_m x_i^{m-1} \right)$$
These are m equations ($j = 1, 2, 3, \ldots, m$). Each one is obtained by setting the derivative of Q with respect to one $a_j$ to zero.
We can estimate the polynomial by solving for the values of the constants $(a_1, a_2, \ldots, a_m)$.
27. Polynomial Regression Analysis:
Fit a third-order polynomial to the data in the table below:
x    1.0    2.0     3.0     4.0
y    6.0    11.0    18.0    27.0
Solution:
The order of the polynomial is 3 and therefore we will have 4 simultaneous equations as shown below:
$$\sum y_i = a_1 n + a_2 \sum x_i + a_3 \sum x_i^2 + a_4 \sum x_i^3$$
$$\sum y_i x_i = a_1 \sum x_i + a_2 \sum x_i^2 + a_3 \sum x_i^3 + a_4 \sum x_i^4$$
$$\sum y_i x_i^2 = a_1 \sum x_i^2 + a_2 \sum x_i^3 + a_3 \sum x_i^4 + a_4 \sum x_i^5$$
$$\sum y_i x_i^3 = a_1 \sum x_i^3 + a_2 \sum x_i^4 + a_3 \sum x_i^5 + a_4 \sum x_i^6$$
The various sums of powers and products can be evaluated in tabular form as below:
$x_i$    $y_i$    $x_i^2$    $x_i^3$    $x_i^4$    $x_i^5$    $x_i^6$    $y_i x_i$    $y_i x_i^2$    $y_i x_i^3$
1.0      6.0      1.0        1.0        1.0        1.0        1.0        6            6              6
2.0      11.0     4.0        8.0        16.0       32.0       64.0       22           44             88
3.0      18.0     9.0        27.0       81.0       243.0      729.0      54           162            486
4.0      27.0     16.0       64.0       256.0      1024.0     4096.0     108          432            1728
$\sum x_i = 10.0$, $\sum y_i = 62.0$, $\sum x_i^2 = 30.0$, $\sum x_i^3 = 100.0$, $\sum x_i^4 = 354.0$, $\sum x_i^5 = 1300.0$, $\sum x_i^6 = 4890.0$, $\sum y_i x_i = 190.0$, $\sum y_i x_i^2 = 644$, $\sum y_i x_i^3 = 2308$
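Once the sums are known, the four simultaneous equations can be solved by elimination. The sketch below builds the equations from the raw data and solves them with exact fractions; the function name `polyfit_normal_equations` is invented for this illustration and is not a library routine.

```python
from fractions import Fraction

def polyfit_normal_equations(xs, ys, degree):
    """Least-squares polynomial coefficients [a_1, ..., a_m], lowest power first (m = degree + 1)."""
    m = degree + 1
    xs = [Fraction(x) for x in xs]
    ys = [Fraction(y) for y in ys]
    # power_sums[k] = sum of x_i^k for k = 0 .. 2*degree (power_sums[0] is n).
    power_sums = [sum(x ** k for x in xs) for k in range(2 * degree + 1)]
    # Row j of the augmented matrix: sums of x^(j+k) for k = 0..m-1, then sum of y * x^j.
    aug = [
        [power_sums[j + k] for k in range(m)] + [sum(y * x ** j for x, y in zip(xs, ys))]
        for j in range(m)
    ]
    # Gauss-Jordan elimination.
    for col in range(m):
        pivot = next((r for r in range(col, m) if aug[r][col] != 0), None)
        if pivot is None:
            raise ValueError("equations are singular; more distinct x values are needed")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(m):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]

coeffs = polyfit_normal_equations([1, 2, 3, 4], [6, 11, 18, 27], degree=3)
print([str(c) for c in coeffs])  # ['3', '2', '1', '0']
```

With four data points and four coefficients the cubic passes through every point. Here the cubic coefficient comes out as zero, so the fitted curve is the quadratic $y = 3 + 2x + x^2$.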
29. Why is Regression Analysis important?
Regression has a wide range of real-life applications. It is essential for any machine learning problem that involves continuous numbers. Examples include, but are not limited to:
• Financial forecasting (like house price estimates, or stock prices)
• Sales and promotions forecasting
• Testing automobiles
• Weather analysis and prediction
• Time series forecasting
• Science (radioactivity, dependence of the velocity of sound on temperature)
• Sociology
• Statistics (determining population, rate of growth, etc.)
• Economics (determining product demand, product prices, economic statistics)