ENEM602 Spring 2007
Dr. Eng. Mohammad Tawfik
Regression/Curve Fitting
Objectives
• Understanding the difference between
regression and interpolation
• Knowing how to “best fit” a polynomial to
a set of data
• Knowing how to use a polynomial to
interpolate data
Measured Data
Polynomial Fit!
Line Fit!
Which is better?
Curve Fitting
• If the measured data are highly accurate
and values of the function are needed
between the given points, polynomial
interpolation is the better choice.
• If the measurements are expected to be of
low accuracy, or the number of measured
points is very large, regression is the
better choice.
Regression
Why Regression?
• Measurements that we get from real
situations are not usually consistent!
• The number of “pieces” of information that
we can get about a certain project is
HUGE
• You can NEVER measure exact values!
Measured Data
But how do we get the equation of a
line that is “good” for all the data
we have?
Equation of a Line: Revision
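$$ y = a_0 + a_1 x $$
If you have two points
$$ y_1 = a_0 + a_1 x_1 $$
$$ y_2 = a_0 + a_1 x_2 $$
$$ \begin{bmatrix} 1 & x_1 \\ 1 & x_2 \end{bmatrix} \begin{Bmatrix} a_0 \\ a_1 \end{Bmatrix} = \begin{Bmatrix} y_1 \\ y_2 \end{Bmatrix} $$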
Solving for the constants!
$$ a_0 = \frac{x_2 y_1 - x_1 y_2}{x_2 - x_1} \quad \& \quad a_1 = \frac{y_2 - y_1}{x_2 - x_1} $$
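As a minimal sketch of the two-point formulas above, the function below returns the constants of the line through two points. The function name `line_through` and the sample points are illustrative assumptions, not part of the lecture.

```python
def line_through(p1, p2):
    """Return (a0, a1) of the line y = a0 + a1*x through two points."""
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2:
        # The 2x2 system is singular when both points share the same x.
        raise ValueError("x1 and x2 must differ")
    a0 = (x2 * y1 - x1 * y2) / (x2 - x1)
    a1 = (y2 - y1) / (x2 - x1)
    return a0, a1


# Hypothetical sample points (1, 3) and (3, 7), which lie on y = 1 + 2x.
a0, a1 = line_through((1.0, 3.0), (3.0, 7.0))
print(a0, a1)  # 1.0 2.0
```

With exactly two points the line reproduces both of them, which is why the question of a "best" line only arises once there are more points than unknowns.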
What if I have more than two
points?
For every point
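$$ \begin{aligned} y_1 &\neq a_0 + a_1 x_1 \\ y_2 &\neq a_0 + a_1 x_2 \\ &\;\;\vdots \\ y_n &\neq a_0 + a_1 x_n \end{aligned} $$
$$ \begin{Bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{Bmatrix} \neq \begin{bmatrix} 1 & x_1 \\ 1 & x_2 \\ \vdots & \vdots \\ 1 & x_n \end{bmatrix} \begin{Bmatrix} a_0 \\ a_1 \end{Bmatrix} $$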
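So, we may write the error vector
$$ \begin{Bmatrix} e_1 \\ e_2 \\ \vdots \\ e_n \end{Bmatrix} = \begin{Bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{Bmatrix} - \begin{bmatrix} 1 & x_1 \\ 1 & x_2 \\ \vdots & \vdots \\ 1 & x_n \end{bmatrix} \begin{Bmatrix} a_0 \\ a_1 \end{Bmatrix} $$
$$ \{e\}_{n \times 1} = \{y\}_{n \times 1} - [A]_{n \times 2} \{a\}_{2 \times 1} $$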
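The error vector can be sketched in a few lines: each entry is the measured value minus the value the candidate line predicts. The function name `residuals`, the three data points, and the trial constants are assumptions chosen only for illustration.

```python
def residuals(xs, ys, a0, a1):
    """Return e_i = y_i - (a0 + a1*x_i) for every data point."""
    return [y - (a0 + a1 * x) for x, y in zip(xs, ys)]


# Hypothetical data and a trial line y = 0 + 1*x (not from the lecture).
xs = [1.0, 2.0, 3.0]
ys = [1.0, 3.0, 2.0]
e = residuals(xs, ys, a0=0.0, a1=1.0)
print(e)  # [0.0, 1.0, -1.0]
```

No single line passes through all three points here, so at least one entry of the error vector is nonzero for any choice of the constants.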
Square the error
$$ \{e\} = \{y\} - [A]\{a\} $$
$$ \{e\}^T\{e\} = \{y\}^T\{y\} - \{y\}^T[A]\{a\} - \{a\}^T[A]^T\{y\} + \{a\}^T[A]^T[A]\{a\} = e^2 $$
Note: this is a scalar equation!
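$$ \{e\}^T\{e\} = \{y\}^T\{y\} - \{y\}^T[A]\{a\} - \{a\}^T[A]^T\{y\} + \{a\}^T[A]^T[A]\{a\} = e^2 $$
Every term is a scalar, and a scalar equals its own transpose, so
$$ \{y\}^T[A]\{a\} = \{a\}^T[A]^T\{y\} $$
Combining the two middle terms:
$$ e^2 = \{y\}^T\{y\} - 2\{a\}^T[A]^T\{y\} + \{a\}^T[A]^T[A]\{a\} $$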
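As a quick numerical check of the scalar expression for the line case, the sketch below computes the squared error both directly, as a sum of squared residuals, and from the expanded matrix form. The function names and the sample data are illustrative assumptions.

```python
def sum_sq_direct(xs, ys, a0, a1):
    """e^2 as the sum of squared residuals."""
    return sum((y - (a0 + a1 * x)) ** 2 for x, y in zip(xs, ys))


def sum_sq_expanded(xs, ys, a0, a1):
    """e^2 = y^T y - 2 a^T A^T y + a^T A^T A a, with rows of A equal to [1, x_i]."""
    n = len(xs)
    sx = sum(xs)
    sxx = sum(x * x for x in xs)
    yty = sum(y * y for y in ys)
    aty = [sum(ys), sum(x * y for x, y in zip(xs, ys))]  # A^T y
    a_aty = a0 * aty[0] + a1 * aty[1]                    # a^T A^T y
    # a^T (A^T A) a, where A^T A = [[n, sx], [sx, sxx]]
    a_ata_a = a0 * (n * a0 + sx * a1) + a1 * (sx * a0 + sxx * a1)
    return yty - 2.0 * a_aty + a_ata_a


# Hypothetical data and trial constants (not from the lecture).
xs = [1.0, 2.0, 3.0]
ys = [1.0, 3.0, 2.0]
print(sum_sq_direct(xs, ys, 0.0, 1.0))    # 2.0
print(sum_sq_expanded(xs, ys, 0.0, 1.0))  # 2.0
```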
Note: this is a quadratic equation in {a}!!!
$$ e^2 = \{y\}^T\{y\} - 2\{a\}^T[A]^T\{y\} + \{a\}^T[A]^T[A]\{a\} $$
To minimize the error in the above equation, we need to
differentiate with respect to the parameters
$$ \frac{d\,e^2}{d\{a\}} = -2[A]^T\{y\} + 2[A]^T[A]\{a\} = 0 $$
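For the straight-line case the same condition can be written out component by component. This is a sketch under the assumption that each row of $[A]$ is $[1 \;\; x_i]$, as in the error vector defined earlier; the summation form is added here for illustration and does not appear on the slides.

```latex
e^2 = \sum_{i=1}^{n} \left( y_i - a_0 - a_1 x_i \right)^2

\frac{\partial e^2}{\partial a_0} = -2 \sum_{i=1}^{n} \left( y_i - a_0 - a_1 x_i \right) = 0
\quad\Rightarrow\quad
n\,a_0 + a_1 \sum_{i=1}^{n} x_i = \sum_{i=1}^{n} y_i

\frac{\partial e^2}{\partial a_1} = -2 \sum_{i=1}^{n} x_i \left( y_i - a_0 - a_1 x_i \right) = 0
\quad\Rightarrow\quad
a_0 \sum_{i=1}^{n} x_i + a_1 \sum_{i=1}^{n} x_i^2 = \sum_{i=1}^{n} x_i y_i
```

The two resulting equations are the rows of $[A]^T[A]\{a\} = [A]^T\{y\}$ written with sums.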
Solving the equation
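$$ \frac{d\,e^2}{d\{a\}} = -2[A]^T\{y\} + 2[A]^T[A]\{a\} = 0 $$
We get:
$$ [A]^T_{2 \times n} [A]_{n \times 2} \{a\}_{2 \times 1} = [A]^T_{2 \times n} \{y\}_{n \times 1} $$
The product $[A]^T[A]$ is $2 \times 2$ and $[A]^T\{y\}$ is $2 \times 1$, so this is a square system that can be solved for the constants:
$$ \{a\} = \left( [A]^T[A] \right)^{-1} [A]^T\{y\} $$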
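For a straight line the 2x2 system can be solved in closed form. The sketch below assumes the rows of [A] are [1, x_i], so that [A]^T[A] and [A]^T{y} reduce to sums over the data; the function name `fit_line` and the sample data are illustrative.

```python
def fit_line(xs, ys):
    """Least-squares constants (a0, a1) of y = a0 + a1*x from the normal equations."""
    n = len(xs)
    sx = sum(xs)
    sy = sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    # A^T A = [[n, sx], [sx, sxx]] and A^T y = [sy, sxy]
    det = n * sxx - sx * sx
    if det == 0:
        raise ValueError("need at least two distinct x values")
    a0 = (sxx * sy - sx * sxy) / det
    a1 = (n * sxy - sx * sy) / det
    return a0, a1


# Hypothetical noise-free data on y = 1 + 2x: the fit recovers the line.
print(fit_line([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0]))  # (1.0, 2.0)
```

Solving the small system directly avoids forming an explicit inverse, which is usually preferable in numerical work.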
Example
• You are given the data below.
• Find the equation of
the “best-fit” line,
$y = a_0 + a_1 x$.

x    y
1    0.5
2    2.5
3    2
4    4
5    3.5
6    6
7    5.5
Solution
$$ \begin{bmatrix} 1 & 1 \\ 1 & 2 \\ 1 & 3 \\ 1 & 4 \\ 1 & 5 \\ 1 & 6 \\ 1 & 7 \end{bmatrix} \begin{Bmatrix} a_0 \\ a_1 \end{Bmatrix} = \begin{Bmatrix} 0.5 \\ 2.5 \\ 2 \\ 4 \\ 3.5 \\ 6 \\ 5.5 \end{Bmatrix} $$
$$ [A] = \begin{bmatrix} 1 & 1 \\ 1 & 2 \\ 1 & 3 \\ 1 & 4 \\ 1 & 5 \\ 1 & 6 \\ 1 & 7 \end{bmatrix} \quad \& \quad \{y\} = \begin{Bmatrix} 0.5 \\ 2.5 \\ 2 \\ 4 \\ 3.5 \\ 6 \\ 5.5 \end{Bmatrix} $$
Solution
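$$ [A]^T[A] = \begin{bmatrix} 1 & 1 & 1 & 1 & 1 & 1 & 1 \\ 1 & 2 & 3 & 4 & 5 & 6 & 7 \end{bmatrix} \begin{bmatrix} 1 & 1 \\ 1 & 2 \\ 1 & 3 \\ 1 & 4 \\ 1 & 5 \\ 1 & 6 \\ 1 & 7 \end{bmatrix} = \begin{bmatrix} 7 & 28 \\ 28 & 140 \end{bmatrix} $$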
Solution
$$ [A]^T\{y\} = \begin{bmatrix} 1 & 1 & 1 & 1 & 1 & 1 & 1 \\ 1 & 2 & 3 & 4 & 5 & 6 & 7 \end{bmatrix} \begin{Bmatrix} 0.5 \\ 2.5 \\ 2 \\ 4 \\ 3.5 \\ 6 \\ 5.5 \end{Bmatrix} = \begin{Bmatrix} 24 \\ 119.5 \end{Bmatrix} $$
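Solution
$$ [A]^T[A]\{a\} = [A]^T\{y\} $$
$$ \begin{bmatrix} 7 & 28 \\ 28 & 140 \end{bmatrix} \begin{Bmatrix} a_0 \\ a_1 \end{Bmatrix} = \begin{Bmatrix} 24 \\ 119.5 \end{Bmatrix} $$
$$ \begin{Bmatrix} a_0 \\ a_1 \end{Bmatrix} = \begin{Bmatrix} 0.0714 \\ 0.8393 \end{Bmatrix} $$
$$ y = 0.8393x + 0.0714 $$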
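The numbers in this example can be reproduced with a short script. This is a sketch in plain Python that builds [A] from the lecture data, forms the two products, and solves the 2x2 system by Cramer's rule; the variable names are illustrative.

```python
# Data from the lecture example.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
ys = [0.5, 2.5, 2.0, 4.0, 3.5, 6.0, 5.5]

# Each row of A is [1, x_i].
A = [[1.0, x] for x in xs]

# A^T A (2x2) and A^T y (2x1).
ata = [[sum(row[j] * row[k] for row in A) for k in range(2)] for j in range(2)]
aty = [sum(row[j] * y for row, y in zip(A, ys)) for j in range(2)]

# Solve (A^T A) a = A^T y by Cramer's rule.
det = ata[0][0] * ata[1][1] - ata[0][1] * ata[1][0]
a0 = (aty[0] * ata[1][1] - ata[0][1] * aty[1]) / det
a1 = (ata[0][0] * aty[1] - ata[1][0] * aty[0]) / det

print(ata)                         # [[7.0, 28.0], [28.0, 140.0]]
print(aty)                         # [24.0, 119.5]
print(round(a0, 4), round(a1, 4))  # 0.0714 0.8393
```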
Example
• You are given the data below.
• Find the equation of
the “best-fit” parabola,
$y = a_0 + a_1 x + a_2 x^2$.

x    y
1    0.5
2    2.5
3    2
4    4
5    3.5
6    6
7    5.5
Solution
$$ \begin{bmatrix} 1 & 1 & 1 \\ 1 & 2 & 4 \\ 1 & 3 & 9 \\ 1 & 4 & 16 \\ 1 & 5 & 25 \\ 1 & 6 & 36 \\ 1 & 7 & 49 \end{bmatrix} \begin{Bmatrix} a_0 \\ a_1 \\ a_2 \end{Bmatrix} = \begin{Bmatrix} 0.5 \\ 2.5 \\ 2 \\ 4 \\ 3.5 \\ 6 \\ 5.5 \end{Bmatrix} $$
$$ [A] = \begin{bmatrix} 1 & 1 & 1 \\ 1 & 2 & 4 \\ 1 & 3 & 9 \\ 1 & 4 & 16 \\ 1 & 5 & 25 \\ 1 & 6 & 36 \\ 1 & 7 & 49 \end{bmatrix} \quad \& \quad \{y\} = \begin{Bmatrix} 0.5 \\ 2.5 \\ 2 \\ 4 \\ 3.5 \\ 6 \\ 5.5 \end{Bmatrix} $$
Solution
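$$ [A]^T[A] = \begin{bmatrix} 1 & 1 & 1 & 1 & 1 & 1 & 1 \\ 1 & 2 & 3 & 4 & 5 & 6 & 7 \\ 1 & 4 & 9 & 16 & 25 & 36 & 49 \end{bmatrix} \begin{bmatrix} 1 & 1 & 1 \\ 1 & 2 & 4 \\ 1 & 3 & 9 \\ 1 & 4 & 16 \\ 1 & 5 & 25 \\ 1 & 6 & 36 \\ 1 & 7 & 49 \end{bmatrix} = \begin{bmatrix} 7 & 28 & 140 \\ 28 & 140 & 784 \\ 140 & 784 & 4676 \end{bmatrix} $$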
Solution
$$ [A]^T\{y\} = \begin{bmatrix} 1 & 1 & 1 & 1 & 1 & 1 & 1 \\ 1 & 2 & 3 & 4 & 5 & 6 & 7 \\ 1 & 4 & 9 & 16 & 25 & 36 & 49 \end{bmatrix} \begin{Bmatrix} 0.5 \\ 2.5 \\ 2 \\ 4 \\ 3.5 \\ 6 \\ 5.5 \end{Bmatrix} = \begin{Bmatrix} 24 \\ 119.5 \\ 665.5 \end{Bmatrix} $$
Solution
$$ [A]^T[A]\{a\} = [A]^T\{y\} $$
$$ \begin{Bmatrix} a_0 \\ a_1 \\ a_2 \end{Bmatrix} = \begin{Bmatrix} -0.2857 \\ 1.0774 \\ -0.0298 \end{Bmatrix} $$
$$ y = -0.0298x^2 + 1.0774x - 0.2857 $$
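The same procedure extends to any polynomial degree. The sketch below forms the normal equations from sums of powers of x and solves them by Gaussian elimination with partial pivoting. The elimination routine is one possible choice of solver, not something the slides prescribe, and the function name `polyfit_normal` is an assumption.

```python
def polyfit_normal(xs, ys, degree):
    """Least-squares coefficients [a0, a1, ..., a_degree] via the normal equations."""
    m = degree + 1
    # (A^T A)[j][k] = sum of x^(j+k); (A^T y)[j] = sum of x^j * y
    ata = [[sum(x ** (j + k) for x in xs) for k in range(m)] for j in range(m)]
    aty = [sum((x ** j) * y for x, y in zip(xs, ys)) for j in range(m)]

    # Forward elimination with partial pivoting on the augmented matrix.
    aug = [row + [rhs] for row, rhs in zip(ata, aty)]
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("normal matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, m):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, m + 1):
                aug[r][c] -= factor * aug[col][c]

    # Back substitution.
    coeffs = [0.0] * m
    for i in range(m - 1, -1, -1):
        known = sum(aug[i][k] * coeffs[k] for k in range(i + 1, m))
        coeffs[i] = (aug[i][m] - known) / aug[i][i]
    return coeffs


# Data from the lecture example.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
ys = [0.5, 2.5, 2.0, 4.0, 3.5, 6.0, 5.5]
print([round(c, 4) for c in polyfit_normal(xs, ys, 2)])  # [-0.2857, 1.0774, -0.0298]
```

Calling `polyfit_normal(xs, ys, 1)` on the same data gives the straight-line constants from the first example.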
Homework #4
• Chapter 17, pp. 471-472, numbers:
17.4,17.5.
• Use the data and regression to get the
equation of the line that best fits the data
• Number 17.7
• Use the data and regression to get the
equation of the line and the parabola that
best fit the data
