2. Motivation
Given a set of experimental data:

x   1     2     3
y   5.1   5.9   6.3

• The relationship between x and y may not be clear.
• Find a function f(x) that best fits the data.
3. Least Squares
Given a bivariate dataset (x1, y1), …, (xn, yn), where x1, …, xn are
nonrandom and Yi = α + βxi + Ui are random variables for i = 1, 2, …, n.
The random variables U1, U2, …, Un have zero expectation and variance σ².
Method of Least Squares: Choose values for α and β such that

    S(α, β) = ∑_{i=1}^{n} (yi − α − βxi)²

is minimal.
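The criterion S(α, β) is easy to evaluate directly; here is a minimal Python sketch using the data from the Motivation slide (the function name S and the trial parameter values are illustrative, not from the slides):

```python
# Data from the Motivation slide.
xs = [1, 2, 3]
ys = [5.1, 5.9, 6.3]

def S(alpha, beta):
    # Sum of squared deviations between observed yi and the line alpha + beta*xi.
    return sum((y - alpha - beta * x) ** 2 for x, y in zip(xs, ys))

print(S(4.5667, 0.60))  # near-optimal parameters: small sum of squares
print(S(0.0, 0.0))      # a poor fit: large sum of squares
```

Least squares picks the (α, β) pair for which this sum is smallest.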
4. Regression
The observed value yi corresponds to xi; the fitted value α + βxi lies on the
regression line y = α + βx. The sum of the squared vertical distances between
them is

    S(α, β) = ∑_{i=1}^{n} (yi − α − βxi)²
5. Estimation
After some calculus magic, we get two equations to estimate α and β.

Method of Least Squares: Choose values for α and β such that

    S(α, β) = ∑_{i=1}^{n} (yi − α − βxi)²

is minimal.

To find the least squares estimates, we differentiate S(α, β) with respect to
α and β, and we set the derivatives equal to 0:

    ∂S/∂α = −2 ∑_{i=1}^{n} (yi − α − βxi) = 0
    ∂S/∂β = −2 ∑_{i=1}^{n} xi (yi − α − βxi) = 0
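Setting the two derivatives to zero gives a linear system in the estimates, which can be solved in closed form. A small Python sketch (the helper name fit_line and the sum variable names are ours, not from the slides):

```python
def fit_line(xs, ys):
    # Least squares estimates for y = a + b*x via the normal equations:
    #   a*n  + b*sum(xi)    = sum(yi)
    #   a*sum(xi) + b*sum(xi^2) = sum(xi*yi)
    n = len(xs)
    sx = sum(xs)
    sy = sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    b = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    a = (sy - b * sx) / n
    return a, b

a, b = fit_line([1, 2, 3], [5.1, 5.9, 6.3])
print(a, b)  # the estimates for the example data
```

For the example data this reproduces the estimates worked out on the following slides.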
10. Example 1: Linear Regression
Assume: f(x) = a + bx

Equations (sums over i = 1, …, n):

    a·n + b·∑xi = ∑yi
    a·∑xi + b·∑xi² = ∑xi·yi

x   1     2     3
y   5.1   5.9   6.3
11. Example 1: Linear Regression

i       1     2     3     sum
xi      1     2     3     6
yi      5.1   5.9   6.3   17.3
xi²     1     4     9     14
xi·yi   5.1   11.8  18.9  35.8

Equations:

    3a + 6b = 17.3
    6a + 14b = 35.8

Solving:

    a = 4.5667
    b = 0.60
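The solving step is ordinary elimination on a 2×2 system; a few lines of Python make it explicit (a sketch, with our variable names):

```python
# The two equations from the table sums:
#   3a + 6b  = 17.3
#   6a + 14b = 35.8
# Multiply the first by 2 and subtract from the second:
#   (6a + 14b) - (6a + 12b) = 35.8 - 34.6  =>  2b = 1.2  =>  b = 0.6
b = (35.8 - 2 * 17.3) / (14 - 2 * 6)
a = (17.3 - 6 * b) / 3
print(a, b)
```

Back-substitution gives a = 13.7/3 ≈ 4.5667, matching the slide.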
12. Multiple Linear Regression
Example: Given the following data:

t   0     1     2     3
x   0.1   0.4   0.2   0.2
y   3     2     1     2

determine a function of two variables

    f(x, t) = a + b·x + c·t

that best fits the data with the least sum of squared errors.
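The same least squares idea extends to two predictors: build the normal equations (AᵀA)p = Aᵀy from the design matrix and solve the 3×3 system. A sketch of an assumed approach, not taken from the slides; the plain Gaussian elimination keeps the example dependency-free, and the helper name solve3 is ours:

```python
# Data from the slide.
t = [0, 1, 2, 3]
x = [0.1, 0.4, 0.2, 0.2]
y = [3, 2, 1, 2]

# Design matrix rows [1, xi, ti] for f(x, t) = a + b*x + c*t.
A = [[1.0, xi, ti] for xi, ti in zip(x, t)]
AtA = [[sum(A[k][i] * A[k][j] for k in range(len(A))) for j in range(3)]
       for i in range(3)]
Aty = [sum(A[k][i] * y[k] for k in range(len(A))) for i in range(3)]

def solve3(M, v):
    # Gaussian elimination with partial pivoting on a 3x3 system.
    M = [row[:] + [vi] for row, vi in zip(M, v)]
    for col in range(3):
        piv = max(range(col, 3), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, 3):
            f = M[r][col] / M[col][col]
            for c in range(col, 4):
                M[r][c] -= f * M[col][c]
    sol = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        sol[r] = (M[r][3] - sum(M[r][c] * sol[c]
                                for c in range(r + 1, 3))) / M[r][r]
    return sol

a, b, c = solve3(AtA, Aty)
print(f"f(x, t) = {a:.4f} + {b:.4f}·x + {c:.4f}·t")
```

With NumPy available, `numpy.linalg.lstsq(A, y)` would solve the same problem directly.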