Lecture 2: Linear regression
Machine Learning Course
MIPT, 2019
Outline
1. Overview of linear models.
2. Linear regression.
3. Analytical solution.
4. Regularization.
5. Gauss-Markov theorem.
6. Probabilistic interpretation and intuition.
Example questions linear regression can solve (see the picture on the upper right):
● What will be my monthly spending for the next year?
● Which factor is more important in determining my monthly spending?
● How are monthly income and trips per month correlated with monthly spending?
Linear models

Where linear models show up:
● Predictive (regression) models
● Classification models
● Unsupervised models (e.g. PCA)
● Building blocks of other models (ensembles, NNs, etc.)

Actually, a linear model is already a small neural network (a single neuron). We will meet it again later.
Linear regression

Linear regression problem statement:
● Training set $(x_i, y_i)_{i=1}^{n}$, where $x_i \in \mathbb{R}^d$, $y_i \in \mathbb{R}$.
● The prediction model is linear: $a(x) = \langle w, x \rangle + b$, where $w \in \mathbb{R}^d$ is the weights vector and $b$ is the bias term.
● The least squares method provides a solution: minimize $\sum_{i=1}^{n} \big(\langle w, x_i \rangle + b - y_i\big)^2$ over $w$ and $b$.
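As a quick illustration (my addition, not from the slides), here is a minimal scikit-learn sketch of fitting this model; the synthetic data and coefficients are made up for the example.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data: y = 3*x1 - 2*x2 + 5 + noise (made-up coefficients for illustration)
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))                 # n = 200 objects, d = 2 features
y = 3 * X[:, 0] - 2 * X[:, 1] + 5 + rng.normal(scale=0.1, size=200)

model = LinearRegression().fit(X, y)          # least squares fit
print(model.coef_, model.intercept_)          # should be close to [3, -2] and 5
```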
Analytical solution

Denote the quadratic loss function: $L(w) = \|Xw - y\|^2$,
where $X \in \mathbb{R}^{n \times d}$ is the object-feature matrix and $y \in \mathbb{R}^n$ is the target vector (the bias term is absorbed into $w$ via a constant feature).

To find the optimal solution, set the derivative of the loss to zero:
$\nabla_w L(w) = 2 X^{\top}(Xw - y) = 0 \;\Rightarrow\; X^{\top} X w = X^{\top} y.$

So the target vector $w$ should be
$w^{*} = (X^{\top} X)^{-1} X^{\top} y.$

But what if the matrix $X^{\top} X$ is singular?
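Before worrying about singularity, here is a minimal NumPy sketch (my addition, not from the slides) that computes the closed-form solution on synthetic data and cross-checks it against np.linalg.lstsq; the bias is handled by appending a constant column.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 200, 3
X = rng.normal(size=(n, d))
X = np.hstack([X, np.ones((n, 1))])                 # constant column for the bias term
true_w = np.array([1.5, -2.0, 0.5, 4.0])
y = X @ true_w + rng.normal(scale=0.1, size=n)

# Closed-form least squares: w* = (X^T X)^{-1} X^T y
w_closed = np.linalg.solve(X.T @ X, X.T @ y)

# Cross-check with a library solver
w_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)
print(np.allclose(w_closed, w_lstsq))               # True
```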
Unstable solution

In the case of multicollinear features, the matrix $X^{\top} X$ is almost singular.
This leads to an unstable solution: the coefficients are huge and sum up to almost 0.
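To see the instability, here is a small NumPy sketch (my addition) with two almost identical features: the individual coefficients typically blow up and nearly cancel each other, while their sum stays close to the true combined effect.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + 1e-6 * rng.normal(size=n)           # almost a copy of x1 -> multicollinearity
X = np.column_stack([x1, x2])
y = x1 + x2 + rng.normal(scale=0.1, size=n)   # the true combined effect is 2 * x1

w = np.linalg.solve(X.T @ X, X.T @ y)         # X^T X is almost singular
print(w)        # typically huge coefficients of opposite signs
print(w.sum())  # yet their sum is close to 2
```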
Regularization

To make the matrix $X^{\top} X$ nonsingular, we can add a diagonal matrix:
$w^{*} = (X^{\top} X + \lambda I)^{-1} X^{\top} y,$
where $\lambda > 0$.

Actually, this is the solution for the case $L(w) = \|Xw - y\|^2 + \lambda \|w\|^2$.
Exercise: check it yourself.
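One way to sanity-check the exercise numerically (my addition): the closed form above should match scikit-learn's Ridge, which minimizes the same objective; fit_intercept=False is needed for an exact match, since the closed form does not treat the bias separately.

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(2)
X = rng.normal(size=(100, 5))
y = rng.normal(size=100)
lam = 0.7

# Closed form: w* = (X^T X + lambda * I)^{-1} X^T y
w_closed = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

# scikit-learn minimizes ||y - Xw||^2 + alpha * ||w||^2, the same objective
w_sklearn = Ridge(alpha=lam, fit_intercept=False).fit(X, y).coef_

print(np.allclose(w_closed, w_sklearn))   # True (up to solver tolerance)
```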
Gauss-Markov theorem

Let $y = Xw + \varepsilon$, where $\varepsilon = (\varepsilon_1, \dots, \varepsilon_n)$ is a vector of error random variables.
The Gauss-Markov assumptions concern this set of error random variables:
● They have zero mean: $\mathbb{E}[\varepsilon_i] = 0$.
● They are homoscedastic, that is, they all have the same finite variance: $\mathrm{Var}(\varepsilon_i) = \sigma^2 < \infty$.
● Distinct error terms are uncorrelated: $\mathrm{Cov}(\varepsilon_i, \varepsilon_j) = 0$ for $i \neq j$.

Then the least squares solution delivers the BLUE: Best Linear Unbiased Estimator.
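As a reminder of why the zero-mean assumption already gives unbiasedness (my sketch, assuming $X$ is fixed and $X^{\top}X$ is invertible):

$$
\hat{w} = (X^{\top}X)^{-1}X^{\top}y
        = (X^{\top}X)^{-1}X^{\top}(Xw + \varepsilon)
        = w + (X^{\top}X)^{-1}X^{\top}\varepsilon,
\qquad
\mathbb{E}[\hat{w}] = w + (X^{\top}X)^{-1}X^{\top}\,\mathbb{E}[\varepsilon] = w .
$$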
Different norms

Once more, the loss functions:
● MSE: $L(w) = \frac{1}{n}\sum_{i=1}^{n} (\langle w, x_i \rangle - y_i)^2$
● MAE: $L(w) = \frac{1}{n}\sum_{i=1}^{n} |\langle w, x_i \rangle - y_i|$

Regularization terms:
● $L_2$ (ridge): $\lambda \|w\|_2^2$
● $L_1$ (lasso): $\lambda \|w\|_1$

Note: only the MSE loss is covered by the Gauss-Markov theorem.
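For concreteness, the four quantities as short NumPy functions (my sketch; predictions are X @ w):

```python
import numpy as np

def mse(w, X, y):            # mean squared error
    return np.mean((X @ w - y) ** 2)

def mae(w, X, y):            # mean absolute error
    return np.mean(np.abs(X @ w - y))

def l2_penalty(w, lam):      # ridge regularization term
    return lam * np.sum(w ** 2)

def l1_penalty(w, lam):      # lasso regularization term
    return lam * np.sum(np.abs(w))
```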
What's the difference?

● MSE
○ delivers BLUE according to the Gauss-Markov theorem
○ differentiable
○ sensitive to noise
● MAE
○ non-differentiable
■ (actually, it is differentiable everywhere except at zero)
○ much more robust to noise
● $L_2$ regularization
○ constrains the weights
○ delivers a more stable solution
○ differentiable
● $L_1$ regularization
○ non-differentiable
■ (actually, the same situation as MAE ;)
○ selects features (how exactly? see the illustration on the next slide and the sketch below)
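A small scikit-learn sketch (my addition) illustrating the last point: on data where only a few features matter, Lasso ($L_1$) drives the irrelevant coefficients exactly to zero, while Ridge ($L_2$) only shrinks them.

```python
import numpy as np
from sklearn.linear_model import Ridge, Lasso

rng = np.random.default_rng(3)
n, d = 200, 10
X = rng.normal(size=(n, d))
w_true = np.zeros(d)
w_true[:3] = [4.0, -3.0, 2.0]                  # only the first 3 features matter
y = X @ w_true + rng.normal(scale=0.5, size=n)

ridge = Ridge(alpha=1.0).fit(X, y)
lasso = Lasso(alpha=0.5).fit(X, y)

print(np.round(ridge.coef_, 2))   # all 10 coefficients are nonzero, just shrunk
print(np.round(lasso.coef_, 2))   # the irrelevant coefficients come out exactly 0
```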
Regularization: illustration

[Figure: $L_1$ regularization vs. $L_2$ regularization (illustration omitted).]
source: Modern Multivariate Statistical Techniques
Regularization: probability interpretation

Assume the elements of $w$ are sampled from some specific distribution (a prior distribution for the weights vector).

[Figure: priors corresponding to $L_2$ regularization and $L_1$ regularization (illustration omitted).]

See the seminar extra materials for more.
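A short sketch of the standard connection (my addition; see the seminar materials for a careful treatment): with Gaussian noise and a Gaussian prior on $w$, the maximum a posteriori estimate is exactly ridge regression, while a Laplace prior gives the $L_1$ penalty instead.

$$
\hat{w}_{\mathrm{MAP}} = \arg\max_{w}\; p(y \mid X, w)\, p(w)
= \arg\min_{w} \Big( \tfrac{1}{2\sigma^2} \|Xw - y\|^2 + \tfrac{1}{2\tau^2} \|w\|^2 \Big)
\quad \text{for } \varepsilon_i \sim \mathcal{N}(0, \sigma^2),\; w_j \sim \mathcal{N}(0, \tau^2).
$$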
Welcome to the church of Bayes

Дмитрий Ветров (Dmitry Vetrov), head of the International Laboratory of Deep Learning and Bayesian Methods.

That's all. Practice coming next.
