Christof Monz
Informatics Institute
University of Amsterdam
Data Mining
Week 1: Linear Regression

Outline
• Plotting real-valued predictions
• Linear regression
• Error functions

Linear Regression
• Predict real values (as opposed to discrete classes)
• A simple machine learning prediction task
• Assumes a linear correlation between the data and the target values

Scatter Plots
[Scatter plot of the training data, with x on the horizontal axis and y on the vertical axis]

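A scatter plot like the one this slide shows can be produced with a few lines of Python. The snippet below is only an illustration: the data are generated from a made-up linear trend with noise, not taken from the handout.

```python
import numpy as np
import matplotlib.pyplot as plt

# Made-up data with an approximately linear relation between x and y
rng = np.random.default_rng(0)
x = rng.uniform(10, 45, size=40)
y = 0.8 * x + 5 + rng.normal(0, 3, size=40)

plt.scatter(x, y)
plt.xlabel("x")
plt.ylabel("y")
plt.show()
```
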
Linear Regression
• Find the line that approximates the data as closely as possible: $\hat{y} = a + b \cdot x$, where b is the slope and a is the y-intercept
• a and b should be chosen such that they minimize the difference between the predicted values and the values in the training data

Error Functions
• There are a number of ways to define an error function
• Sum of absolute errors $= \sum_{i \in D} |y_i - (a + bx_i)|$
• Sum of squared errors $= \sum_{i \in D} (y_i - (a + bx_i))^2$, where $y_i$ is the true value
• Squared error is most commonly used
• Task: find the parameters a and b that minimize the squared error over the training data

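To make the two error functions concrete, the following Python sketch evaluates a candidate line (a, b) on a small toy training set. The function names and the numbers are invented for illustration; they are not part of the handout.

```python
def sum_absolute_errors(xs, ys, a, b):
    """Sum of absolute errors for the line y_hat = a + b*x."""
    return sum(abs(y - (a + b * x)) for x, y in zip(xs, ys))

def sum_squared_errors(xs, ys, a, b):
    """Sum of squared errors (SSE) for the line y_hat = a + b*x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

# Toy example (made-up numbers): evaluate one candidate line a=0.0, b=2.0
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 8.1]
print(sum_absolute_errors(xs, ys, 0.0, 2.0))
print(sum_squared_errors(xs, ys, 0.0, 2.0))
```

Comparing the SSE of several candidate (a, b) pairs in this way is the "search" route to minimizing the error mentioned on a later slide; the closed-form solution follows further on.
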
Error Functions
• Normalized error functions:
  • Mean squared error $= \frac{\sum_{i \in D} (y_i - (a + bx_i))^2}{|D|}$
  • Relative squared error $= \frac{\sum_{i \in D} (y_i - (a + bx_i))^2}{\sum_{i \in D} (y_i - \bar{y})^2}$, where $\bar{y} = \frac{1}{|D|} \sum_{i \in D} y_i$
  • Root relative squared error $= \sqrt{\frac{\sum_{i \in D} (y_i - (a + bx_i))^2}{\sum_{i \in D} (y_i - \bar{y})^2}}$

Minimizing Error Functions
• There are roughly two ways:
  • Try different parameter instantiations and see which ones lead to the lowest error (search)
  • Solve mathematically (closed form)
• Most parameter estimation problems in machine learning can only be solved by searching
• For linear regression, we can solve it mathematically

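The normalized error measures defined above are equally easy to compute. The sketch below is my own illustration (the function name and toy data are assumptions, not from the handout); a function like this could serve as the inner loop of the search approach just described.

```python
import math

def normalized_errors(xs, ys, a, b):
    """Mean squared error, relative squared error, and root relative
    squared error for the line y_hat = a + b*x."""
    n = len(xs)
    y_mean = sum(ys) / n
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    tss = sum((y - y_mean) ** 2 for y in ys)   # squared deviation from the mean
    mse = sse / n
    rse = sse / tss
    rrse = math.sqrt(rse)
    return mse, rse, rrse

# Toy example (made-up numbers)
print(normalized_errors([1.0, 2.0, 3.0, 4.0], [2.1, 3.9, 6.2, 8.1], 0.0, 2.0))
```
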
Minimizing SSE
• $\mathrm{SSE} = \sum_{i \in D} (y_i - (a + bx_i))^2$
• Take the partial derivatives with respect to a and b
• Set each partial derivative equal to zero and solve for a and b respectively
• The resulting values for a and b minimize the error and can be used to predict unseen data instances

Applying Linear Regression
• For a given training set we first compute b:
  $b = \frac{|D| \sum_{i \in D} x_i y_i - \sum_{i \in D} x_i \sum_{i \in D} y_i}{|D| \sum_{i \in D} x_i^2 - (\sum_{i \in D} x_i)^2}$
  and then a, using the value computed for b:
  $a = \bar{y} - b\bar{x}$
• For any new instance x (i.e. an instance that was not in the training set), the predicted value is $a + bx$
• Extendible to multi-valued functions

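The closed-form formulas for b and a translate directly into code. The sketch below is a minimal illustration with an invented function name and toy data; it fits the line and then predicts the value for an unseen instance.

```python
def fit_linear_regression(xs, ys):
    """Closed-form estimates of slope b and intercept a that minimize the SSE."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    a = (sum_y / n) - b * (sum_x / n)     # a = y_bar - b * x_bar
    return a, b

# Usage on a made-up training set, then predict for an unseen x
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 8.1]
a, b = fit_linear_regression(xs, ys)
print(a + b * 5.0)   # predicted value for a new instance x = 5.0
```
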
Linear Regression
• Used to predict real-number values, given numerical input variables
• Parameters can be estimated analytically (i.e. by applying some mathematics), which won't be the case for most parameter estimation algorithms we'll see later on
• Extendible to non-linear functions, e.g. log-linear regression

Correlation
• So far we have used linear regression to predict target values (prediction)
• Linear regression can also be used to determine how closely two variables are correlated (description)
• The smaller the error rate, the stronger the correlation between the variables
• Correlation does mean that there is some (interesting) relation between the variables, but not necessarily a causal one

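A standard way to quantify how closely two variables are linearly correlated is Pearson's correlation coefficient; for a least-squares line, its square equals one minus the relative squared error of the fit, so a small error indeed corresponds to a strong correlation. The sketch below is my own illustration with made-up data, not part of the handout.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two variables."""
    n = len(xs)
    x_mean, y_mean = sum(xs) / n, sum(ys) / n
    cov = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - x_mean) ** 2 for x in xs))
    sy = math.sqrt(sum((y - y_mean) ** 2 for y in ys))
    return cov / (sx * sy)

xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 8.1]
print(pearson_r(xs, ys))   # close to 1.0 => strong positive linear correlation
```
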
Recap
• Linear regression
• Error rates
• Analytical parameter estimation
