Dm week01 linreg.handout

Christof Monz
Informatics Institute, University of Amsterdam

Data Mining - Week 1: Linear Regression

Outline
• Plotting real-valued predictions
• Linear regression
• Error functions
Linear Regression
• Predicts real values (as opposed to discrete classes)
• A simple machine learning prediction task
• Assumes a linear correlation between the data and the target values

Scatter Plots
[Scatter plot of the training data: x on the horizontal axis, y on the vertical axis]
Linear Regression
• Find the line that approximates the data as closely as possible:
  ŷ = a + b·x
  where b is the slope and a is the y-intercept
• a and b should be chosen such that they minimize the difference between the predicted values and the values in the training data

Error Functions
• There are a number of ways to define an error function:
  Sum of absolute errors = ∑_{i∈D} |y_i − (a + b·x_i)|
  Sum of squared errors = ∑_{i∈D} (y_i − (a + b·x_i))²
  where y_i is the true value
• Squared error is the most commonly used
• Task: find the parameters a and b that minimize the squared error over the training data
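The two error functions above can be sketched in a few lines of Python. This is not part of the handout; the data set and the parameter values (a = 2, b = 1) are invented for illustration:

```python
# Hypothetical training data D as (x_i, y_i) pairs, and made-up parameters.
D = [(10, 12), (15, 18), (20, 21), (25, 27), (30, 33)]
a, b = 2.0, 1.0  # intercept and slope of a candidate line

# Sum of absolute errors: sum_i |y_i - (a + b*x_i)|
sum_abs_err = sum(abs(y - (a + b * x)) for x, y in D)

# Sum of squared errors: sum_i (y_i - (a + b*x_i))^2
sum_sq_err = sum((y - (a + b * x)) ** 2 for x, y in D)
```

The squared error penalizes large residuals more heavily than the absolute error, which is one reason it is the more common choice.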
Error Functions
• Normalized error functions:
  Mean squared error = ∑_{i∈D} (y_i − (a + b·x_i))² / |D|
  Relative squared error = ∑_{i∈D} (y_i − (a + b·x_i))² / ∑_{i∈D} (y_i − ȳ)²
  where ȳ = (1/|D|) ∑_{i∈D} y_i
  Root relative squared error = √[ ∑_{i∈D} (y_i − (a + b·x_i))² / ∑_{i∈D} (y_i − ȳ)² ]

Minimizing Error Functions
• There are roughly two ways:
  - Try different parameter instantiations and see which ones lead to the lowest error (search)
  - Solve mathematically (closed form)
• Most parameter estimation problems in machine learning can only be solved by searching
• For linear regression, we can solve it mathematically
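As a sketch (not from the handout), the three normalized error functions can be computed directly from their definitions; the data and parameters below are made up:

```python
import math

# Hypothetical training data and candidate line parameters.
D = [(10, 12), (15, 18), (20, 21), (25, 27), (30, 33)]
a, b = 2.0, 1.0

sse = sum((y - (a + b * x)) ** 2 for x, y in D)
n = len(D)
y_bar = sum(y for _, y in D) / n  # mean of the true target values

mse = sse / n                                            # mean squared error
rse = sse / sum((y - y_bar) ** 2 for _, y in D)          # relative squared error
rrse = math.sqrt(rse)                                    # root relative squared error
```

The relative variants normalize by how well the trivial predictor ȳ would do, so they are comparable across data sets with different scales.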
Minimizing SSE
• SSE = ∑_{i∈D} (y_i − (a + b·x_i))²
• Take the partial derivatives with respect to a and b
• Set each partial derivative equal to zero and solve for a and b respectively
• The resulting values of a and b minimize the error and can be used to predict unseen data instances

Applying Linear Regression
• For a given training set we first compute b:
  b = (|D| ∑_{i∈D} x_i·y_i − ∑_{i∈D} x_i · ∑_{i∈D} y_i) / (|D| ∑_{i∈D} x_i² − (∑_{i∈D} x_i)²)
• and then a, using the value computed for b:
  a = ȳ − b·x̄
• For any new instance x (i.e. an instance that was not in the training set), the predicted value is a + b·x
• Extendible to multi-valued functions
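The closed-form estimates above translate directly into code. The following Python sketch is not part of the handout; the training data is invented and chosen to lie exactly on the line y = 1 + 2x so the fit is easy to check by hand:

```python
def fit_linear(D):
    """Closed-form least-squares estimates of the intercept a and slope b
    for the line y = a + b*x, given data D as (x_i, y_i) pairs."""
    n = len(D)
    sum_x = sum(x for x, _ in D)
    sum_y = sum(y for _, y in D)
    sum_xy = sum(x * y for x, y in D)
    sum_x2 = sum(x * x for x, _ in D)
    # b = (|D| sum x_i y_i - sum x_i sum y_i) / (|D| sum x_i^2 - (sum x_i)^2)
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    # a = y_bar - b * x_bar
    a = sum_y / n - b * (sum_x / n)
    return a, b

D = [(1, 3), (2, 5), (3, 7), (4, 9)]  # lies exactly on y = 1 + 2x
a, b = fit_linear(D)
y_new = a + b * 5  # predicted value for an unseen instance x = 5
```

On this data the fit recovers a = 1 and b = 2, and the prediction for x = 5 is 11.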
Linear Regression
• Used to predict real-number values, given numerical input variables
• Parameters can be estimated analytically (i.e. by applying some mathematics), which won't be the case for most parameter estimation algorithms we'll see later on
• Extendible to non-linear functions, e.g. log-linear regression

Correlation
• So far we have used linear regression to predict target values (prediction)
• Linear regression can also be used to determine how closely two variables are correlated (description)
• The smaller the error rate, the stronger the correlation between the variables
• Correlation means that there is some (interesting) relation between the variables, but not necessarily a causal one
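The descriptive use of linear regression can be sketched with the relative squared error from the earlier slide: the closer it is to 0, the more linearly correlated the two variables are. This example is not from the handout; the data is invented and lies exactly on a line, so the error is 0:

```python
def relative_squared_error(D, a, b):
    """Relative squared error of the line y = a + b*x on data set D:
    SSE of the line divided by the SSE of the constant predictor y_bar."""
    y_bar = sum(y for _, y in D) / len(D)
    sse = sum((y - (a + b * x)) ** 2 for x, y in D)
    tss = sum((y - y_bar) ** 2 for _, y in D)
    return sse / tss

# Perfectly linear data: the line y = 1 + 2x has zero relative error,
# indicating a maximal linear correlation between x and y.
D = [(1, 3), (2, 5), (3, 7), (4, 9)]
err = relative_squared_error(D, 1.0, 2.0)
```

A relative squared error near 1, by contrast, means the line does no better than always predicting the mean ȳ, i.e. there is little or no linear relation.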
Recap
• Linear regression
• Error functions
• Analytical parameter estimation