Recommender Systems
João Paulo L. F. Dias da Silva
Oct 2014
Agenda
• Background (5 min)
• Implementation (5 min)
• Demonstration (5 min)
Background
1. Machine Learning Application
   • Unsupervised Learning (no right answers provided)
   • Linear Regression
   • Gradient Descent Algorithm
2. Content-based Filtering
   • Known product features
3. Collaborative Filtering
   • Unknown product features
   • Features will be "identified" by the application
Linear Regression
Linear regression is a method for obtaining a function that models the relationship between a scalar dependent variable h and its explanatory variables X.

Given a dataset {h, x1, x2, …, xn} of statistical units, a linear regression model assumes that there is a linear relationship between each value hi and its independent variables xi1, xi2, …, xin.

The goal of linear regression is to obtain a parameter Ɵ so that the model function h(X) = c + ƟX fits the input dataset as closely as possible.
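As a quick illustration of the one-variable case, a least-squares line can be fitted in a couple of lines. This is a minimal sketch with made-up data, not the deck's Octave code:

```python
import numpy as np

# Minimal sketch: fit h(x) = c + theta*x to made-up noisy data.
rng = np.random.default_rng(0)
x = np.linspace(0, 18, 40)
y = 300.0 * x + 200.0 + rng.normal(0.0, 150.0, x.size)

theta, c = np.polyfit(x, y, deg=1)   # least-squares line: returns [slope, intercept]
print(f"h(x) = {c:.1f} + {theta:.1f}*x")
```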
Linear Regression – Intuition
[Figure: scatter plot of example data with a fitted regression line; x axis 0–18, y axis 0–5000]
Linear Regression – Model function
Let h be a function that represents the model for the i-th example of our dataset:

$h_i = \theta_1 x_{i1} + \dots + \theta_n x_{in}$

Let Ɵ and $x_i$ be n×1 vectors, where n is the number of variables of our model, so $h_i$ becomes:

$h_i = x_i^T \theta$

where T denotes the transpose operation. Stacking all examples, we can rewrite the above as:

$h = X\theta$

where X is the m×n matrix whose i-th row is $x_i^T$, and m is the number of examples in our dataset.
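The stacked form is a single matrix-vector product. A minimal NumPy check with hypothetical numbers (the deck's actual matrix code is in Octave):

```python
import numpy as np

# Sketch of the stacked model h = X @ theta.
# X is m x n: one row x_i^T per example, one column per variable.
X = np.array([[1.0, 2.0],
              [1.0, 3.0],
              [1.0, 5.0]])      # m = 3 examples, n = 2 variables (first column = bias)
theta = np.array([0.5, 2.0])    # n x 1 parameter vector

h = X @ theta                   # h_i = x_i^T theta for every example at once
print(h)                        # [ 4.5  6.5 10.5]
```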
Linear Regression – Error function
Let h(X) be our hypothesis function (model), and let Y be the target values for each example in our dataset. The squared error of the model is:

$J = (h(X) - Y)^2$

Another way of writing the error function is to make the index of each example explicit and average over the m examples (the extra 1/2 simply cancels when differentiating):

$J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left( h(x^{(i)}) - y^{(i)} \right)^2$

The objective of linear regression is to minimize the error function J with respect to Ɵ. One way of achieving this is through an algorithm called Gradient Descent.
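A minimal sketch of J in NumPy, assuming the vectorized hypothesis h = XƟ from the previous slide:

```python
import numpy as np

# Squared-error cost J(theta) = (1/(2m)) * sum_i (h(x_i) - y_i)^2,
# using the vectorized hypothesis h = X @ theta.
def cost(theta, X, Y):
    m = X.shape[0]               # number of examples
    residual = X @ theta - Y     # h(x_i) - y_i for every example at once
    return residual @ residual / (2 * m)
```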
Gradient Descent Algorithm
Objective:
Find the values of Ɵ1,...,Ɵn that minimize the error function J(Ɵ1,...,Ɵn).

Overview:
• Initialize Ɵ1,...,Ɵn with some random values.
• Keep changing Ɵ1,...,Ɵn to reduce J(Ɵ1,...,Ɵn) until we reach a minimum.

Implementation (repeat, updating every Ɵj simultaneously):

$\theta_j := \theta_j - \alpha \frac{\partial}{\partial \theta_j} J(\theta_1, \dots, \theta_n)$
Gradient Descent
Substituting the partial derivative of J, the update rule becomes:

$\theta_j := \theta_j - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h(x^{(i)}) - y^{(i)} \right) x_j^{(i)}$

An example for Ɵ ∈ ℝ³:

$\theta_0 := \theta_0 - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h(x^{(i)}) - y^{(i)} \right) x_0^{(i)}$

$\theta_1 := \theta_1 - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h(x^{(i)}) - y^{(i)} \right) x_1^{(i)}$

$\theta_2 := \theta_2 - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h(x^{(i)}) - y^{(i)} \right) x_2^{(i)}$
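Putting the update rule in code, a batch gradient descent sketch (the learning rate alpha and the iteration count are illustrative choices, not values from the deck):

```python
import numpy as np

# Sketch of batch gradient descent for the squared-error cost.
def gradient_descent(X, Y, alpha=0.01, n_iters=1000):
    m, n = X.shape
    theta = np.zeros(n)                          # random values also work
    for _ in range(n_iters):
        gradient = X.T @ (X @ theta - Y) / m     # (1/m) * sum_i (h(x_i) - y_i) * x_i
        theta = theta - alpha * gradient         # simultaneous update of all theta_j
    return theta
```

The single matrix product X.T @ (X @ theta - Y) computes all the per-component sums above at once.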
Gradient Descent - Intuition
Recommender Systems – Prog. Skills

Skills     Ana  Beto  Carla  Daniel
Ruby        5    5     0      0
CSS3        5    ?     ?      0
JS          ?    4     0      ?
Android     0    0     5      4
iOS         0    0     5      ?

How to predict the values for the unknown skills?
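Before answering, a note on representation: one natural encoding of this matrix uses np.nan for the "?" entries (a sketch; the deck's hard-coded Octave input is not shown):

```python
import numpy as np

# Skills matrix: rows Ruby, CSS3, JS, Android, iOS; columns Ana, Beto, Carla, Daniel.
# np.nan marks the unknown ("?") ratings.
R = np.array([[5,      5,      0,      0],
              [5,      np.nan, np.nan, 0],
              [np.nan, 4,      0,      np.nan],
              [0,      0,      5,      4],
              [0,      0,      5,      np.nan]])

observed = ~np.isnan(R)   # mask of the known entries we can train on
```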
Content-based Filtering

Skills         Ana (Ɵ¹)   Beto (Ɵ²)   ...   X1 (Web)   X2 (Mobile)
Ruby (X1)         5           5       ...     0.9         0
CSS3 (X2)         5           ?       ...     1.0         0.01
JS (X3)           ?           4       ...     0.99        0
Android (X4)      0           0       ...     0.1         1.0
iOS (X5)          0           0       ...     0           0.9

The skill features are known; we just need to solve one linear regression per user, as sketched below.
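A sketch of that per-user regression, using the feature matrix above and NumPy's least-squares solver (fit_user is a hypothetical helper, not from the deck):

```python
import numpy as np

# Content-based filtering: features X are known, so each user's theta
# comes from one least-squares regression on the skills they have rated.
X = np.array([[0.9,  0.0 ],    # Ruby
              [1.0,  0.01],    # CSS3
              [0.99, 0.0 ],    # JS
              [0.1,  1.0 ],    # Android
              [0.0,  0.9 ]])   # iOS

def fit_user(ratings):
    """ratings: length-5 vector with np.nan for the '?' entries."""
    known = ~np.isnan(ratings)
    theta, *_ = np.linalg.lstsq(X[known], ratings[known], rcond=None)
    return theta

ana = np.array([5, 5, np.nan, 0, 0], dtype=float)
theta_ana = fit_user(ana)      # roughly [5, 0], matching the slide
```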
Content-based Filtering – Predicting

Skills         Ana (Ɵ¹ = [5, 0])   ...   X1 (Web)   X2 (Mobile)
Ruby (X1)             5            ...     0.9         0
CSS3 (X2)             5            ...     1.0         0.01
JS (X3)               5            ...     0.99        0
Android (X4)          0            ...     0.1         1.0
iOS (X5)              0            ...     0           0.9

Ana(JS) => Ɵ¹ · X3 => [5, 0] · [0.99, 0] = (5 × 0.99) + (0 × 0) = 4.95 ≈ 5
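The same prediction as a dot product in code:

```python
import numpy as np

# Ana's predicted JS rating: her parameters dotted with the JS features.
theta_ana = np.array([5.0, 0.0])     # theta_1 from the slide
x_js = np.array([0.99, 0.0])         # X3 (JS) features
print(theta_ana @ x_js)              # 4.95, which the slide rounds to 5
```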
Collaborative Filtering

Skills         Ana (Ɵ¹)   Beto (Ɵ²)   ...   X1 (?)   X2 (?)
Ruby (X1)         5           5       ...     ?        ?
CSS3 (X2)         5           ?       ...     ?        ?
JS (X3)           ?           4       ...     ?        ?
Android (X4)      0           0       ...     ?        ?
iOS (X5)          0           0       ...     ?        ?

How to predict the values for the unknown skills and features?
Collaborative Filtering – Feature Learning
We can't fit the Ɵ parameters directly because we don't have values for the feature vectors.

So we initialize the Ɵ parameters to random values. Then we use those Ɵ parameters in a linear regression to estimate the feature vector of each skill. Then we use the estimated feature vectors in a linear regression to improve the Ɵ parameters of each user.

We keep alternating between these two steps until Ɵ and the feature vectors converge, as sketched below.
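A compact sketch of that alternating loop (the feature count, round count, and the small regularization term lam are assumptions of this sketch, not values from the deck):

```python
import numpy as np

# Collaborative filtering by alternating linear regressions: start Theta at
# random, then repeatedly re-solve X given Theta and Theta given X, using
# only the observed (non-nan) entries of R.
def collaborative_filter(R, n_features=2, n_rounds=50, lam=0.1):
    n_skills, n_users = R.shape
    known = ~np.isnan(R)
    R0 = np.nan_to_num(R)                      # zeros where unknown (masked below)
    rng = np.random.default_rng(0)
    Theta = rng.normal(size=(n_users, n_features))
    X = rng.normal(size=(n_skills, n_features))
    reg = lam * np.eye(n_features)
    for _ in range(n_rounds):
        # Fix Theta: one small regularized regression per skill row.
        for i in range(n_skills):
            T = Theta[known[i]]                # users who rated skill i
            X[i] = np.linalg.solve(T.T @ T + reg, T.T @ R0[i, known[i]])
        # Fix X: one small regularized regression per user column.
        for j in range(n_users):
            F = X[known[:, j]]                 # skills user j has rated
            Theta[j] = np.linalg.solve(F.T @ F + reg, F.T @ R0[known[:, j], j])
    return X, Theta

# Predictions for every (skill, user) pair, known or not: P = X @ Theta.T
```

Each inner solve is exactly one linear regression, matching the description above; only the roles of Ɵ and the feature vectors swap between the two steps.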
Collaborative Filtering – Intuition
Collaborative Filtering – Predicting

Skills         Ana (Ɵ¹ = [5, 0])   ...   X1     X2
Ruby (X1)             5            ...   0.9    0
CSS3 (X2)             5            ...   1.0    0.01
JS (X3)               5            ...   0.99   0
Android (X4)          0            ...   0.1    1.0
iOS (X5)              0            ...   0      0.9

Ana(JS) => Ɵ¹ · X3 => [5, 0] · [0.99, 0] = (5 × 0.99) + (0 × 0) = 4.95 ≈ 5
Implementation
• Python for data scraping
• Octave for the LR/GD matrix calculations
• No UI yet
• Hard-coded input
Demo
Programming Skills
Q&A and Next Steps
