Factorization Models & Polynomial Regression Factorization Machines Applications Summary 
Factorization Machines 
Steffen Rendle
Current affiliation: Google Inc.
Work was done at University of Konstanz 
MLConf, November 14, 2014 

Outline
Factorization Models & Polynomial Regression
  Factorization Models
  Linear/Polynomial Regression
  Comparison
Factorization Machines
Applications
Summary

Matrix Factorization
Example for data:

            Movie
User   TI   NH   SW   ST   ...
A      5    3    1    ?    ...
B      ?    ?    4    5    ...
C      1    ?    5    ?    ...
...    ...  ...  ...  ...  ...

Matrix Factorization:
$$\hat{Y} := W H^t, \quad W \in \mathbb{R}^{|U| \times k}, \; H \in \mathbb{R}^{|I| \times k}$$
$$\hat{y}(u, i) = \hat{y}_{u,i} = \sum_{f=1}^{k} w_{u,f}\, h_{i,f} = \langle w_u, h_i \rangle$$
k is the rank of the reconstruction.
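
The prediction rule of matrix factorization can be sketched in a few lines of plain Python. This is only an illustration: the rank k = 2 and all factor values in `W` and `H` are made up, not fitted to the ratings in the example.

```python
# Hypothetical rank-2 factors; the numbers are illustrative, not learned.
W = {"A": [1.0, 2.0], "B": [0.5, -1.0], "C": [2.0, 0.0]}      # user factors w_u
H = {"TI": [1.0, 1.0], "NH": [0.0, 1.5], "SW": [2.0, -1.0]}   # item factors h_i


def predict_mf(u, i):
    """Return <w_u, h_i> = sum_f w_{u,f} * h_{i,f}."""
    return sum(wf * hf for wf, hf in zip(W[u], H[i]))


# A score exists even for pairs without an observed rating, e.g. (B, TI).
print(predict_mf("A", "TI"))
print(predict_mf("B", "TI"))
```

Because every user and every item has its own factor vector, a score can be computed for any (user, item) pair, including the cells marked "?" in the rating matrix.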

Matrix Factorization & Extensions
Example for data: the user × movie rating matrix (users A, B, C, ...; movies TI, NH, SW, ST, ...), optionally extended along a time axis.
Examples for models:
$$\hat{y}^{\text{MF}}(u, i) := \sum_{f=1}^{k} v_{u,f}\, v_{i,f} = \langle v_u, v_i \rangle$$
$$\hat{y}^{\text{SVD++}}(u, i) := \left\langle v_u + \sum_{j \in N(u)} v_j, \; v_i \right\rangle$$
$$\hat{y}^{\text{Fact-KNN}}(u, i) := \frac{1}{|R(u)|} \sum_{j \in R(u)} r_{u,j}\, \langle v_i, v_j \rangle$$
$$\hat{y}^{\text{timeSVD}}(u, i, t) := \langle v_u + v_{u,t}, \; v_i \rangle$$
$$\hat{y}^{\text{timeTF}}(u, i, t) := \sum_{f=1}^{k} v_{u,f}\, v_{i,f}\, v_{t,f}$$
...

Tensor Factorization
Example for data: triples of subject, predicate, object [illustration from Drumond et al. 2012]
Examples for models:
$$\hat{y}^{\text{PARAFAC}}(s, p, o) := \sum_{f=1}^{k} v_{s,f}\, v_{p,f}\, v_{o,f}$$
$$\hat{y}^{\text{PITF}}(s, p, o) := \langle v_s, v_p \rangle + \langle v_s, v_o \rangle + \langle v_p, v_o \rangle$$
...

Sequential Factorization Models
Example for data:
[Figure: basket sequences B_{t-3}, B_{t-2}, B_{t-1} of items a to e for Users 1 to 4; the next basket B_t of each user is unknown (?)]
Examples for models:
$$\hat{y}^{\text{FMC}}(u, i, t) := \sum_{l \in B_{t-1}} \langle v_i, v_l \rangle$$
$$\hat{y}^{\text{FPMC}}(u, i, t) := \langle v_u, v_i \rangle + \sum_{l \in B_{t-1}} \langle v_i, v_l \rangle$$
...

Factorization Models: Discussion
- Advantages
  - Can estimate interactions between two (or more) variables even if the cross is not observed.
  - E.g. user × movie, current product × next product, user × query × url, ...
- Downsides
  - Factorization models are usually built specifically for each problem.
  - Learning algorithms and implementations are tailored to individual models.

Data and Variable Representation
Many standard ML approaches work with real-valued feature vectors as input. This representation can encode, e.g.:
- any number of variables
- categorical domains by using dummy indicator variables
- numerical domains
- set-categorical domains by using dummy indicator variables
Using this representation makes it possible to apply a wide variety of standard models (e.g. linear regression, SVM, etc.).
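
As a rough sketch of this encoding, the function below maps one record with a categorical, a numerical, and a set-categorical variable to a sparse feature vector. The column layout in `vocab`, the variable names (`user`, `movie`, `age`, `seen`), and the `encode` helper are all hypothetical and chosen only for illustration.

```python
def encode(record, vocab):
    """Map a record to a sparse feature vector {column index: value}.

    vocab lists the columns as (variable, value) pairs; this layout is an
    assumption of the sketch, not something prescribed by the slides.
    """
    index = {col: n for n, col in enumerate(vocab)}
    x = {}
    for var, value in record.items():
        if isinstance(value, (int, float)):            # numerical domain
            x[index[(var, None)]] = float(value)
        elif isinstance(value, (set, frozenset)):      # set-categorical domain
            for member in value:
                x[index[(var, member)]] = 1.0
        else:                                          # categorical domain
            x[index[(var, value)]] = 1.0
    return x


vocab = [("user", "A"), ("user", "B"), ("movie", "TI"), ("movie", "SW"),
         ("age", None), ("seen", "TI"), ("seen", "NH")]
x = encode({"user": "B", "movie": "SW", "age": 31, "seen": {"TI", "NH"}}, vocab)
print(x)
```

Only the non-zero entries are stored, which matters later because most indicator columns are zero for any given case.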

Linear Regression
- Let $x \in \mathbb{R}^p$ be an input vector with p predictor variables.
- Model equation:
$$\hat{y}(x) := w_0 + \sum_{i=1}^{p} w_i x_i$$
- Model parameters:
$$w_0 \in \mathbb{R}, \quad w \in \mathbb{R}^p$$
O(p) model parameters.

Polynomial Regression
- Let $x \in \mathbb{R}^p$ be an input vector with p predictor variables.
- Model equation (degree 2):
$$\hat{y}(x) := w_0 + \sum_{i=1}^{p} w_i x_i + \sum_{i=1}^{p} \sum_{j>i}^{p} w_{i,j}\, x_i x_j$$
- Model parameters:
$$w_0 \in \mathbb{R}, \quad w \in \mathbb{R}^p, \quad W \in \mathbb{R}^{p \times p}$$
O(p^2) model parameters.
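
To make the O(p^2) growth concrete, here is a small sketch of a degree-2 polynomial regression prediction. It assumes the inner sum runs over j > i (so only the upper triangle of W is used); the weights and the input are invented for the example.

```python
def predict_poly2(x, w0, w, W):
    """w0 + sum_i w_i x_i + sum_i sum_{j>i} W[i][j] x_i x_j."""
    p = len(x)
    y = w0 + sum(w[i] * x[i] for i in range(p))
    for i in range(p):
        for j in range(i + 1, p):
            y += W[i][j] * x[i] * x[j]
    return y


def num_pairwise_weights(p):
    """Number of distinct pairs (i, j) with j > i, which grows as O(p^2)."""
    return p * (p - 1) // 2


x = [1.0, 0.0, 2.0]
W = [[0.0, 0.5, -1.0], [0.0, 0.0, 3.0], [0.0, 0.0, 0.0]]
print(predict_poly2(x, 0.1, [1.0, 1.0, 0.5], W))   # 0.1 + 2.0 - 2.0
print(num_pairwise_weights(1000))
```

With p = 1000 predictor variables there are already 499500 independent pairwise weights, each of which needs its own supporting observations.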

Representation: Matrix/Tensor vs. Feature Vectors
Matrix/tensor data can be represented by feature vectors:

            Movie
User   TI   NH   SW   ST   ...
A      5    3    1    ?    ...
B      ?    ?    4    5    ...
C      1    ?    5    ?    ...
...    ...  ...  ...  ...  ...

⇔

#    User      Movie          Rating
1    Alice     Titanic        5
2    Alice     Notting Hill   3
3    Alice     Star Wars      1
4    Bob       Star Wars      4
5    Bob       Star Trek      5
6    Charlie   Titanic        1
7    Charlie   Star Wars      5
...  ...       ...            ...

⇒

        Feature vector x                         Target y
        User           Movie
        A  B  C  ...   TI NH SW ST ...
x(1)    1  0  0  ...   1  0  0  0  ...    y(1)   5
x(2)    1  0  0  ...   0  1  0  0  ...    y(2)   3
x(3)    1  0  0  ...   0  0  1  0  ...    y(3)   1
x(4)    0  1  0  ...   0  0  1  0  ...    y(4)   4
x(5)    0  1  0  ...   0  0  0  1  ...    y(5)   5
x(6)    0  0  1  ...   1  0  0  0  ...    y(6)   1
x(7)    0  0  1  ...   0  0  1  0  ...    y(7)   5
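
The conversion from the rating list to feature vectors and targets can be sketched as follows. The column order (users first, then movies) follows the example; the helper name `to_row` is just a label chosen here.

```python
ratings = [("Alice", "Titanic", 5), ("Alice", "Notting Hill", 3),
           ("Alice", "Star Wars", 1), ("Bob", "Star Wars", 4),
           ("Bob", "Star Trek", 5), ("Charlie", "Titanic", 1),
           ("Charlie", "Star Wars", 5)]

users = ["Alice", "Bob", "Charlie"]
movies = ["Titanic", "Notting Hill", "Star Wars", "Star Trek"]


def to_row(user, movie):
    """One-hot user block followed by one-hot movie block."""
    return ([1 if u == user else 0 for u in users]
            + [1 if m == movie else 0 for m in movies])


X = [to_row(u, m) for u, m, _ in ratings]
y = [r for _, _, r in ratings]

for n, (row, target) in enumerate(zip(X, y), start=1):
    print(f"x({n}) = {row}  y({n}) = {target}")
```

Every row has exactly two non-zero entries, one in the user block and one in the movie block.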

Application to Sparse Feature Vectors
Data: the one-hot user/movie feature vectors x(1), ..., x(7) with rating targets y(1), ..., y(7).
Applying regression models to this data leads to:
Linear regression: $\hat{y}(x) = w_0 + w_u + w_i$
Polynomial regression: $\hat{y}(x) = w_0 + w_u + w_i + w_{u,i}$
Matrix factorization: $\hat{y}(u, i) = \langle w_u, h_i \rangle$

Application to Sparse Feature Vectors
For the data of the example:
- Linear regression has no user-item interaction.
  - ⇒ Linear regression is not expressive enough.
- Polynomial regression includes pairwise interactions but cannot estimate them from the data.
  - n ≪ p^2: the number of cases is much smaller than the number of model parameters.
  - The maximum-likelihood estimator for a pairwise effect is:
$$w_{u,i} = \begin{cases} y - w_0 - w_u - w_i, & \text{if } (u, i, y) \in S \\ \text{not defined}, & \text{else} \end{cases}$$
  - Polynomial regression cannot generalize to any unobserved pairwise effect.

Outline
Factorization Models & Polynomial Regression
Factorization Machines
  Model
  Examples
  Properties
  Learning
  libFM Software
Applications
Summary

Factorization Machine (FM) [Rendle 2010, Rendle 2012]
- Let $x \in \mathbb{R}^p$ be an input vector with p predictor variables.
- Model equation (degree 2):
$$\hat{y}(x) := w_0 + \sum_{i=1}^{p} w_i x_i + \sum_{i=1}^{p} \sum_{j>i}^{p} \langle v_i, v_j \rangle\, x_i x_j$$
- Model parameters:
$$w_0 \in \mathbb{R}, \quad w \in \mathbb{R}^p, \quad V \in \mathbb{R}^{p \times k}$$
Compared to polynomial regression:
- Model equation (degree 2):
$$\hat{y}(x) := w_0 + \sum_{i=1}^{p} w_i x_i + \sum_{i=1}^{p} \sum_{j>i}^{p} w_{i,j}\, x_i x_j$$
- Model parameters:
$$w_0 \in \mathbb{R}, \quad w \in \mathbb{R}^p, \quad W \in \mathbb{R}^{p \times p}$$
FM of degree 3:
- Model equation (degree 3):
$$\hat{y}(x) := w_0 + \sum_{i=1}^{p} w_i x_i + \sum_{i=1}^{p} \sum_{j>i}^{p} \langle v_i, v_j \rangle\, x_i x_j + \sum_{i=1}^{p} \sum_{j>i}^{p} \sum_{l>j}^{p} \sum_{f=1}^{k} v^{(3)}_{i,f}\, v^{(3)}_{j,f}\, v^{(3)}_{l,f}\, x_i x_j x_l$$
- Model parameters:
$$w_0 \in \mathbb{R}, \quad w \in \mathbb{R}^p, \quad V \in \mathbb{R}^{p \times k}, \quad V^{(3)} \in \mathbb{R}^{p \times k}$$

Factorization Machines: Discussion
- FMs work with real-valued input.
- FMs include variable interactions like polynomial regression.
- Model parameters for interactions are factorized.
- The number of model parameters is O(k p) (instead of O(p^2) for polynomial regression).

Matrix Factorization and Factorization Machines
Two categorical variables encoded with real-valued predictor variables:
[Design matrix: each feature vector x(1), ..., x(7) has one indicator set in the User block (A, B, C, ...) and one in the Movie block (TI, NH, SW, ST, ...)]
With this data, the FM is identical to MF with biases (footnote 1):
$$\hat{y}(x) = w_0 + w_u + w_i + \underbrace{\langle v_u, v_i \rangle}_{\text{MF}}$$
Footnote 1: libFM, k = 128, MCMC inference, Netflix RMSE = 0.8937
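
This equivalence can be checked numerically. The sketch below evaluates a degree-2 FM straight from its definition on a one-hot (user, movie) vector and compares it with the MF-with-biases formula; all parameter values are arbitrary test numbers, and the 3 users × 4 movies layout is taken from the running example.

```python
def dot(a, b):
    return sum(af * bf for af, bf in zip(a, b))


def fm_predict(x, w0, w, V):
    """Degree-2 FM evaluated directly from its definition."""
    p = len(x)
    y = w0 + sum(w[i] * x[i] for i in range(p))
    y += sum(dot(V[i], V[j]) * x[i] * x[j]
             for i in range(p) for j in range(i + 1, p))
    return y


# Columns: users A, B, C followed by movies TI, NH, SW, ST.
n_users, n_movies, k = 3, 4, 2
w0 = 3.0
w = [0.2, -0.1, 0.4, 0.5, 0.0, -0.3, 0.1]
V = [[0.1 * (c + 1), -0.2 * (c - 2)] for c in range(n_users + n_movies)]

u, i = 1, 2                      # user B, movie SW
x = [0.0] * (n_users + n_movies)
x[u] = 1.0
x[n_users + i] = 1.0

fm = fm_predict(x, w0, w, V)
mf_with_biases = w0 + w[u] + w[n_users + i] + dot(V[u], V[n_users + i])
print(fm, mf_with_biases)
```

All pairwise terms vanish except the one for the active user column and the active movie column, which is why the two numbers agree.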

RDF-Triple Prediction with Factorization Machines
Three categorical variables encoded with real-valued predictor variables:
[Design matrix: each feature vector x(1), ..., x(7) has one indicator set in each of the Subject (S1, S2, S3, ...), Predicate (P1, P2, P3, P4, ...) and Object (O1, O2, O3, O4, ...) blocks]
With this data, the FM is equivalent to the PITF model:
$$\hat{y}(x) := w_0 + w_s + w_p + w_o + \langle v_s, v_p \rangle + \langle v_s, v_o \rangle + \langle v_p, v_o \rangle$$
[PITF: Rendle et al. 2010, WSDM Best Student Paper, ECML 2009 Best DC Award]

Time with Factorization Machines
Two categorical variables and time as linear predictor:
[Design matrix: one-hot User (A, B, C, ...) and Movie (TI, NH, SW, ST, ...) blocks plus one real-valued Time column with values such as 0.2, 0.6, 0.61, 0.3, 0.5, 0.1, 0.8]
The FM model would correspond to:
$$\hat{y}(x) := w_0 + w_i + w_u + t\, w_{\text{time}} + \langle v_u, v_i \rangle + t\, \langle v_u, v_{\text{time}} \rangle + t\, \langle v_i, v_{\text{time}} \rangle$$

Time with Factorization Machines
Two categorical variables and time discretized in bins (b(t)):
[Design matrix: one-hot User (A, B, C, ...) and Movie (TI, NH, SW, ST, ...) blocks plus a one-hot Time block with bins T1, T2, T3]
The FM model would correspond to (footnote 2):
$$\hat{y}(x) := w_0 + w_i + w_u + w_{b(t)} + \langle v_u, v_i \rangle + \langle v_u, v_{b(t)} \rangle + \langle v_i, v_{b(t)} \rangle$$
Footnote 2: libFM, k = 128, MCMC inference, Netflix RMSE = 0.8873

SVD++
[Design matrix: one-hot User (A, B, C, ...) and Movie (TI, NH, SW, ST, ...) blocks plus an "Other Movies rated" block (TI, NH, SW, ST, ...) with normalized indicators, e.g. 0.3 0.3 0.3 0 for x(3) and 0.5 0 0.5 0 for x(7)]
With this data, the FM (footnote 3) is identical to:
$$\hat{y}(x) = \underbrace{w_0 + w_u + w_i + \langle v_u, v_i \rangle + \frac{1}{\sqrt{|N_u|}} \sum_{l \in N_u} \langle v_i, v_l \rangle}_{\text{SVD++}} + \frac{1}{\sqrt{|N_u|}} \sum_{l \in N_u} \left( w_l + \langle v_u, v_l \rangle + \frac{1}{\sqrt{|N_u|}} \sum_{l' \in N_u,\, l' > l} \langle v_l, v_{l'} \rangle \right)$$
Footnote 3: libFM, k = 128, MCMC inference, Netflix RMSE = 0.8865
[Koren, 2008]

Factorizing Personalized Markov Chains (FPMC)
Two categorical variables (u, i), one set-categorical variable (B_{t-1}):
[Design matrix: one-hot User (u1, u2, u3, ...) and Product (A, B, C, D, ...) blocks plus a "Last Basket" block (A, B, C, D, ...) with normalized indicators, e.g. 0.5 0.5 0 0 for x(3) and 1 0 0 0 for x(7)]
Sequential baskets:
u1: {A, B} → {C}
u2: {C} → {D}
u3: {A} → {C}
FM is equivalent to:
$$\hat{y}(x) := w_0 + w_u + w_i + \frac{1}{|B_{t-1}|} \sum_{j \in B_{t-1}} w_j + \langle v_u, v_i \rangle + \frac{1}{|B_{t-1}|} \sum_{j \in B_{t-1}} \langle v_i, v_j \rangle + \dots$$
[Rendle et al. 2010, WWW Best Paper]

Computation Complexity
Factorization Machine model equation:
$$\hat{y}(x) := w_0 + \sum_{i=1}^{p} w_i x_i + \sum_{i=1}^{p} \sum_{j>i}^{p} \langle v_i, v_j \rangle\, x_i x_j$$
- Trivial computation: O(p^2 k)
- Efficient computation can be done in: O(p k)
- Making use of the many zeros in x, even in: O(Nz(x) k), where Nz(x) is the number of non-zero elements in vector x.

Efficient Computation
The model equation of an FM can be computed in O(p k).
Proof:
$$\hat{y}(x) := w_0 + \sum_{i=1}^{p} w_i x_i + \sum_{i=1}^{p} \sum_{j>i}^{p} \langle v_i, v_j \rangle\, x_i x_j$$
$$= w_0 + \sum_{i=1}^{p} w_i x_i + \frac{1}{2} \sum_{f=1}^{k} \left[ \left( \sum_{i=1}^{p} x_i v_{i,f} \right)^2 - \sum_{i=1}^{p} (x_i v_{i,f})^2 \right]$$
- In the sums over i, only non-zero x_i elements have to be summed up ⇒ O(Nz(x) k).
- (The complexity of polynomial regression is O(Nz(x)^2).)
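
The identity in the proof can be verified with a short sketch that computes the pairwise term both ways. The sparse input is stored as a dict of non-zero entries, so the fast variant only touches Nz(x) entries per factor; the factor matrix `V` and the input `x` are arbitrary test values.

```python
def fm_pairwise_naive(x, V):
    """sum_{i<j} <v_i, v_j> x_i x_j, computed in O(p^2 k)."""
    p, k = len(V), len(V[0])
    return sum(V[i][f] * V[j][f] * x.get(i, 0.0) * x.get(j, 0.0)
               for i in range(p) for j in range(i + 1, p) for f in range(k))


def fm_pairwise_fast(x, V):
    """Same quantity via 0.5 * sum_f [(sum_i x_i v_if)^2 - sum_i (x_i v_if)^2].

    x is a sparse vector {index: value}, so the cost is O(Nz(x) * k).
    """
    k = len(V[0])
    total = 0.0
    for f in range(k):
        s = sum(value * V[i][f] for i, value in x.items())
        s_sq = sum((value * V[i][f]) ** 2 for i, value in x.items())
        total += 0.5 * (s * s - s_sq)
    return total


V = [[0.3, -0.1], [0.2, 0.4], [-0.5, 0.6], [0.1, 0.1], [0.7, -0.2]]
x = {0: 1.0, 2: 1.0, 4: 0.5}     # only non-zero entries are stored
print(fm_pairwise_naive(x, V), fm_pairwise_fast(x, V))
```

Both functions print the same value up to floating-point rounding; only the second one scales linearly in the number of non-zero entries.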

Multilinearity
FMs are multilinear:
$$\forall \theta \in \Theta = \{w_0, w_1, \dots, w_p, v_{1,1}, \dots, v_{p,k}\}: \quad \hat{y}(x, \theta) = h_{(\theta)}(x)\, \theta + g_{(\theta)}(x)$$
where $g_{(\theta)}$ and $h_{(\theta)}$ do not depend on the value of $\theta$.
E.g. for second-order effects ($\theta = v_{l,f}$):
$$\hat{y}(x, v_{l,f}) := \underbrace{w_0 + \sum_{i=1}^{p} w_i x_i + \sum_{i=1}^{p} \sum_{j=i+1}^{p} \sum_{\substack{f'=1 \\ (f' \neq f) \lor (l \notin \{i,j\})}}^{k} v_{i,f'}\, v_{j,f'}\, x_i x_j}_{g_{(v_{l,f})}(x)} + v_{l,f}\, \underbrace{x_l \sum_{i=1,\, i \neq l} v_{i,f}\, x_i}_{h_{(v_{l,f})}(x)}$$
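
A small numerical check of this property for θ = v_{l,f}: the sketch computes h from the formula above, recovers g as the remainder, and then confirms that the prediction for a different value of θ lies on the line h·θ + g. The parameter values and the choice l = 1, f = 0 (zero-based indices) are arbitrary.

```python
def fm_predict(x, w0, w, V):
    p, k = len(x), len(V[0])
    y = w0 + sum(w[i] * x[i] for i in range(p))
    y += sum(V[i][f] * V[j][f] * x[i] * x[j]
             for i in range(p) for j in range(i + 1, p) for f in range(k))
    return y


def h_v(x, V, l, f):
    """h for theta = v_{l,f}: x_l * sum_{i != l} v_{i,f} x_i."""
    return x[l] * sum(V[i][f] * x[i] for i in range(len(x)) if i != l)


x = [1.0, 0.5, 2.0]
w0, w = 0.1, [0.2, 0.3, -0.4]
V = [[0.5, -1.0], [1.5, 0.2], [-0.3, 0.8]]
l, f = 1, 0

h = h_v(x, V, l, f)
g = fm_predict(x, w0, w, V) - h * V[l][f]     # g collects everything else

# Changing theta moves the prediction along a straight line with slope h.
V[l][f] = 4.0
print(fm_predict(x, w0, w, V), h * 4.0 + g)
```

Since h and g were computed before θ was changed, the match shows that neither depends on the value of θ itself.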

Learning
Using these properties, learning algorithms can be developed:
- L2-regularized regression and classification:
  - Stochastic gradient descent [Rendle, 2010]
  - Alternating least squares / coordinate descent [Rendle et al., 2011, Rendle 2012]
  - Markov Chain Monte Carlo (for Bayesian FMs) [Freudenthaler et al. 2011, Rendle 2012]
- L2-regularized ranking:
  - Stochastic gradient descent [Rendle, 2010]
All the proposed learning algorithms have a runtime of O(k Nz(X) i), where i is the number of iterations and Nz(X) is the number of non-zero elements in the design matrix X.

Stochastic Gradient Descent (SGD) [Rendle, 2010]
- For each training case $(x, y) \in S$, SGD updates the FM model parameter $\theta$ using:
$$\theta' = \theta - \eta \left( (\hat{y}(x) - y)\, h_{(\theta)}(x) + \lambda_{(\theta)}\, \theta \right)$$
- $\eta$ is the learning rate / step size.
- $\lambda_{(\theta)}$ is the regularization value of the parameter $\theta$.
- SGD can easily be applied to other loss functions.
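
A minimal sketch of this update for squared loss is shown below. It makes several assumptions that are not in the slides: a single shared regularization value `lam`, no regularization on w0, a fixed step size `eta`, and repeated updates on one made-up training case just to show the error shrinking.

```python
def fm_predict(x, w0, w, V):
    """x is sparse: {index: value}. Uses the O(Nz(x) k) form."""
    y = w0 + sum(w[i] * xi for i, xi in x.items())
    for f in range(len(V[0])):
        s = sum(V[i][f] * xi for i, xi in x.items())
        s_sq = sum((V[i][f] * xi) ** 2 for i, xi in x.items())
        y += 0.5 * (s * s - s_sq)
    return y


def sgd_step(x, y, w0, w, V, eta=0.05, lam=0.01):
    """One SGD update on a single case (x, y) for squared loss.

    eta (step size) and lam (one shared regularization value) are
    illustrative defaults, not values from the slides.
    """
    err = fm_predict(x, w0, w, V) - y
    k = len(V[0])
    sums = [sum(V[i][f] * xi for i, xi in x.items()) for f in range(k)]
    w0 -= eta * err                                   # h = 1, unregularized here
    for i, xi in x.items():
        w[i] -= eta * (err * xi + lam * w[i])         # h = x_i
        for f in range(k):
            h = xi * (sums[f] - V[i][f] * xi)         # h for v_{i,f}
            V[i][f] -= eta * (err * h + lam * V[i][f])
    return w0


w0, w = 0.0, [0.0] * 4
V = [[0.1, -0.1], [0.05, 0.1], [-0.1, 0.2], [0.1, 0.1]]
x, y = {0: 1.0, 2: 1.0}, 4.0
before = abs(fm_predict(x, w0, w, V) - y)
for _ in range(50):
    w0 = sgd_step(x, y, w0, w, V)
after = abs(fm_predict(x, w0, w, V) - y)
print(before, after)
```

Only the parameters belonging to non-zero entries of x are touched, so one step costs O(Nz(x) k).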

Coordinate Descent (CD) [Rendle et al., 2011]
- CD updates each FM model parameter $\theta$ using:
$$\theta' = \frac{\sum_{(x,y) \in S} \left( y - g_{(\theta)}(x) \right) h_{(\theta)}(x)}{\sum_{(x,y) \in S} h^2_{(\theta)}(x) + \lambda_{(\theta)}}$$
- Using caches of intermediate results, the runtime for updating all model parameters is O(k Nz(X)).
- CD can be extended to classification [Rendle, 2012].
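
The closed-form update can be illustrated for a single linear weight θ = w_l, where h(x) = x_l. This sketch deliberately recomputes every prediction from scratch and therefore does not achieve the cached O(k Nz(X)) runtime; the data set `S`, the parameters, and `lam` are invented for the example.

```python
def fm_predict(x, w0, w, V):
    p, k = len(x), len(V[0])
    y = w0 + sum(w[i] * x[i] for i in range(p))
    y += sum(V[i][f] * V[j][f] * x[i] * x[j]
             for i in range(p) for j in range(i + 1, p) for f in range(k))
    return y


def cd_update_w(S, w0, w, V, l, lam):
    """Closed-form update of w_l with all other parameters held fixed.

    For theta = w_l we have h(x) = x_l and g(x) = y_hat(x) - h(x) * theta.
    """
    num, den = 0.0, lam
    for x, y in S:
        h = x[l]
        g = fm_predict(x, w0, w, V) - h * w[l]
        num += (y - g) * h
        den += h * h
    w[l] = num / den


def objective(S, w0, w, V, l, lam):
    """Squared loss plus the L2 term of the one coordinate being updated."""
    return sum((fm_predict(x, w0, w, V) - y) ** 2 for x, y in S) + lam * w[l] ** 2


S = [([1.0, 0.0, 1.0], 5.0), ([1.0, 1.0, 0.0], 3.0), ([0.0, 1.0, 1.0], 4.0)]
w0, w = 0.0, [0.0, 0.0, 0.0]
V = [[0.1, 0.2], [0.3, -0.1], [0.2, 0.2]]
cd_update_w(S, w0, w, V, l=0, lam=0.5)
best = objective(S, w0, w, V, 0, 0.5)
print(w[0], best)
```

After the update, moving w_0 slightly in either direction increases the regularized squared loss, which is what a coordinate-wise minimum should look like.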

Gibbs Sampling (MCMC) [Freudenthaler et al. 2011, Rendle 2012]
- Gibbs sampling with a block for each FM model parameter $\theta$:
$$\theta \mid S, \Theta \setminus \{\theta\} \sim \mathcal{N}\left( \frac{\sum_{(x,y) \in S} \left( y - g_{(\theta)}(x) \right) h_{(\theta)}(x)}{\sum_{(x,y) \in S} h^2_{(\theta)}(x) + \lambda_{(\theta)}}, \; \frac{1}{\sum_{(x,y) \in S} h^2_{(\theta)}(x) + \lambda_{(\theta)}} \right)$$
- Mean is the same as for CD ⇒ computational complexity is also O(k Nz(X)).
- MCMC can be extended to classification using link functions.

Learning Regularization Values
[Figure: two graphical models over x_ij, y_i (i = 1, ..., n) and w_0, w_j, v_j (j = 1, ..., p). Left: standard FM with priors. Right: two-level FM with hyperpriors.]
[Freudenthaler et al., 2011]

libFM Software
libFM is an implementation of FMs [http://www.libfm.org/]
- Model: second-order FMs
- Learning/inference: SGD, ALS, MCMC
- Classification and regression
- Uses the same data format as LIBSVM, LIBLINEAR [Lin et al.], SVMlight [Joachims].
- Supports variable grouping.
- Open source: GPLv3.
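
As a sketch of that sparse text format, each case becomes one line of the form `<target> <index>:<value> ...` listing only the non-zero features. The helper `to_libsvm_line`, the zero-based column numbering, and the column layout are assumptions made for this illustration.

```python
def to_libsvm_line(target, x):
    """Format one case as '<target> <index>:<value> ...' with sorted indices."""
    features = " ".join(f"{i}:{v:g}" for i, v in sorted(x.items()))
    return f"{target:g} {features}"


# Columns 0-2: users A, B, C; columns 3-6: movies TI, NH, SW, ST.
cases = [(5, {0: 1, 3: 1}), (3, {0: 1, 4: 1}), (1, {0: 1, 5: 1}), (4, {1: 1, 5: 1})]
lines = [to_libsvm_line(y, x) for y, x in cases]
print("\n".join(lines))
```

The first line reads `5 0:1 3:1`, i.e. rating 5 with the indicators for user A and movie TI set.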

Outline
Factorization Models & Polynomial Regression
Factorization Machines
Applications
  Recommender Systems
  Link Prediction in Social Networks
  Clickthrough Prediction
  Personalized Ranking
  Student Performance Prediction
  Kaggle Competitions
Summary

(Context-aware) Rating Prediction
- Main variables:
  - User ID (categorical)
  - Item ID (categorical)
- Additional variables:
  - time
  - mood
  - user profile
  - item meta data
  - ...
- Examples: Netflix Prize, MovieLens, KDDCup 2011
[Figure: User + Song + Time + Mood]

Netflix Prize
[Figure: Netflix Prize prediction error (RMS error on the public leaderboard, axis from 0.86 to 0.90) of FMs with predictor variables "user, movie"; "user, movie, day"; "user, movie, impl."; "user, movie, day, impl."; "user, movie, day, impl., freq, lin. day"; with reference lines for SGD Matrix Factorization and the $1M Prize]
- k = 128 factors, 512 MCMC samples (no burn-in phase, initialization from random)
- MCMC inference (no hyperparameters (learning rate, regularization) to specify)

Netflix Prize

Method (Name) | Ref. | Learning Method | k | Quiz RMSE

Models using user ID and item ID
Probabilistic Matrix Factorization | [14, 13] | Batch GD | 40 | *0.9170
Probabilistic Matrix Factorization | [14, 13] | Batch GD | 150 | 0.9211
Matrix Factorization | [6] | Variational Bayes | 30 | *0.9141
Matchbox | [15] | Variational Bayes | 50 | *0.9100
ALS-MF | [7] | ALS | 100 | 0.9079
ALS-MF | [7] | ALS | 1000 | *0.9018
SVD/MF | [3] | SGD | 100 | 0.9025
SVD/MF | [3] | SGD | 200 | *0.9009
Bayesian Probabilistic Matrix Factorization (BPMF) | [13] | MCMC | 150 | 0.8965
Bayesian Probabilistic Matrix Factorization (BPMF) | [13] | MCMC | 300 | *0.8954
FM, pred. var: user ID, movie ID | - | MCMC | 128 | 0.8937

Models using implicit feedback
Probabilistic Matrix Factorization with Constraints | [14] | Batch GD | 30 | *0.9016
SVD++ | [3] | SGD | 100 | 0.8924
SVD++ | [3] | SGD | 200 | *0.8911
BSRM/F | [18] | MCMC | 100 | 0.8926
BSRM/F | [18] | MCMC | 400 | *0.8874
FM, pred. var: user ID, movie ID, impl. | - | MCMC | 128 | 0.8865

Models using time information
Bayesian Probabilistic Tensor Factorization (BPTF) | [17] | MCMC | 30 | *0.9044
FM, pred. var: user ID, movie ID, day | - | MCMC | 128 | 0.8873

Models using time and implicit feedback
timeSVD++ | [5] | SGD | 100 | 0.8805
timeSVD++ | [5] | SGD | 200 | *0.8799
FM, pred. var: user ID, movie ID, day, impl. | - | MCMC | 128 | 0.8809
FM, pred. var: user ID, movie ID, day, impl. | - | MCMC | 256 | 0.8794

Assorted models
BRISMF/UM NB corrected | [16] | SGD | 1000 | *0.8904
BMFSI plus side information | [8] | MCMC | 100 | *0.8875
timeSVD++ plus frequencies | [4] | SGD | 200 | 0.8777
timeSVD++ plus frequencies | [4] | SGD | 2000 | *0.8762
FM, pred. var: user ID, movie ID, day, impl., freq., lin. day | - | MCMC | 128 | 0.8779
FM, pred. var: user ID, movie ID, day, impl., freq., lin. day | - | MCMC | 256 | 0.8771

Link Prediction in Social Networks
- Main variables:
  - Actor A ID
  - Actor B ID
- Additional variables:
  - profiles
  - actions
  - ...
[Figure: Actor A + Actor B]

KDDCup 2012: Track 1
[Figure: KDDCup 2012 Track 1 prediction quality (Mean Average Precision @3, axis from 0.32 to 0.42) on the public and private leaderboards for FMs using the additional variables "none"; "gender, age, ..."; "keywords"; "friends"; "all"; with reference lines for the Top 1, Top 5, Top 10 and Top 100 leaderboard positions]
- k = 22 factors, 512 MCMC samples (no burn-in phase, initialization from random)
- MCMC inference (no hyperparameters (learning rate, regularization) to specify)
[Awarded 2nd place (out of 658 teams)]

Clickthrough Prediction
- Main variables:
  - User ID
  - Query ID
  - Ad/Link ID
- Additional variables:
  - query tokens
  - user profile
  - ...
[Figure: User + Query (keyword...) + Ad/Link (Link 1, Link 2, Link 3)]
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
KDDCup 2012: Track 2 
Model Inference wAUC (public) wAUC (private) 
ID-based model (k = 0) SGD 0.78050 0.78086 
Attribute-based model (k = 8) MCMC 0.77409 0.77555 
Mixed model (k = 8) SGD 0.79011 0.79321 
Final ensemble n/a 0.79857 0.80178 
Ensemble
- Rank positions (not predicted clickthrough rates) are used.
- The MCMC attribute-based model and different variations of the SGD models are included.
[Awarded 3rd place (out of 171 teams)]
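A rank-based ensemble can be sketched as follows: each model's scores are converted to rank positions, and the ranks are combined. The slide does not say how the ranks were combined, so the unweighted average used here is an assumption, and `to_ranks` and `rank_average` are illustrative names.

```python
from typing import List, Sequence


def to_ranks(scores: Sequence[float]) -> List[float]:
    """Rank positions in ascending score order (0 = lowest); ties share the mean rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    start = 0
    while start < len(order):
        end = start
        # Extend the group while the next score equals the current one.
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[start]]:
            end += 1
        mean_rank = (start + end) / 2.0
        for pos in range(start, end + 1):
            ranks[order[pos]] = mean_rank
        start = end + 1
    return ranks


def rank_average(model_scores: Sequence[Sequence[float]]) -> List[float]:
    """Average the per-model rank positions (assumed combination rule)."""
    per_model = [to_ranks(s) for s in model_scores]
    n_models = len(per_model)
    return [sum(r[i] for r in per_model) / n_models for i in range(len(per_model[0]))]


sgd_scores = [0.02, 0.30, 0.10]   # made-up outputs of one model
mcmc_scores = [-1.5, 0.2, 0.9]    # made-up outputs of another model, on a different scale
print(rank_average([sgd_scores, mcmc_scores]))
```

Because AUC depends only on the ordering of instances, working with ranks puts models with differently scaled outputs on a common footing before they are combined.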
ECML/PKDD Discovery Challenge 2013
- Problem: Recommend given names.
- Main variables:
  - User ID
  - Name ID
- Additional variables:
  - session info
  - string representation for each name
  - ...
- FM approach won 1st place (online track) and 2nd place (offline track).
Steen Rendle 48 / 53

More Related Content

What's hot

Deep Learning for Recommender Systems
Deep Learning for Recommender SystemsDeep Learning for Recommender Systems
Deep Learning for Recommender Systems
Justin Basilico
 
Lecture 3: Basic Concepts of Machine Learning - Induction & Evaluation
Lecture 3: Basic Concepts of Machine Learning - Induction & EvaluationLecture 3: Basic Concepts of Machine Learning - Induction & Evaluation
Lecture 3: Basic Concepts of Machine Learning - Induction & Evaluation
Marina Santini
 
Deep Learning for NLP (without Magic) - Richard Socher and Christopher Manning
Deep Learning for NLP (without Magic) - Richard Socher and Christopher ManningDeep Learning for NLP (without Magic) - Richard Socher and Christopher Manning
Deep Learning for NLP (without Magic) - Richard Socher and Christopher Manning
BigDataCloud
 
Understanding GloVe
Understanding GloVeUnderstanding GloVe
Understanding GloVe
JEE HYUN PARK
 
Recommender Systems
Recommender SystemsRecommender Systems
Recommender Systems
Carlos Castillo (ChaTo)
 
An introduction to Recommender Systems
An introduction to Recommender SystemsAn introduction to Recommender Systems
An introduction to Recommender Systems
David Zibriczky
 
Bpr bayesian personalized ranking from implicit feedback
Bpr bayesian personalized ranking from implicit feedbackBpr bayesian personalized ranking from implicit feedback
Bpr bayesian personalized ranking from implicit feedback
Park JunPyo
 
Recurrent Neural Networks (RNN) | RNN LSTM | Deep Learning Tutorial | Tensorf...
Recurrent Neural Networks (RNN) | RNN LSTM | Deep Learning Tutorial | Tensorf...Recurrent Neural Networks (RNN) | RNN LSTM | Deep Learning Tutorial | Tensorf...
Recurrent Neural Networks (RNN) | RNN LSTM | Deep Learning Tutorial | Tensorf...
Edureka!
 
Deep Generative Models
Deep Generative Models Deep Generative Models
Deep Generative Models
Chia-Wen Cheng
 
Word2Vec
Word2VecWord2Vec
Representation Learning of Text for NLP
Representation Learning of Text for NLPRepresentation Learning of Text for NLP
Representation Learning of Text for NLP
Anuj Gupta
 
Bert
BertBert
Calibrated Recommendations
Calibrated RecommendationsCalibrated Recommendations
Calibrated Recommendations
Harald Steck
 
Deep Learning in Recommender Systems - RecSys Summer School 2017
Deep Learning in Recommender Systems - RecSys Summer School 2017Deep Learning in Recommender Systems - RecSys Summer School 2017
Deep Learning in Recommender Systems - RecSys Summer School 2017
Balázs Hidasi
 
Using SHAP to Understand Black Box Models
Using SHAP to Understand Black Box ModelsUsing SHAP to Understand Black Box Models
Using SHAP to Understand Black Box Models
Jonathan Bechtel
 
Word2Vec
Word2VecWord2Vec
Word2Vec
hyunyoung Lee
 
Neural Field aware Factorization Machine
Neural Field aware Factorization MachineNeural Field aware Factorization Machine
Neural Field aware Factorization Machine
InMobi
 
Learning to rank
Learning to rankLearning to rank
Learning to rank
Bruce Kuo
 
Sequential Decision Making in Recommendations
Sequential Decision Making in RecommendationsSequential Decision Making in Recommendations
Sequential Decision Making in Recommendations
Jaya Kawale
 
Matrix Factorization
Matrix FactorizationMatrix Factorization
Matrix Factorization
Yusuke Yamamoto
 

What's hot (20)

Deep Learning for Recommender Systems
Deep Learning for Recommender SystemsDeep Learning for Recommender Systems
Deep Learning for Recommender Systems
 
Lecture 3: Basic Concepts of Machine Learning - Induction & Evaluation
Lecture 3: Basic Concepts of Machine Learning - Induction & EvaluationLecture 3: Basic Concepts of Machine Learning - Induction & Evaluation
Lecture 3: Basic Concepts of Machine Learning - Induction & Evaluation
 
Deep Learning for NLP (without Magic) - Richard Socher and Christopher Manning
Deep Learning for NLP (without Magic) - Richard Socher and Christopher ManningDeep Learning for NLP (without Magic) - Richard Socher and Christopher Manning
Deep Learning for NLP (without Magic) - Richard Socher and Christopher Manning
 
Understanding GloVe
Understanding GloVeUnderstanding GloVe
Understanding GloVe
 
Recommender Systems
Recommender SystemsRecommender Systems
Recommender Systems
 
An introduction to Recommender Systems
An introduction to Recommender SystemsAn introduction to Recommender Systems
An introduction to Recommender Systems
 
Bpr bayesian personalized ranking from implicit feedback
Bpr bayesian personalized ranking from implicit feedbackBpr bayesian personalized ranking from implicit feedback
Bpr bayesian personalized ranking from implicit feedback
 
Recurrent Neural Networks (RNN) | RNN LSTM | Deep Learning Tutorial | Tensorf...
Recurrent Neural Networks (RNN) | RNN LSTM | Deep Learning Tutorial | Tensorf...Recurrent Neural Networks (RNN) | RNN LSTM | Deep Learning Tutorial | Tensorf...
Recurrent Neural Networks (RNN) | RNN LSTM | Deep Learning Tutorial | Tensorf...
 
Deep Generative Models
Deep Generative Models Deep Generative Models
Deep Generative Models
 
Word2Vec
Word2VecWord2Vec
Word2Vec
 
Representation Learning of Text for NLP
Representation Learning of Text for NLPRepresentation Learning of Text for NLP
Representation Learning of Text for NLP
 
Bert
BertBert
Bert
 
Calibrated Recommendations
Calibrated RecommendationsCalibrated Recommendations
Calibrated Recommendations
 
Deep Learning in Recommender Systems - RecSys Summer School 2017
Deep Learning in Recommender Systems - RecSys Summer School 2017Deep Learning in Recommender Systems - RecSys Summer School 2017
Deep Learning in Recommender Systems - RecSys Summer School 2017
 
Using SHAP to Understand Black Box Models
Using SHAP to Understand Black Box ModelsUsing SHAP to Understand Black Box Models
Using SHAP to Understand Black Box Models
 
Word2Vec
Word2VecWord2Vec
Word2Vec
 
Neural Field aware Factorization Machine
Neural Field aware Factorization MachineNeural Field aware Factorization Machine
Neural Field aware Factorization Machine
 
Learning to rank
Learning to rankLearning to rank
Learning to rank
 
Sequential Decision Making in Recommendations
Sequential Decision Making in RecommendationsSequential Decision Making in Recommendations
Sequential Decision Making in Recommendations
 
Matrix Factorization
Matrix FactorizationMatrix Factorization
Matrix Factorization
 

Viewers also liked

Factorization Machines with libFM
Factorization Machines with libFMFactorization Machines with libFM
Factorization Machines with libFM
Liangjie Hong
 
Recommender Systems (Machine Learning Summer School 2014 @ CMU)
Recommender Systems (Machine Learning Summer School 2014 @ CMU)Recommender Systems (Machine Learning Summer School 2014 @ CMU)
Recommender Systems (Machine Learning Summer School 2014 @ CMU)
Xavier Amatriain
 
Building a Recommendation Engine - An example of a product recommendation engine
Building a Recommendation Engine - An example of a product recommendation engineBuilding a Recommendation Engine - An example of a product recommendation engine
Building a Recommendation Engine - An example of a product recommendation engine
NYC Predictive Analytics
 
Recommender system algorithm and architecture
Recommender system algorithm and architectureRecommender system algorithm and architecture
Recommender system algorithm and architecture
Liang Xiang
 
Ted Dunning, Chief Application Architect, MapR at MLconf SF
Ted Dunning, Chief Application Architect, MapR at MLconf SFTed Dunning, Chief Application Architect, MapR at MLconf SF
Ted Dunning, Chief Application Architect, MapR at MLconf SF
MLconf
 
Ted Willke, Senior Principal Engineer & GM, Datacenter Group, Intel at MLconf SF
Ted Willke, Senior Principal Engineer & GM, Datacenter Group, Intel at MLconf SFTed Willke, Senior Principal Engineer & GM, Datacenter Group, Intel at MLconf SF
Ted Willke, Senior Principal Engineer & GM, Datacenter Group, Intel at MLconf SF
MLconf
 
Scott Clark, Software Engineer, Yelp at MLconf SF
Scott Clark, Software Engineer, Yelp at MLconf SFScott Clark, Software Engineer, Yelp at MLconf SF
Scott Clark, Software Engineer, Yelp at MLconf SF
MLconf
 
Lise Getoor, Professor, Computer Science, UC Santa Cruz at MLconf SF
Lise Getoor, Professor, Computer Science, UC Santa Cruz at MLconf SFLise Getoor, Professor, Computer Science, UC Santa Cruz at MLconf SF
Lise Getoor, Professor, Computer Science, UC Santa Cruz at MLconf SF
MLconf
 
Warsaw Data Science - Factorization Machines Introduction
Warsaw Data Science -  Factorization Machines IntroductionWarsaw Data Science -  Factorization Machines Introduction
Warsaw Data Science - Factorization Machines Introduction
Bartlomiej Twardowski
 
Quoc Le, Software Engineer, Google at MLconf SF
Quoc Le, Software Engineer, Google at MLconf SFQuoc Le, Software Engineer, Google at MLconf SF
Quoc Le, Software Engineer, Google at MLconf SF
MLconf
 
MLconf - Distributed Deep Learning for Classification and Regression Problems...
MLconf - Distributed Deep Learning for Classification and Regression Problems...MLconf - Distributed Deep Learning for Classification and Regression Problems...
MLconf - Distributed Deep Learning for Classification and Regression Problems...
Sri Ambati
 
Ameet Talwalkar, assistant professor of Computer Science, UCLA at MLconf SF
Ameet Talwalkar, assistant professor of Computer Science, UCLA at MLconf SFAmeet Talwalkar, assistant professor of Computer Science, UCLA at MLconf SF
Ameet Talwalkar, assistant professor of Computer Science, UCLA at MLconf SF
MLconf
 
10 Lessons Learned from Building Machine Learning Systems
10 Lessons Learned from Building Machine Learning Systems10 Lessons Learned from Building Machine Learning Systems
10 Lessons Learned from Building Machine Learning Systems
Xavier Amatriain
 
NIPS2010読み会: A New Probabilistic Model for Rank Aggregation
NIPS2010読み会: A New Probabilistic Model for Rank AggregationNIPS2010読み会: A New Probabilistic Model for Rank Aggregation
NIPS2010読み会: A New Probabilistic Model for Rank Aggregation
sleepy_yoshi
 
Evan Estola – Data Scientist, Meetup.com at MLconf ATL
Evan Estola – Data Scientist, Meetup.com at MLconf ATLEvan Estola – Data Scientist, Meetup.com at MLconf ATL
Evan Estola – Data Scientist, Meetup.com at MLconf ATL
MLconf
 
Naive Bayesian Text Classifier Event Models
Naive Bayesian Text Classifier Event ModelsNaive Bayesian Text Classifier Event Models
Naive Bayesian Text Classifier Event Models
DKALab
 
Steffen Rendle, Research Scientist, Google at MLconf SF
Steffen Rendle, Research Scientist, Google at MLconf SFSteffen Rendle, Research Scientist, Google at MLconf SF
Steffen Rendle, Research Scientist, Google at MLconf SF
MLconf
 
Warsaw Data Science - Recsys2016 Quick Review
Warsaw Data Science - Recsys2016 Quick ReviewWarsaw Data Science - Recsys2016 Quick Review
Warsaw Data Science - Recsys2016 Quick Review
Bartlomiej Twardowski
 
Ben Hamner, Co-founder and CTO, Kaggle at MLconf SF - 11/13/15
Ben Hamner, Co-founder and CTO, Kaggle at MLconf SF - 11/13/15Ben Hamner, Co-founder and CTO, Kaggle at MLconf SF - 11/13/15
Ben Hamner, Co-founder and CTO, Kaggle at MLconf SF - 11/13/15
MLconf
 
Recommendation Engine Demystified
Recommendation Engine DemystifiedRecommendation Engine Demystified
Recommendation Engine Demystified
DKALab
 

Viewers also liked (20)

Factorization Machines with libFM
Factorization Machines with libFMFactorization Machines with libFM
Factorization Machines with libFM
 
Recommender Systems (Machine Learning Summer School 2014 @ CMU)
Recommender Systems (Machine Learning Summer School 2014 @ CMU)Recommender Systems (Machine Learning Summer School 2014 @ CMU)
Recommender Systems (Machine Learning Summer School 2014 @ CMU)
 
Building a Recommendation Engine - An example of a product recommendation engine
Building a Recommendation Engine - An example of a product recommendation engineBuilding a Recommendation Engine - An example of a product recommendation engine
Building a Recommendation Engine - An example of a product recommendation engine
 
Recommender system algorithm and architecture
Recommender system algorithm and architectureRecommender system algorithm and architecture
Recommender system algorithm and architecture
 
Ted Dunning, Chief Application Architect, MapR at MLconf SF
Ted Dunning, Chief Application Architect, MapR at MLconf SFTed Dunning, Chief Application Architect, MapR at MLconf SF
Ted Dunning, Chief Application Architect, MapR at MLconf SF
 
Ted Willke, Senior Principal Engineer & GM, Datacenter Group, Intel at MLconf SF
Ted Willke, Senior Principal Engineer & GM, Datacenter Group, Intel at MLconf SFTed Willke, Senior Principal Engineer & GM, Datacenter Group, Intel at MLconf SF
Ted Willke, Senior Principal Engineer & GM, Datacenter Group, Intel at MLconf SF
 
Scott Clark, Software Engineer, Yelp at MLconf SF
Scott Clark, Software Engineer, Yelp at MLconf SFScott Clark, Software Engineer, Yelp at MLconf SF
Scott Clark, Software Engineer, Yelp at MLconf SF
 
Lise Getoor, Professor, Computer Science, UC Santa Cruz at MLconf SF
Lise Getoor, Professor, Computer Science, UC Santa Cruz at MLconf SFLise Getoor, Professor, Computer Science, UC Santa Cruz at MLconf SF
Lise Getoor, Professor, Computer Science, UC Santa Cruz at MLconf SF
 
Warsaw Data Science - Factorization Machines Introduction
Warsaw Data Science -  Factorization Machines IntroductionWarsaw Data Science -  Factorization Machines Introduction
Warsaw Data Science - Factorization Machines Introduction
 
Quoc Le, Software Engineer, Google at MLconf SF
Quoc Le, Software Engineer, Google at MLconf SFQuoc Le, Software Engineer, Google at MLconf SF
Quoc Le, Software Engineer, Google at MLconf SF
 
MLconf - Distributed Deep Learning for Classification and Regression Problems...
MLconf - Distributed Deep Learning for Classification and Regression Problems...MLconf - Distributed Deep Learning for Classification and Regression Problems...
MLconf - Distributed Deep Learning for Classification and Regression Problems...
 
Ameet Talwalkar, assistant professor of Computer Science, UCLA at MLconf SF
Ameet Talwalkar, assistant professor of Computer Science, UCLA at MLconf SFAmeet Talwalkar, assistant professor of Computer Science, UCLA at MLconf SF
Ameet Talwalkar, assistant professor of Computer Science, UCLA at MLconf SF
 
10 Lessons Learned from Building Machine Learning Systems
10 Lessons Learned from Building Machine Learning Systems10 Lessons Learned from Building Machine Learning Systems
10 Lessons Learned from Building Machine Learning Systems
 
NIPS2010読み会: A New Probabilistic Model for Rank Aggregation
NIPS2010読み会: A New Probabilistic Model for Rank AggregationNIPS2010読み会: A New Probabilistic Model for Rank Aggregation
NIPS2010読み会: A New Probabilistic Model for Rank Aggregation
 
Evan Estola – Data Scientist, Meetup.com at MLconf ATL
Evan Estola – Data Scientist, Meetup.com at MLconf ATLEvan Estola – Data Scientist, Meetup.com at MLconf ATL
Evan Estola – Data Scientist, Meetup.com at MLconf ATL
 
Naive Bayesian Text Classifier Event Models
Naive Bayesian Text Classifier Event ModelsNaive Bayesian Text Classifier Event Models
Naive Bayesian Text Classifier Event Models
 
Steffen Rendle, Research Scientist, Google at MLconf SF
Steffen Rendle, Research Scientist, Google at MLconf SFSteffen Rendle, Research Scientist, Google at MLconf SF
Steffen Rendle, Research Scientist, Google at MLconf SF
 
Warsaw Data Science - Recsys2016 Quick Review
Warsaw Data Science - Recsys2016 Quick ReviewWarsaw Data Science - Recsys2016 Quick Review
Warsaw Data Science - Recsys2016 Quick Review
 
Ben Hamner, Co-founder and CTO, Kaggle at MLconf SF - 11/13/15
Ben Hamner, Co-founder and CTO, Kaggle at MLconf SF - 11/13/15Ben Hamner, Co-founder and CTO, Kaggle at MLconf SF - 11/13/15
Ben Hamner, Co-founder and CTO, Kaggle at MLconf SF - 11/13/15
 
Recommendation Engine Demystified
Recommendation Engine DemystifiedRecommendation Engine Demystified
Recommendation Engine Demystified
 

Similar to Steffen Rendle, Research Scientist, Google at MLconf SF

Factorization Machines and Applications in Recommender Systems
Factorization Machines and Applications in Recommender SystemsFactorization Machines and Applications in Recommender Systems
Factorization Machines and Applications in Recommender Systems
Evgeniy Marinov
 
PF_IDETC_2012_Souma
PF_IDETC_2012_SoumaPF_IDETC_2012_Souma
PF_IDETC_2012_Souma
MDO_Lab
 
A Multiple Kernel Learning Based Fusion Framework for Real-Time Multi-View Ac...
A Multiple Kernel Learning Based Fusion Framework for Real-Time Multi-View Ac...A Multiple Kernel Learning Based Fusion Framework for Real-Time Multi-View Ac...
A Multiple Kernel Learning Based Fusion Framework for Real-Time Multi-View Ac...
Francisco (Paco) Florez-Revuelta
 
CP3_SDM_2010_Souma
CP3_SDM_2010_SoumaCP3_SDM_2010_Souma
CP3_SDM_2010_Souma
MDO_Lab
 
Применение машинного обучения для навигации и управления роботами
Применение машинного обучения для навигации и управления роботамиПрименение машинного обучения для навигации и управления роботами
Применение машинного обучения для навигации и управления роботами
Skolkovo Robotics Center
 
An_expected_improvement_criterion_for_the_global_optimization_of_a_noisy_comp...
An_expected_improvement_criterion_for_the_global_optimization_of_a_noisy_comp...An_expected_improvement_criterion_for_the_global_optimization_of_a_noisy_comp...
An_expected_improvement_criterion_for_the_global_optimization_of_a_noisy_comp...
Kanika Anand
 
GDRR Opening Workshop - Variance Reduction for Reliability Assessment with St...
GDRR Opening Workshop - Variance Reduction for Reliability Assessment with St...GDRR Opening Workshop - Variance Reduction for Reliability Assessment with St...
GDRR Opening Workshop - Variance Reduction for Reliability Assessment with St...
The Statistical and Applied Mathematical Sciences Institute
 
FMI output gap
FMI output gapFMI output gap
FMI output gap
ManfredNolte
 
Wp13105
Wp13105Wp13105
Wp13105
ManfredNolte
 
The study on mining temporal patterns and related applications in dynamic soc...
The study on mining temporal patterns and related applications in dynamic soc...The study on mining temporal patterns and related applications in dynamic soc...
The study on mining temporal patterns and related applications in dynamic soc...
Thanh Hieu
 
Compact 3 d_resist_models_general
Compact 3 d_resist_models_generalCompact 3 d_resist_models_general
Compact 3 d_resist_models_general
Christian Zuniga, PhD
 
PPT Image Analysis(IRDE, DRDO)
PPT Image Analysis(IRDE, DRDO)PPT Image Analysis(IRDE, DRDO)
PPT Image Analysis(IRDE, DRDO)
Nidhi Gopal
 
A walk through the intersection between machine learning and mechanistic mode...
A walk through the intersection between machine learning and mechanistic mode...A walk through the intersection between machine learning and mechanistic mode...
A walk through the intersection between machine learning and mechanistic mode...
JuanPabloCarbajal3
 
Phd Defense 2007
Phd Defense 2007Phd Defense 2007
Phd Defense 2007
Claudio Siviero
 
Comprehensive Product Platform Planning (CP3) - Souma - AIAA/SDM2010
Comprehensive Product Platform Planning (CP3) - Souma - AIAA/SDM2010Comprehensive Product Platform Planning (CP3) - Souma - AIAA/SDM2010
Comprehensive Product Platform Planning (CP3) - Souma - AIAA/SDM2010
Souma Chowdhury
 
AbdoSummerANS_mod3
AbdoSummerANS_mod3AbdoSummerANS_mod3
AbdoSummerANS_mod3
Mohammad Abdo
 
Prpagation of Error Bounds Across reduction interfaces
Prpagation of Error Bounds Across reduction interfacesPrpagation of Error Bounds Across reduction interfaces
Prpagation of Error Bounds Across reduction interfaces
Mohammad
 
Projection methods for stochastic structural dynamics
Projection methods for stochastic structural dynamicsProjection methods for stochastic structural dynamics
Projection methods for stochastic structural dynamics
University of Glasgow
 
Applying Model Checking Approach with Floating Point Arithmetic for Verificat...
Applying Model Checking Approach with Floating Point Arithmetic for Verificat...Applying Model Checking Approach with Floating Point Arithmetic for Verificat...
Applying Model Checking Approach with Floating Point Arithmetic for Verificat...
Sergey Staroletov
 
PF_MAO_2010_Souam
PF_MAO_2010_SouamPF_MAO_2010_Souam
PF_MAO_2010_Souam
MDO_Lab
 

Similar to Steffen Rendle, Research Scientist, Google at MLconf SF (20)

Factorization Machines and Applications in Recommender Systems
Factorization Machines and Applications in Recommender SystemsFactorization Machines and Applications in Recommender Systems
Factorization Machines and Applications in Recommender Systems
 
PF_IDETC_2012_Souma
PF_IDETC_2012_SoumaPF_IDETC_2012_Souma
PF_IDETC_2012_Souma
 
A Multiple Kernel Learning Based Fusion Framework for Real-Time Multi-View Ac...
A Multiple Kernel Learning Based Fusion Framework for Real-Time Multi-View Ac...A Multiple Kernel Learning Based Fusion Framework for Real-Time Multi-View Ac...
A Multiple Kernel Learning Based Fusion Framework for Real-Time Multi-View Ac...
 
CP3_SDM_2010_Souma
CP3_SDM_2010_SoumaCP3_SDM_2010_Souma
CP3_SDM_2010_Souma
 
Применение машинного обучения для навигации и управления роботами
Применение машинного обучения для навигации и управления роботамиПрименение машинного обучения для навигации и управления роботами
Применение машинного обучения для навигации и управления роботами
 
An_expected_improvement_criterion_for_the_global_optimization_of_a_noisy_comp...
An_expected_improvement_criterion_for_the_global_optimization_of_a_noisy_comp...An_expected_improvement_criterion_for_the_global_optimization_of_a_noisy_comp...
An_expected_improvement_criterion_for_the_global_optimization_of_a_noisy_comp...
 
GDRR Opening Workshop - Variance Reduction for Reliability Assessment with St...
GDRR Opening Workshop - Variance Reduction for Reliability Assessment with St...GDRR Opening Workshop - Variance Reduction for Reliability Assessment with St...
GDRR Opening Workshop - Variance Reduction for Reliability Assessment with St...
 
FMI output gap
FMI output gapFMI output gap
FMI output gap
 
Wp13105
Wp13105Wp13105
Wp13105
 
The study on mining temporal patterns and related applications in dynamic soc...
The study on mining temporal patterns and related applications in dynamic soc...The study on mining temporal patterns and related applications in dynamic soc...
The study on mining temporal patterns and related applications in dynamic soc...
 
Compact 3 d_resist_models_general
Compact 3 d_resist_models_generalCompact 3 d_resist_models_general
Compact 3 d_resist_models_general
 
PPT Image Analysis(IRDE, DRDO)
PPT Image Analysis(IRDE, DRDO)PPT Image Analysis(IRDE, DRDO)
PPT Image Analysis(IRDE, DRDO)
 
A walk through the intersection between machine learning and mechanistic mode...
A walk through the intersection between machine learning and mechanistic mode...A walk through the intersection between machine learning and mechanistic mode...
A walk through the intersection between machine learning and mechanistic mode...
 
Phd Defense 2007
Phd Defense 2007Phd Defense 2007
Phd Defense 2007
 
Comprehensive Product Platform Planning (CP3) - Souma - AIAA/SDM2010
Comprehensive Product Platform Planning (CP3) - Souma - AIAA/SDM2010Comprehensive Product Platform Planning (CP3) - Souma - AIAA/SDM2010
Comprehensive Product Platform Planning (CP3) - Souma - AIAA/SDM2010
 
AbdoSummerANS_mod3
AbdoSummerANS_mod3AbdoSummerANS_mod3
AbdoSummerANS_mod3
 
Prpagation of Error Bounds Across reduction interfaces
Prpagation of Error Bounds Across reduction interfacesPrpagation of Error Bounds Across reduction interfaces
Prpagation of Error Bounds Across reduction interfaces
 
Projection methods for stochastic structural dynamics
Projection methods for stochastic structural dynamicsProjection methods for stochastic structural dynamics
Projection methods for stochastic structural dynamics
 
Applying Model Checking Approach with Floating Point Arithmetic for Verificat...
Applying Model Checking Approach with Floating Point Arithmetic for Verificat...Applying Model Checking Approach with Floating Point Arithmetic for Verificat...
Applying Model Checking Approach with Floating Point Arithmetic for Verificat...
 
PF_MAO_2010_Souam
PF_MAO_2010_SouamPF_MAO_2010_Souam
PF_MAO_2010_Souam
 

More from MLconf

Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...
Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...
Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...
MLconf
 
Ted Willke - The Brain’s Guide to Dealing with Context in Language Understanding
Ted Willke - The Brain’s Guide to Dealing with Context in Language UnderstandingTed Willke - The Brain’s Guide to Dealing with Context in Language Understanding
Ted Willke - The Brain’s Guide to Dealing with Context in Language Understanding
MLconf
 
Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...
Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...
Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...
MLconf
 
Igor Markov - Quantum Computing: a Treasure Hunt, not a Gold Rush
Igor Markov - Quantum Computing: a Treasure Hunt, not a Gold RushIgor Markov - Quantum Computing: a Treasure Hunt, not a Gold Rush
Igor Markov - Quantum Computing: a Treasure Hunt, not a Gold Rush
MLconf
 
Josh Wills - Data Labeling as Religious Experience
Josh Wills - Data Labeling as Religious ExperienceJosh Wills - Data Labeling as Religious Experience
Josh Wills - Data Labeling as Religious Experience
MLconf
 
Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...
Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...
Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...
MLconf
 
Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...
Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...
Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...
MLconf
 
Meghana Ravikumar - Optimized Image Classification on the Cheap
Meghana Ravikumar - Optimized Image Classification on the CheapMeghana Ravikumar - Optimized Image Classification on the Cheap
Meghana Ravikumar - Optimized Image Classification on the Cheap
MLconf
 
Noam Finkelstein - The Importance of Modeling Data Collection
Noam Finkelstein - The Importance of Modeling Data CollectionNoam Finkelstein - The Importance of Modeling Data Collection
Noam Finkelstein - The Importance of Modeling Data Collection
MLconf
 
June Andrews - The Uncanny Valley of ML
June Andrews - The Uncanny Valley of MLJune Andrews - The Uncanny Valley of ML
June Andrews - The Uncanny Valley of ML
MLconf
 
Sneha Rajana - Deep Learning Architectures for Semantic Relation Detection Tasks
Sneha Rajana - Deep Learning Architectures for Semantic Relation Detection TasksSneha Rajana - Deep Learning Architectures for Semantic Relation Detection Tasks
Sneha Rajana - Deep Learning Architectures for Semantic Relation Detection Tasks
MLconf
 
Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...
Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...
Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...
MLconf
 
Vito Ostuni - The Voice: New Challenges in a Zero UI World
Vito Ostuni - The Voice: New Challenges in a Zero UI WorldVito Ostuni - The Voice: New Challenges in a Zero UI World
Vito Ostuni - The Voice: New Challenges in a Zero UI World
MLconf
 
Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...
Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...
Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...
MLconf
 
Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...
Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...
Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...
MLconf
 
Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...
Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...
Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...
MLconf
 
Neel Sundaresan - Teaching a machine to code
Neel Sundaresan - Teaching a machine to codeNeel Sundaresan - Teaching a machine to code
Neel Sundaresan - Teaching a machine to code
MLconf
 
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...
MLconf
 
Soumith Chintala - Increasing the Impact of AI Through Better Software
Soumith Chintala - Increasing the Impact of AI Through Better SoftwareSoumith Chintala - Increasing the Impact of AI Through Better Software
Soumith Chintala - Increasing the Impact of AI Through Better Software
MLconf
 
Roy Lowrance - Predicting Bond Prices: Regime Changes
Roy Lowrance - Predicting Bond Prices: Regime ChangesRoy Lowrance - Predicting Bond Prices: Regime Changes
Roy Lowrance - Predicting Bond Prices: Regime Changes
MLconf
 




Steffen Rendle, Research Scientist, Google at MLconf SF

  • 1. Factorization Machines. Steffen Rendle. Current affiliation: Google Inc. Work was done at University of Konstanz. MLConf, November 14, 2014.
  • 2. Outline: Factorization Models & Polynomial Regression (Factorization Models; Linear/Polynomial Regression; Comparison); Factorization Machines; Applications; Summary.
  • 4. Matrix Factorization. Example data: a user × movie rating matrix, where "?" marks an unobserved rating.

              TI  NH  SW  ST  ...
        A      5   3   1   ?  ...
        B      ?   ?   4   5  ...
        C      1   ?   5   ?  ...
        ...

    Matrix factorization: $\hat{Y} := W H^t$ with $W \in \mathbb{R}^{|U| \times k}$ and $H \in \mathbb{R}^{|I| \times k}$. Element-wise: $\hat{y}(u,i) = \hat{y}_{u,i} = \sum_{f=1}^{k} w_{u,f}\, h_{i,f} = \langle w_u, h_i \rangle$, where $k$ is the rank of the reconstruction.
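The element-wise prediction on this slide is a dot product of one user row and one item row. A minimal sketch follows, assuming made-up factor values for k = 2; the dictionaries `W` and `H` and the helper `predict` are illustrative names, not part of the talk or of any library.

```python
# Hypothetical factors for k = 2; a trained model would learn these from ratings.
W = {"Alice": [1.0, 0.5], "Bob": [0.2, 1.5]}          # user factors w_u
H = {"Titanic": [2.0, 1.0], "Star Trek": [0.5, 3.0]}  # item factors h_i


def predict(user, item):
    """Return <w_u, h_i> = sum over f of w_{u,f} * h_{i,f}."""
    return sum(w_f * h_f for w_f, h_f in zip(W[user], H[item]))


# Bob has no observed rating for Titanic in the example matrix,
# but the factorized model still yields a score for that cell.
print(predict("Alice", "Titanic"))
print(predict("Bob", "Titanic"))
```

The second call illustrates why factorization helps with "?" cells: the score depends only on the two factor vectors, not on the pair having been observed.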
  • 7. Matrix Factorization: Extensions. Example data: the user × movie rating matrix (users A, B, C, ...; movies TI, NH, SW, ST, ...), extended along a time axis for the time-aware models. Examples for models:
      $\hat{y}^{MF}(u,i) := \sum_{f=1}^{k} v_{u,f}\, v_{i,f} = \langle v_u, v_i \rangle$
      $\hat{y}^{SVD++}(u,i) := \langle v_u + \sum_{j \in N(u)} v_j,\; v_i \rangle$
      $\hat{y}^{Fact-KNN}(u,i) := \frac{1}{|R(u)|} \sum_{j \in R(u)} r_{u,j} \langle v_i, v_j \rangle$
      $\hat{y}^{timeSVD}(u,i,t) := \langle v_u + v_{u,t},\; v_i \rangle$
      $\hat{y}^{timeTF}(u,i,t) := \sum_{f=1}^{k} v_{u,f}\, v_{i,f}\, v_{t,f}$
      ...
  • 8. Tensor Factorization. Example data: triples of subject, predicate, object [illustration from Drumond et al. 2012]. Examples for models:
      $\hat{y}^{PARAFAC}(s,p,o) := \sum_{f=1}^{k} v_{s,f}\, v_{p,f}\, v_{o,f}$
      $\hat{y}^{PITF}(s,p,o) := \langle v_s, v_p \rangle + \langle v_s, v_o \rangle + \langle v_p, v_o \rangle$
      ...
  • 9. Sequential Factorization Models. Example data: [Figure: for users 1 to 4, a sequence of item baskets $B_{t-3}$, $B_{t-2}$, $B_{t-1}$, with the contents of basket $B_t$ unknown.] Examples for models:
      $\hat{y}^{FMC}(u,i,t) := \sum_{l \in B_{t-1}} \langle v_i, v_l \rangle$
      $\hat{y}^{FPMC}(u,i,t) := \langle v_u, v_i \rangle + \sum_{l \in B_{t-1}} \langle v_i, v_l \rangle$
      ...
  • 12. Factorization Models: Discussion.
      - Advantages
        - Can estimate interactions between two (or more) variables even if the cross is not observed.
        - E.g. user × movie, current product × next product, user × query × url, ...
      - Downsides
        - Factorization models are usually built specifically for each problem.
        - Learning algorithms and implementations are tailored to individual models.
  • 14. Data and Variable Representation. Many standard ML approaches work with real-valued feature vectors as input. This representation can express, e.g.:
      - any number of variables
      - categorical domains, by using dummy indicator variables
      - numerical domains
      - set-categorical domains, by using dummy indicator variables
    Using this representation makes it possible to apply a wide variety of standard models (e.g. linear regression, SVM, etc.).
  • 15. Linear Regression.
      - Let $x \in \mathbb{R}^p$ be an input vector with $p$ predictor variables.
      - Model equation: $\hat{y}(x) := w_0 + \sum_{i=1}^{p} w_i x_i$
      - Model parameters: $w_0 \in \mathbb{R}$, $w \in \mathbb{R}^p$
    $O(p)$ model parameters.
  • 16. Polynomial Regression.
      - Let $x \in \mathbb{R}^p$ be an input vector with $p$ predictor variables.
      - Model equation (degree 2): $\hat{y}(x) := w_0 + \sum_{i=1}^{p} w_i x_i + \sum_{i=1}^{p} \sum_{j \geq i}^{p} w_{i,j}\, x_i x_j$
      - Model parameters: $w_0 \in \mathbb{R}$, $w \in \mathbb{R}^p$, $W \in \mathbb{R}^{p \times p}$
    $O(p^2)$ model parameters.
  • 20. Representation: Matrix/Tensor vs. Feature Vectors. Matrix/tensor data can be represented by feature vectors. The rating matrix (users A, B, C, ...; movies TI, NH, SW, ST, ...) is first written as a list of observed cases:

        #   User      Movie          Rating
        1   Alice     Titanic        5
        2   Alice     Notting Hill   3
        3   Alice     Star Wars      1
        4   Bob       Star Wars      4
        5   Bob       Star Trek      5
        6   Charlie   Titanic        1
        7   Charlie   Star Wars      5
        ...

    Each case then becomes a feature vector x with indicator variables for the user and the movie, and a target y:

                User          Movie              Target
                A  B  C  ...  TI NH SW ST  ...   y
        x(1)    1  0  0  ...  1  0  0  0   ...   5
        x(2)    1  0  0  ...  0  1  0  0   ...   3
        x(3)    1  0  0  ...  0  0  1  0   ...   1
        x(4)    0  1  0  ...  0  0  1  0   ...   4
        x(5)    0  1  0  ...  0  0  0  1   ...   5
        x(6)    0  0  1  ...  1  0  0  0   ...   1
        x(7)    0  0  1  ...  0  0  1  0   ...   5
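The conversion from rating triples to indicator feature vectors can be sketched in a few lines. This is an illustration only: the seven cases are the ones shown on the slide, while the fixed vocabularies `users` and `movies` and the helper `encode` are assumptions made for the example.

```python
# The seven observed cases from the slide: (user, movie, rating).
ratings = [
    ("Alice", "Titanic", 5),
    ("Alice", "Notting Hill", 3),
    ("Alice", "Star Wars", 1),
    ("Bob", "Star Wars", 4),
    ("Bob", "Star Trek", 5),
    ("Charlie", "Titanic", 1),
    ("Charlie", "Star Wars", 5),
]
# Assumed fixed vocabularies; their order defines the column order.
users = ["Alice", "Bob", "Charlie"]
movies = ["Titanic", "Notting Hill", "Star Wars", "Star Trek"]


def encode(user, movie):
    """One indicator block for the user, followed by one for the movie."""
    x = [0.0] * (len(users) + len(movies))
    x[users.index(user)] = 1.0
    x[len(users) + movies.index(movie)] = 1.0
    return x


X = [encode(u, m) for u, m, _ in ratings]  # feature vectors x(1)..x(7)
y = [r for _, _, r in ratings]             # targets y(1)..y(7)

for row, target in zip(X, y):
    print(row, target)
```

Every row has exactly two non-zero entries, which is the sparsity the later slides rely on.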
  • 24. Application to Sparse Feature Vectors. Applying regression models to the indicator-encoded user/movie feature vectors x(1), ..., x(7) with targets y(1), ..., y(7) leads to:
      Linear regression: $\hat{y}(x) = w_0 + w_u + w_i$
      Polynomial regression: $\hat{y}(x) = w_0 + w_u + w_i + w_{u,i}$
      Matrix factorization: $\hat{y}(u,i) = \langle w_u, h_i \rangle$
  • 32. Application to Sparse Feature Vectors. For the data of the example:
      - Linear regression has no user-item interaction.
        - $\Rightarrow$ Linear regression is not expressive enough.
      - Polynomial regression includes pairwise interactions but cannot estimate them from the data.
        - $n \ll p^2$: the number of cases is much smaller than the number of model parameters.
        - The maximum-likelihood estimator for a pairwise effect is $w_{i,j} = y - w_0 - w_i - w_j$ if $(i,j,y) \in S$, and is not defined otherwise.
        - Polynomial regression cannot generalize to any unobserved pairwise effect.
  • 33. Outline: Factorization Models & Polynomial Regression; Factorization Machines (Model; Examples; Properties; Learning; libFM Software); Applications; Summary.
  • 35. Factorization Machine (FM) [Rendle 2010, Rendle 2012].
      - Let $x \in \mathbb{R}^p$ be an input vector with $p$ predictor variables.
      - Model equation (degree 2): $\hat{y}(x) := w_0 + \sum_{i=1}^{p} w_i x_i + \sum_{i=1}^{p} \sum_{j > i}^{p} \langle v_i, v_j \rangle\, x_i x_j$
      - Model parameters: $w_0 \in \mathbb{R}$, $w \in \mathbb{R}^p$, $V \in \mathbb{R}^{p \times k}$
    Compared to polynomial regression:
      - Model equation (degree 2): $\hat{y}(x) := w_0 + \sum_{i=1}^{p} w_i x_i + \sum_{i=1}^{p} \sum_{j \geq i}^{p} w_{i,j}\, x_i x_j$
      - Model parameters: $w_0 \in \mathbb{R}$, $w \in \mathbb{R}^p$, $W \in \mathbb{R}^{p \times p}$
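A direct translation of the degree-2 FM model equation into code may help when reading the symbols. This is a sketch of the plain double-sum form only, not libFM's implementation; the function name `fm_predict` and the small parameter values are made up for illustration.

```python
def fm_predict(x, w0, w, V):
    """Degree-2 FM: w0 + sum_i w_i x_i + sum_i sum_{j>i} <v_i, v_j> x_i x_j.

    x: p input values, w: p linear weights, V: p rows of k factors each.
    """
    p = len(x)
    y = w0 + sum(w[i] * x[i] for i in range(p))
    for i in range(p):
        for j in range(i + 1, p):
            # The interaction weight is not a free parameter w_{i,j};
            # it is the dot product of the two factor rows.
            dot = sum(a * b for a, b in zip(V[i], V[j]))
            y += dot * x[i] * x[j]
    return y


# Hypothetical parameters with p = 3 and k = 2.
x = [1.0, 2.0, 0.0]
w0 = 0.5
w = [0.1, 0.2, 0.3]
V = [[1.0, 0.0], [0.5, 1.0], [2.0, 2.0]]
print(fm_predict(x, w0, w, V))
```

Replacing `dot` with a lookup into a p × p weight matrix would turn the pairwise part into the polynomial regression form it is compared with on the slide.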
  • 37. Factorization Machine (FM) [Rendle 2010, Rendle 2012].
      - Let $x \in \mathbb{R}^p$ be an input vector with $p$ predictor variables.
      - Model equation (degree 3): $\hat{y}(x) := w_0 + \sum_{i=1}^{p} w_i x_i + \sum_{i=1}^{p} \sum_{j > i}^{p} \langle v_i, v_j \rangle\, x_i x_j + \sum_{i=1}^{p} \sum_{j > i}^{p} \sum_{l > j}^{p} \sum_{f=1}^{k} v^{(3)}_{i,f}\, v^{(3)}_{j,f}\, v^{(3)}_{l,f}\, x_i x_j x_l$
      - Model parameters: $w_0 \in \mathbb{R}$, $w \in \mathbb{R}^p$, $V \in \mathbb{R}^{p \times k}$, $V^{(3)} \in \mathbb{R}^{p \times k}$
  • 38. Factorization Machines: Discussion.
      - FMs work with real-valued input.
      - FMs include variable interactions like polynomial regression.
      - Model parameters for interactions are factorized.
      - Number of model parameters is $O(k\,p)$ (instead of $O(p^2)$ for polynomial regression).
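To make the difference between $O(k\,p)$ and $O(p^2)$ concrete, the parameter counts can be compared for one setting. The sizes below (1,000 users, 500 movies, k = 8) are hypothetical and chosen only for the arithmetic; the counts follow the parameter lists on the model slides ($w_0$, $w$, and either $W \in \mathbb{R}^{p \times p}$ or $V \in \mathbb{R}^{p \times k}$).

```python
def n_params_poly(p):
    """w0 (1) + w (p) + W in R^{p x p} (p * p)."""
    return 1 + p + p * p


def n_params_fm(p, k):
    """w0 (1) + w (p) + V in R^{p x k} (p * k)."""
    return 1 + p + p * k


# Hypothetical sizes: one indicator variable per user and per movie.
p = 1000 + 500
k = 8
print("polynomial regression:", n_params_poly(p))
print("factorization machine:", n_params_fm(p, k))
```

With these sizes the FM has 13,501 parameters against 2,251,501 for degree-2 polynomial regression, and the gap widens linearly in p.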
  • 40. Matrix Factorization and Factorization Machines. Two categorical variables encoded with real-valued predictor variables: each feature vector x(1), ..., x(7) consists of an indicator block for the User (A, B, C, ...) followed by an indicator block for the Movie (TI, NH, SW, ST, ...). With this data, the FM is identical to MF with biases [1]: $\hat{y}(x) = w_0 + w_u + w_i + \langle v_u, v_i \rangle$, where the term $\langle v_u, v_i \rangle$ is the MF part.
    [1] libFM, k = 128, MCMC inference, Netflix RMSE = 0.8937.
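The claimed identity can be checked numerically: with exactly one active user indicator and one active movie indicator, every pairwise term except the user-movie one vanishes. The sketch below uses random parameters and hypothetical sizes (3 users, 4 movies, k = 2); the function names are illustrative.

```python
import random
from itertools import combinations


def fm_pairwise(x, w0, w, V):
    """Degree-2 FM written with explicit pairs (i, j), i < j."""
    y = w0 + sum(wi * xi for wi, xi in zip(w, x))
    for i, j in combinations(range(len(x)), 2):
        y += sum(a * b for a, b in zip(V[i], V[j])) * x[i] * x[j]
    return y


random.seed(0)
n_users, n_movies, k = 3, 4, 2  # hypothetical sizes
p = n_users + n_movies
w0 = random.gauss(0, 1)
w = [random.gauss(0, 1) for _ in range(p)]
V = [[random.gauss(0, 1) for _ in range(k)] for _ in range(p)]


def mf_with_biases(u, i):
    """w0 + w_u + w_i + <v_u, v_i>, reading the same parameters."""
    cu, ci = u, n_users + i  # column positions of the two indicators
    return w0 + w[cu] + w[ci] + sum(a * b for a, b in zip(V[cu], V[ci]))


max_gap = 0.0
for u in range(n_users):
    for i in range(n_movies):
        x = [0.0] * p
        x[u] = 1.0
        x[n_users + i] = 1.0
        max_gap = max(max_gap, abs(fm_pairwise(x, w0, w, V) - mf_with_biases(u, i)))

print(max_gap)  # differs from zero only by floating-point rounding
```

The same check fails as soon as a third block of features is added, which is how the later slides obtain PITF, time-aware models, SVD++ and FPMC from the same equation.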
  • 41. RDF-Triple Prediction with Factorization Machines. Three categorical variables encoded with real-valued predictor variables: each feature vector x(1), ..., x(7) consists of indicator blocks for the Subject (S1, S2, S3, ...), the Predicate (P1, P2, P3, P4, ...) and the Object (O1, O2, O3, O4, ...). With this data, the FM is equivalent to the PITF model: $\hat{y}(x) := w_0 + w_s + w_p + w_o + \langle v_s, v_p \rangle + \langle v_s, v_o \rangle + \langle v_p, v_o \rangle$ [PITF: Rendle et al. 2010, WSDM Best Student Paper, ECML 2009 Best DC Award]
  • 42. Time with Factorization Machines. Two categorical variables and time as a linear predictor: each feature vector x(1), ..., x(7) consists of indicator blocks for the User (A, B, C, ...) and the Movie (TI, NH, SW, ST, ...), plus one real-valued Time column (0.2, 0.6, 0.61, 0.3, 0.5, 0.1, 0.8 for the seven cases). The FM model would correspond to: $\hat{y}(x) := w_0 + w_i + w_u + t\, w_{time} + \langle v_u, v_i \rangle + t\, \langle v_u, v_{time} \rangle + t\, \langle v_i, v_{time} \rangle$
  • 43. Time with Factorization Machines. Two categorical variables and time discretized in bins ($b(t)$): each feature vector x(1), ..., x(7) consists of indicator blocks for the User (A, B, C, ...), the Movie (TI, NH, SW, ST, ...) and the Time bin (T1, T2, T3). The FM model would correspond to [2]: $\hat{y}(x) := w_0 + w_i + w_u + w_{b(t)} + \langle v_u, v_i \rangle + \langle v_u, v_{b(t)} \rangle + \langle v_i, v_{b(t)} \rangle$
    [2] libFM, k = 128, MCMC inference, Netflix RMSE = 0.8873.
  • 44. SVD++
      [Figure: feature vectors x(1), ..., x(7) with one-hot blocks for User (A, B, C, ...) and Movie (TI, NH, SW, ST, ...), plus a block "Other Movies rated" (TI, NH, SW, ST, ...) holding normalized indicator values such as 0.3 and 0.5.]
      With this data, the FM (footnote 3) is identical to:
      $$\hat{y}(x) = \overbrace{w_0 + w_u + w_i + \langle v_u, v_i \rangle + \frac{1}{\sqrt{|N_u|}} \sum_{l \in N_u} \langle v_i, v_l \rangle}^{\text{SVD++}} + \frac{1}{\sqrt{|N_u|}} \sum_{l \in N_u} \left( w_l + \langle v_u, v_l \rangle + \frac{1}{\sqrt{|N_u|}} \sum_{l' \in N_u,\, l' > l} \langle v_l, v_{l'} \rangle \right)$$
      Footnote 3: libFM, k = 128, MCMC inference, Netflix RMSE = 0.8865.
      [Koren, 2008]
  • 45. Factorizing Personalized Markov Chains (FPMC)
      Two categorical variables $(u, i)$, one set-categorical variable $(B_{t-1})$.
      [Figure: feature vectors x(1), ..., x(7) with one-hot blocks for User (u1, u2, u3, ...) and Product (A, B, C, D, ...), plus a block "Last Basket" (A, B, C, D, ...) with normalized indicators such as 0.5. Sequential baskets shown: u1: (A, B), C; u2: C, D; u3: A, C.]
      FM is equivalent to:
      $$\hat{y}(x) := w_0 + w_u + w_i + \langle v_u, v_i \rangle + \frac{1}{|B_{t-1}|} \sum_{j \in B_{t-1}} \langle v_i, v_j \rangle + \ldots$$
      [Rendle et al. 2010, WWW Best Paper]
  • 46. Outline
      • Factorization Models & Polynomial Regression
      • Factorization Machines
          • Model
          • Examples
          • Properties
          • Learning
          • libFM Software
      • Applications
      • Summary
  • 49. Computation Complexity
      Factorization Machine model equation:
      $$\hat{y}(x) := w_0 + \sum_{i=1}^{p} w_i x_i + \sum_{i=1}^{p} \sum_{j>i}^{p} \langle v_i, v_j \rangle\, x_i x_j$$
      • Trivial computation: $O(p^2 k)$
      • Efficient computation can be done in: $O(p\, k)$
      • Making use of many zeros in $x$, even in: $O(N_z(x)\, k)$, where $N_z(x)$ is the number of non-zero elements in vector $x$.
  • 52. Efficient Computation
      The model equation of an FM can be computed in $O(p\, k)$.
      Proof:
      $$\hat{y}(x) := w_0 + \sum_{i=1}^{p} w_i x_i + \sum_{i=1}^{p} \sum_{j>i}^{p} \langle v_i, v_j \rangle\, x_i x_j = w_0 + \sum_{i=1}^{p} w_i x_i + \frac{1}{2} \sum_{f=1}^{k} \left[ \left( \sum_{i=1}^{p} x_i v_{i,f} \right)^2 - \sum_{i=1}^{p} (x_i v_{i,f})^2 \right]$$
      • In the sums over $i$, only non-zero $x_i$ elements have to be summed up $\Rightarrow O(N_z(x)\, k)$.
      • (The complexity of polynomial regression is $O(N_z(x)^2)$.)
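The step behind the proof is the identity $\sum_{i<j} a_i a_j = \frac{1}{2}\left[(\sum_i a_i)^2 - \sum_i a_i^2\right]$, applied per factor $f$ with $a_i = x_i v_{i,f}$. The following sketch checks numerically that the pairwise and the reformulated versions agree. The function names `fm_naive` and `fm_fast` and the random test data are assumptions made for the illustration; this is not libFM's code.

```python
import random


def fm_naive(x, w0, w, V):
    """O(p^2 k): sums <v_i, v_j> x_i x_j over all pairs i < j."""
    p, k = len(x), len(V[0])
    y = w0 + sum(w[i] * x[i] for i in range(p))
    for i in range(p):
        for j in range(i + 1, p):
            y += sum(V[i][f] * V[j][f] for f in range(k)) * x[i] * x[j]
    return y


def fm_fast(x, w0, w, V):
    """Reformulated pairwise term; only non-zero x_i enter the sums.

    x is dense here, so locating the non-zeros still costs O(p);
    a sparse input format would make the whole call O(N_z(x) k).
    """
    k = len(V[0])
    nz = [i for i, xi in enumerate(x) if xi != 0.0]
    y = w0 + sum(w[i] * x[i] for i in nz)
    for f in range(k):
        s = sum(x[i] * V[i][f] for i in nz)            # sum_i x_i v_{i,f}
        s_sq = sum((x[i] * V[i][f]) ** 2 for i in nz)  # sum_i (x_i v_{i,f})^2
        y += 0.5 * (s * s - s_sq)
    return y


random.seed(0)
p, k = 8, 3
w0 = 0.1
w = [random.gauss(0, 1) for _ in range(p)]
V = [[random.gauss(0, 1) for _ in range(k)] for _ in range(p)]
x = [1.0, 0.0, 0.0, 0.5, 0.0, 0.0, 2.0, 0.0]
diff = abs(fm_naive(x, w0, w, V) - fm_fast(x, w0, w, V))
print(diff < 1e-9)
```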
  • 54. Multilinearity
      FMs are multilinear:
      $$\forall \theta \in \Theta = \{w_0, w_1, \ldots, w_p, v_{1,1}, \ldots, v_{p,k}\}: \quad \hat{y}(x, \theta) = h_{(\theta)}(x)\, \theta + g_{(\theta)}(x)$$
      where $g_{(\theta)}$ and $h_{(\theta)}$ do not depend on the value of $\theta$.
      E.g. for second-order effects ($\theta = v_{l,f}$):
      $$\hat{y}(x, v_{l,f}) := \underbrace{w_0 + \sum_{i=1}^{p} w_i x_i + \sum_{i=1}^{p} \sum_{j=i+1}^{p} \sum_{\substack{f'=1 \\ (f' \neq f) \vee (l \notin \{i,j\})}}^{k} v_{i,f'}\, v_{j,f'}\, x_i x_j}_{g_{(v_{l,f})}(x)} + v_{l,f} \underbrace{x_l \sum_{i=1,\, i \neq l}^{p} v_{i,f}\, x_i}_{h_{(v_{l,f})}(x)}$$
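A small numeric check can make the multilinearity property concrete. The sketch below computes $h$ for $\theta = v_{l,f}$ from the slide's formula, obtains $g$ as the remainder, and confirms that the same $h$ and $g$ still reproduce the prediction after $\theta$ is changed. The helper names `fm` and `h_v` and all numbers are assumptions made for the example.

```python
def fm(x, w0, w, V):
    """Second-order FM prediction for a dense input x."""
    y = w0 + sum(wi * xi for wi, xi in zip(w, x))
    for f in range(len(V[0])):
        s = sum(xi * v[f] for xi, v in zip(x, V))
        s_sq = sum((xi * v[f]) ** 2 for xi, v in zip(x, V))
        y += 0.5 * (s * s - s_sq)
    return y


def h_v(x, V, l, f):
    """h for theta = V[l][f]: x_l * sum over i != l of V[i][f] * x_i."""
    return x[l] * sum(V[i][f] * x[i] for i in range(len(x)) if i != l)


x = [1.0, 0.0, 2.0, 0.5]
w0, w = 0.3, [0.1, -0.2, 0.4, 0.0]
V = [[0.5, -1.0], [0.2, 0.3], [-0.7, 0.1], [1.5, 0.6]]
l, f = 2, 0

h = h_v(x, V, l, f)
g = fm(x, w0, w, V) - h * V[l][f]   # everything that is not multiplied by theta

V[l][f] = 3.0                       # change theta; h and g were computed before
residual = abs(fm(x, w0, w, V) - (h * 3.0 + g))
print(residual < 1e-9)
```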
  • 56. Learning
      Using these properties, learning algorithms can be developed:
      • L2-regularized regression and classification:
          • Stochastic gradient descent [Rendle, 2010]
          • Alternating least squares / Coordinate Descent [Rendle et al., 2011, Rendle 2012]
          • Markov Chain Monte Carlo (for Bayesian FMs) [Freudenthaler et al. 2011, Rendle 2012]
      • L2-regularized ranking:
          • Stochastic gradient descent [Rendle, 2010]
      All the proposed learning algorithms have a runtime of $O(k\, N_z(X)\, i)$, where $i$ is the number of iterations and $N_z(X)$ is the number of non-zero elements in the design matrix $X$.
  • 58. Stochastic Gradient Descent (SGD)
      • For each training case $(x, y) \in S$, SGD updates the FM model parameter $\theta$ using:
        $$\theta' = \theta - \alpha \left( (\hat{y}(x) - y)\, h_{(\theta)}(x) + \lambda_{(\theta)}\, \theta \right)$$
      • $\alpha$ is the learning rate / step size.
      • $\lambda_{(\theta)}$ is the regularization value of the parameter $\theta$.
      • SGD can easily be applied to other loss functions.
      [Rendle, 2010]
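The update rule above can be sketched for squared loss as follows. This is a minimal illustration under stated assumptions: inputs are sparse dicts {index: value}, one shared regularization value `lam` is used for every parameter (the slide allows one per parameter), and the names `fm_predict` and `sgd_step` are made up for the example.

```python
def fm_predict(x, w0, w, V):
    """FM prediction for a sparse input x given as {index: value}."""
    y = w0 + sum(w[i] * xi for i, xi in x.items())
    for f in range(len(V[0])):
        s = sum(V[i][f] * xi for i, xi in x.items())
        s_sq = sum((V[i][f] * xi) ** 2 for i, xi in x.items())
        y += 0.5 * (s * s - s_sq)
    return y


def sgd_step(x, y, w0, w, V, alpha, lam):
    """One update on a single case; w and V change in place, new w0 is returned."""
    k = len(V[0])
    err = fm_predict(x, w0, w, V) - y
    # Per-factor sums are cached before any parameter is modified.
    sums = [sum(V[i][f] * xi for i, xi in x.items()) for f in range(k)]
    w0 -= alpha * (err + lam * w0)                      # h = 1 for w0
    for i, xi in x.items():
        w[i] -= alpha * (err * xi + lam * w[i])         # h = x_i for w_i
        for f in range(k):
            h = xi * (sums[f] - V[i][f] * xi)           # h = x_i * sum_{j != i} v_jf x_j
            V[i][f] -= alpha * (err * h + lam * V[i][f])
    return w0


w0, w = 0.0, [0.0] * 4
V = [[0.1, -0.1], [0.05, 0.2], [-0.2, 0.1], [0.1, 0.1]]
x, y = {0: 1.0, 2: 1.0}, 4.0

before = abs(fm_predict(x, w0, w, V) - y)
for _ in range(50):
    w0 = sgd_step(x, y, w0, w, V, alpha=0.05, lam=0.01)
after = abs(fm_predict(x, w0, w, V) - y)
print(after < before)
```

Only parameters that belong to non-zero entries of x are touched, which is where the $O(k\, N_z(x))$ cost per training case comes from.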
  • 59. Coordinate Descent (CD)
      • CD updates each FM model parameter $\theta$ using:
        $$\theta' = \frac{\sum_{(x,y) \in S} \left( y - g_{(\theta)}(x) \right) h_{(\theta)}(x)}{\sum_{(x,y) \in S} h_{(\theta)}^2(x) + \lambda_{(\theta)}}$$
      • Using caches of intermediate results, the runtime for updating all model parameters is $O(k\, N_z(X))$.
      • CD can be extended to classification [Rendle, 2012].
      [Rendle et al., 2011]
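As an illustration of the closed-form update, the sketch below applies it to a single linear weight $w_l$ (so $h = x_l$ and $g = \hat{y} - w_l x_l$) and checks that the regularized squared error does not increase. It is deliberately naive: it recomputes every prediction and omits the caches that give the $O(k\, N_z(X))$ runtime. The names `fm_predict`, `cd_update_w`, `objective` and the toy data are assumptions made for the example.

```python
def fm_predict(x, w0, w, V):
    """FM prediction for a sparse input x given as {index: value}."""
    y = w0 + sum(w[i] * xi for i, xi in x.items())
    for f in range(len(V[0])):
        s = sum(V[i][f] * xi for i, xi in x.items())
        s_sq = sum((V[i][f] * xi) ** 2 for i, xi in x.items())
        y += 0.5 * (s * s - s_sq)
    return y


def cd_update_w(S, w0, w, V, l, lam):
    """Closed-form CD value for theta = w[l], with h = x_l."""
    num, den = 0.0, lam
    for x, y in S:
        h = x.get(l, 0.0)
        g = fm_predict(x, w0, w, V) - w[l] * h   # part of y-hat not involving theta
        num += (y - g) * h
        den += h * h
    return num / den


def objective(S, w0, w, V, l, lam):
    """Squared error plus the L2 penalty on the one parameter being updated."""
    return sum((fm_predict(x, w0, w, V) - y) ** 2 for x, y in S) + lam * w[l] ** 2


S = [({0: 1.0, 2: 1.0}, 4.0), ({1: 1.0, 2: 1.0}, 2.0), ({0: 1.0, 3: 1.0}, 5.0)]
w0, w = 0.0, [0.0] * 4
V = [[0.1, -0.1], [0.05, 0.2], [-0.2, 0.1], [0.1, 0.1]]
l, lam = 2, 0.1

before = objective(S, w0, w, V, l, lam)
w[l] = cd_update_w(S, w0, w, V, l, lam)
after = objective(S, w0, w, V, l, lam)
print(after <= before)
```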
  • 61. Gibbs Sampling (MCMC)
      • Gibbs sampling with a block for each FM model parameter $\theta$:
        $$\theta \mid S, \Theta \setminus \{\theta\} \sim \mathcal{N}\left( \frac{\sum_{(x,y) \in S} \left( y - g_{(\theta)}(x) \right) h_{(\theta)}(x)}{\sum_{(x,y) \in S} h_{(\theta)}^2(x) + \lambda_{(\theta)}},\; \frac{1}{\sum_{(x,y) \in S} h_{(\theta)}^2(x) + \lambda_{(\theta)}} \right)$$
      • Mean is the same as for CD $\Rightarrow$ computational complexity is also $O(k\, N_z(X))$.
      • MCMC can be extended to classification using link functions.
      [Freudenthaler et al. 2011, Rendle 2012]
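A single Gibbs draw in the simplified form shown on the slide could look like the sketch below. It assumes the per-case values of $h$, $g$ and $y$ are already available as lists; the function name `gibbs_draw` and the numbers are made up for the illustration, and the full Bayesian FM of Freudenthaler et al. has additional terms (priors and hyperpriors) that are not modeled here.

```python
import math
import random


def gibbs_draw(hs, gs, ys, lam, rng):
    """Sample theta from the normal conditional; also return its mean and variance."""
    den = sum(h * h for h in hs) + lam
    mean = sum((y - g) * h for h, g, y in zip(hs, gs, ys)) / den   # same as the CD update
    var = 1.0 / den
    return rng.gauss(mean, math.sqrt(var)), mean, var


rng = random.Random(0)
hs = [1.0, 1.0, 0.0]        # h(x) per training case
gs = [0.5, 1.0, 2.0]        # g(x) per training case
ys = [4.0, 2.0, 5.0]        # targets
theta, mean, var = gibbs_draw(hs, gs, ys, lam=0.1, rng=rng)
print(mean, var)
```

The only difference to coordinate descent is that the new value is drawn around the CD solution instead of being set to it.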
  • 63. Learning Regularization Values
      [Figure: two graphical models over the targets y_i, inputs x_ij (i = 1, ..., n; j = 1, ..., p) and the parameters w_0, w_j, v_j. Left: standard FM with priors. Right: two-level FM with hyperpriors.]
      [Freudenthaler et al., 2011]
  • 65. libFM Software
      libFM is an implementation of FMs.
      • Model: second-order FMs
      • Learning / inference: SGD, ALS, MCMC
      • Classification and regression
      • Uses the same data format as LIBSVM, LIBLINEAR [Lin et al.], SVMlight [Joachims].
      • Supports variable grouping.
      • Open source: GPLv3.
      [http://www.libfm.org/]
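In the LIBSVM-style text format, each training case is one line: the target followed by `index:value` pairs for the non-zero features. The sketch below writes such a line from a sparse dict. The helper name `to_libsvm_line` is made up for the example, and the use of 0-based indices is an assumption; the index base expected by a given tool should be checked in its documentation.

```python
def to_libsvm_line(y, x):
    """Format one case as '<target> <index>:<value> ...' with ascending indices."""
    feats = " ".join(f"{i}:{v:g}" for i, v in sorted(x.items()))
    return f"{y:g} {feats}"


# Rating 5 by user B (index 1) for movie SW (index 5) in time bin T2 (index 8).
line = to_libsvm_line(5, {0 + 1: 1, 5: 1, 8: 1})
print(line)
```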
  • 67. Outline
      • Factorization Models & Polynomial Regression
      • Factorization Machines
      • Applications
          • Recommender Systems
          • Link Prediction in Social Networks
          • Clickthrough Prediction
          • Personalized Ranking
          • Student Performance Prediction
          • Kaggle Competitions
      • Summary
  • 68. (Context-aware) Rating Prediction
      • Main variables:
          • User ID (categorical)
          • Item ID (categorical)
      • Additional variables:
          • time
          • mood
          • user profile
          • item meta data
          • ...
      • Examples: Netflix prize, Movielens, KDDCup 2011
      [Figure: icons for Song, User, Time and Mood combined into one prediction.]
  • 70. Netflix Prize
      [Figure: "Netflix Prize: Prediction Error". RMS error on the public leaderboard (axis 0.86 to 0.90) for FMs with predictor variables (user, movie); (user, movie, day); (user, movie, impl.); (user, movie, day, impl.); (user, movie, day, impl., freq, lin. day), with annotations for SGD Matrix Factorization and the $1M Prize.]
      • k = 128 factors, 512 MCMC samples (no burn-in phase, random initialization)
      • MCMC inference (no hyperparameters such as learning rate or regularization to specify)
  • 71. Netflix Prize
      | Method (Name) | Ref. | Learning Method | k | Quiz RMSE |
      |---|---|---|---|---|
      | Models using user ID and item ID | | | | |
      | Probabilistic Matrix Factorization | [14, 13] | Batch GD | 40 | *0.9170 |
      | Probabilistic Matrix Factorization | [14, 13] | Batch GD | 150 | 0.9211 |
      | Matrix Factorization | [6] | Variational Bayes | 30 | *0.9141 |
      | Matchbox | [15] | Variational Bayes | 50 | *0.9100 |
      | ALS-MF | [7] | ALS | 100 | 0.9079 |
      | ALS-MF | [7] | ALS | 1000 | *0.9018 |
      | SVD/MF | [3] | SGD | 100 | 0.9025 |
      | SVD/MF | [3] | SGD | 200 | *0.9009 |
      | Bayesian Probabilistic Matrix Factorization (BPMF) | [13] | MCMC | 150 | 0.8965 |
      | Bayesian Probabilistic Matrix Factorization (BPMF) | [13] | MCMC | 300 | *0.8954 |
      | FM, pred. var: user ID, movie ID | - | MCMC | 128 | 0.8937 |
      | Models using implicit feedback | | | | |
      | Probabilistic Matrix Factorization with Constraints | [14] | Batch GD | 30 | *0.9016 |
      | SVD++ | [3] | SGD | 100 | 0.8924 |
      | SVD++ | [3] | SGD | 200 | *0.8911 |
      | BSRM/F | [18] | MCMC | 100 | 0.8926 |
      | BSRM/F | [18] | MCMC | 400 | *0.8874 |
      | FM, pred. var: user ID, movie ID, impl. | - | MCMC | 128 | 0.8865 |
  • 72. Netflix Prize
      | Method (Name) | Ref. | Learning Method | k | Quiz RMSE |
      |---|---|---|---|---|
      | Models using time information | | | | |
      | Bayesian Probabilistic Tensor Factorization (BPTF) | [17] | MCMC | 30 | *0.9044 |
      | FM, pred. var: user ID, movie ID, day | - | MCMC | 128 | 0.8873 |
      | Models using time and implicit feedback | | | | |
      | timeSVD++ | [5] | SGD | 100 | 0.8805 |
      | timeSVD++ | [5] | SGD | 200 | *0.8799 |
      | FM, pred. var: user ID, movie ID, day, impl. | - | MCMC | 128 | 0.8809 |
      | FM, pred. var: user ID, movie ID, day, impl. | - | MCMC | 256 | 0.8794 |
      | Assorted models | | | | |
      | BRISMF/UM NB corrected | [16] | SGD | 1000 | *0.8904 |
      | BMFSI plus side information | [8] | MCMC | 100 | *0.8875 |
      | timeSVD++ plus frequencies | [4] | SGD | 200 | 0.8777 |
      | timeSVD++ plus frequencies | [4] | SGD | 2000 | *0.8762 |
      | FM, pred. var: user ID, movie ID, day, impl., freq., lin. day | - | MCMC | 128 | 0.8779 |
      | FM, pred. var: user ID, movie ID, day, impl., freq., lin. day | - | MCMC | 256 | 0.8771 |
  • 74. Link Prediction in Social Networks
      • Main variables:
          • Actor A ID
          • Actor B ID
      • Additional variables:
          • profiles
          • actions
          • ...
      [Figure: icons for Actor A and Actor B combined into one prediction.]
  • 76. KDDCup 2012: Track 1
      [Figure: "KDDCup 2012 Track 1: Prediction Quality". Mean Average Precision @3 (axis 0.32 to 0.42) on the public and private leaderboards for the feature sets none; gender, age, ...; keywords; friends; all, with markers for Top 1, Top 5, Top 10 and Top 100.]
      • k = 22 factors, 512 MCMC samples (no burn-in phase, random initialization)
      • MCMC inference (no hyperparameters such as learning rate or regularization to specify)
      [Awarded 2nd place (out of 658 teams)]
  • 78. Clickthrough Prediction
      • Main variables:
          • User ID
          • Query ID
          • Ad / Link ID
      • Additional variables:
          • query tokens
          • user profile
          • ...
      [Figure: icons for User, Query (keyword...) and Ad / Link (Link 1, Link 2, Link 3) combined into one prediction.]
  • 80. KDDCup 2012: Track 2
      | Model | Inference | wAUC (public) | wAUC (private) |
      |---|---|---|---|
      | ID-based model (k = 0) | SGD | 0.78050 | 0.78086 |
      | Attribute-based model (k = 8) | MCMC | 0.77409 | 0.77555 |
      | Mixed model (k = 8) | SGD | 0.79011 | 0.79321 |
      | Final ensemble | n/a | 0.79857 | 0.80178 |
      Ensemble:
      • Rank positions (not predicted clickthrough rates) are used.
      • The MCMC attribute-based model and different variations of the SGD models are included.
      [Awarded 3rd place (out of 171 teams)]
  • 82. ECML/PKDD Discovery Challenge 2013
      • Problem: Recommend given names.
      • Main variables:
          • User ID
          • Name ID
      • Additional variables:
          • session info
          • string representation for each name
          • ...
      • FM approach won 1st place (online track) and 2nd place (offline track).
  • 84. Student Performance Prediction
      • Main variables:
          • Student ID
          • Question ID
      • Additional variables:
          • question hierarchy
          • sequence of questions
          • skills required
          • ...
      • Examples: KDDCup 2010, Grockit Challenge (footnote 4) (FM placed 1st/241)
      [Figure: icons for Student and Question combined into one prediction.]
      Footnote 4: http://www.kaggle.com/c/WhatDoYouKnow
  • 86. Kaggle Competitions
      FMs have been successfully applied to several Kaggle competitions:
      • Criteo Display Advertising Challenge: 1st place (team '3 idiots').
      • Blue Book for Bulldozers: 1st place (team 'Leustagos Titericz').
      • EMI Music Data Science Hackathon: 2nd place (team 'lns').
  • 87. Summary
      • Factorization machines combine linear/polynomial regression with factorization models.
      • Feature interactions are learned with a low-rank representation.
      • Estimation of unobserved interactions is possible.
      • Factorization machines can be computed efficiently and have high prediction quality.
  • 88. References
      [1] L. Drumond, S. Rendle, and L. Schmidt-Thieme. Predicting RDF triples in incomplete knowledge bases with tensor factorization. In Proceedings of the 27th Annual ACM Symposium on Applied Computing, SAC '12, pages 326-331, New York, NY, USA, 2012. ACM.
      [2] C. Freudenthaler, L. Schmidt-Thieme, and S. Rendle. Bayesian factorization machines. In NIPS Workshop on Sparse Representation and Low-rank Approximation, 2011.
      [3] Y. Koren. Factorization meets the neighborhood: a multifaceted collaborative filtering model. In KDD '08: Proceeding of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 426-434, New York, NY, USA, 2008. ACM.
      [4] Y. Koren. The BellKor solution to the Netflix Grand Prize. 2009.
      [5] Y. Koren. Collaborative filtering with temporal dynamics. In KDD '09: Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 447-456, New York, NY, USA, 2009. ACM.
      [6] Y. J. Lim and Y. W. Teh. Variational Bayesian approach to movie rating prediction. In Proceedings of KDD Cup and Workshop, 2007.
      [7] I. Pilaszy, D. Zibriczky, and D. Tikk. Fast ALS-based matrix factorization for explicit and implicit feedback datasets. In RecSys '10: Proceedings of the Fourth ACM Conference on Recommender Systems, pages 71-78, New York, NY, USA, 2010. ACM.
      [8] I. Porteous, A. Asuncion, and M. Welling. Bayesian matrix factorization with side information and Dirichlet process mixtures. In Proceedings of the Twenty-Fourth AAAI Conference on Artificial Intelligence, AAAI 2010, pages 563-568, 2010.
      [9] S. Rendle. Factorization machines. In Proceedings of the 2010 IEEE International Conference on Data Mining, ICDM '10, pages 995-1000, Washington, DC, USA, 2010. IEEE Computer Society.
      [10] S. Rendle. Factorization machines with libFM. ACM Trans. Intell. Syst. Technol., 3(3):57:1-57:22, May 2012.
      [11] S. Rendle, C. Freudenthaler, and L. Schmidt-Thieme. Factorizing personalized Markov chains for next-basket recommendation. In WWW '10: Proceedings of the 19th International Conference on World Wide Web, pages 811-820, New York, NY, USA, 2010. ACM.
      [12] S. Rendle, Z. Gantner, C. Freudenthaler, and L. Schmidt-Thieme. Fast context-aware recommendations with factorization machines. In Proceedings of the 34th ACM SIGIR Conference on Research and Development in Information Retrieval. ACM, 2011.
      [13] R. Salakhutdinov and A. Mnih. Bayesian probabilistic matrix factorization using Markov chain Monte Carlo. In Proceedings of the 25th International Conference on Machine Learning, ICML '08, pages 880-887, New York, NY, USA, 2008. ACM.
      [14] R. Salakhutdinov and A. Mnih. Probabilistic matrix factorization. In J. Platt, D. Koller, Y. Singer, and S. Roweis, editors, Advances in Neural Information Processing Systems 20, pages 1257-1264, Cambridge, MA, 2008. MIT Press.
      [15] D. H. Stern, R. Herbrich, and T. Graepel. Matchbox: large scale online Bayesian recommendations. In Proceedings of the 18th International Conference on World Wide Web, WWW '09, pages 111-120, New York, NY, USA, 2009. ACM.
      [16] G. Takacs, I. Pilaszy, B. Nemeth, and D. Tikk. Scalable collaborative filtering approaches for large recommender systems. J. Mach. Learn. Res., 10:623-656, June 2009.
      [17] L. Xiong, X. Chen, T.-K. Huang, J. Schneider, and J. G. Carbonell. Temporal collaborative filtering with Bayesian probabilistic tensor factorization. In Proceedings of the SIAM International Conference on Data Mining, pages 211-222. SIAM, 2010.
      [18] S. Zhu, K. Yu, and Y. Gong. Stochastic relational models for large-scale dyadic data using MCMC. In D. Koller, D. Schuurmans, Y. Bengio, and L. Bottou, editors, Advances in Neural Information Processing Systems 21, pages 1993-2000, 2009.