Matrix Factorizations for Recommender Systems
Dmitriy Selivanov
selivanov.dmitriy@gmail.com
2017-11-16
Recommender systems are everywhere
Figures 1-4
Goals
Propose “relevant” items to customers
Retention
Exploration
Up-sale
Personalized offers
recommended items for a customer given their history of activities (transactions, browsing history, favourites)
Similar items
substitutions
bundles - frequently bought together
. . .
Live demo
Dataset - LastFM-360K:
360k users
160k artists
17M observations
sparsity - about 0.9997 (17M observations out of 360k × 160k possible user-artist pairs)
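Sparsity here is the share of user-artist pairs with no observation. A minimal sketch of that arithmetic, using the rounded counts quoted above (the function name is illustrative):

```python
def sparsity(n_users: int, n_items: int, n_observed: int) -> float:
    """Fraction of user-item cells with no recorded interaction."""
    return 1.0 - n_observed / (n_users * n_items)


# Rounded LastFM-360K counts quoted on the slide.
lastfm_sparsity = sparsity(360_000, 160_000, 17_000_000)
print(f"{lastfm_sparsity:.4f}")
```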
Explicit feedback
Ratings, likes/dislikes, purchases:
cleaner data
smaller
hard to collect
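$$\mathrm{RMSE}^2 = \frac{1}{|D|} \sum_{(u,i) \in D} (r_{ui} - \hat{r}_{ui})^2$$
where D is the set of observed (user, item) ratings.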
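As a small illustration of this metric, the sketch below computes RMSE over observed (user, item) pairs only. The toy ratings, predictions and the `rmse` helper are hypothetical.

```python
import math


def rmse(observed: dict, predicted: dict) -> float:
    """RMSE over the observed (user, item) pairs only."""
    squared_errors = [(r - predicted[key]) ** 2 for key, r in observed.items()]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical toy ratings and predictions keyed by (user, item).
ratings = {(0, 0): 5.0, (0, 2): 3.0, (1, 1): 4.0}
predictions = {(0, 0): 4.0, (0, 2): 3.0, (1, 1): 2.0}
toy_rmse = rmse(ratings, predictions)
```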
Netflix prize
~ 480k users, 18k movies, 100m ratings
sparsity ~ 99% (100m ratings out of 480k × 18k ≈ 8.6 billion possible user-movie pairs)
goal is to reduce RMSE by 10% - from 0.9514 to 0.8563
Implicit feedback
noisy feedback (clicks, likes, purchases, searches, ...)
much easier to collect
wider user/item coverage
usually sparsity > 99.9%
One-Class Collaborative Filtering
observed entries are positive preferences
they should have high confidence
missing entries in the matrix are a mix of negative and positive preferences
consider them as negative with low confidence
we cannot tell whether a user did not click a banner because of a lack of interest or a lack of awareness
Evaluation
Recap: we only care about producing a small set of highly relevant items.
RMSE is a bad metric - it has a very weak connection to business goals.
We are only interested in the precision of the retrieved items:
space on the screen is limited
only order matters - the most relevant items should be at the top
Ranking - Mean average precision
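$$\mathrm{AveragePrecision} = \frac{\sum_{k=1}^{n} P(k) \times rel(k)}{\text{number of relevant documents}}$$
where P(k) is the precision of the top k items and rel(k) is 1 if the item at rank k is relevant and 0 otherwise.
##    index relevant precision_at_k
## 1:     1        0      0.0000000
## 2:     2        0      0.0000000
## 3:     3        1      0.3333333
## 4:     4        0      0.2500000
## 5:     5        0      0.2000000
map@5 = 0.1566667 (the mean of the precision_at_k column; with a single relevant item, the formula above gives 1/3)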
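A minimal sketch of the average precision formula applied to the five-item ranking in the table. The function names are illustrative, and the number of relevant documents is assumed to be the count of relevant items inside the ranked list. For comparison, the sketch also computes the plain mean of the precision_at_k column.

```python
def precision_at_k(relevant: list, k: int) -> float:
    """Share of relevant items among the top-k positions."""
    return sum(relevant[:k]) / k


def average_precision(relevant: list) -> float:
    """Sum of P(k) * rel(k), divided by the number of relevant items."""
    n_relevant = sum(relevant)
    if n_relevant == 0:
        return 0.0
    hits = sum(precision_at_k(relevant, k) * rel
               for k, rel in enumerate(relevant, start=1))
    return hits / n_relevant


# Relevance flags of the five ranked items from the table.
ranking = [0, 0, 1, 0, 0]
ap_at_5 = average_precision(ranking)
mean_precision = sum(precision_at_k(ranking, k) for k in range(1, 6)) / 5
```

MAP is then the mean of `average_precision` over all users.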
Ranking - Normalized Discounted Cumulative Gain
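The intuition is the same as for MAP@K, but nDCG also takes the graded relevance value of each item into account:
$$\mathrm{DCG}_p = \sum_{i=1}^{p} \frac{2^{rel_i} - 1}{\log_2(i + 1)}$$
$$\mathrm{nDCG}_p = \frac{\mathrm{DCG}_p}{\mathrm{IDCG}_p}$$
$$\mathrm{IDCG}_p = \sum_{i=1}^{|REL|} \frac{2^{rel_i} - 1}{\log_2(i + 1)}$$
where REL is the list of relevant items, ordered by relevance, up to position p.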
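The three formulas can be sketched as follows. The graded relevances are hypothetical, and the ideal ordering is assumed to be the same list sorted by relevance.

```python
import math


def dcg(relevances: list) -> float:
    """Discounted cumulative gain of a ranked list of graded relevances."""
    return sum((2 ** rel - 1) / math.log2(i + 1)
               for i, rel in enumerate(relevances, start=1))


def ndcg(relevances: list) -> float:
    """DCG divided by the DCG of the ideal (descending) ordering."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0


# Hypothetical graded relevances of a ranked list; the best item is shown second.
ranked = [1, 3, 0, 2]
score = ndcg(ranked)
```

A perfectly ordered list scores 1, and any misordering pushes the score below 1.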
Approaches
Content based
good for cold start
not personalized
Collaborative filtering
vanilla collaborative filtering
matrix factorizations
. . .
Hybrid and context aware recommender systems
best of both worlds
Focus today
WRMF (Weighted Regularized Matrix Factorization) - “Collaborative Filtering for Implicit Feedback Datasets” (2008)
efficient learning with accelerated approximate Alternating Least Squares
inference time
Linear-Flow - “Practical Linear Models for Large-Scale One-Class Collaborative Filtering” (2016)
efficient truncated SVD
cheap cross-validation over the full regularization path
Matrix Factorizations
Users can be described by a small number of latent factors $p_{uk}$
Items can be described by a small number of latent factors $q_{ki}$
Sparse data
[Figure: users × items matrix]
Low rank matrix factorization
$$R \approx P \times Q$$
where P is a users × factors matrix and Q is a factors × items matrix.
Reconstruction
[Figure: two users × items matrices]
Truncated SVD
Take the k largest singular values:
$$X \approx U_k D_k V_k^T$$
- $X \in \mathbb{R}^{m \times n}$
- $U_k$, $V_k$ - columns are orthonormal bases (dot product of any 2 distinct columns is zero, each column has unit norm)
- $D_k$ - diagonal matrix with the k largest singular values on the diagonal
Truncated SVD is the best rank k approximation of the matrix X in terms of Frobenius norm:
$$\|X - U_k D_k V_k^T\|_F$$
$$P = U_k \sqrt{D_k}, \quad Q = \sqrt{D_k} V_k^T$$
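A minimal NumPy sketch of the truncation on a hypothetical small dense matrix. It assumes the singular values are split as square roots between P and Q, so that their product reproduces the rank-k approximation. Real interaction matrices are sparse and would need a sparse solver instead of a full SVD.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((6, 5))          # hypothetical small dense matrix
k = 2

# Full SVD; singular values come back sorted in descending order.
U, d, Vt = np.linalg.svd(X, full_matrices=False)
U_k, d_k, Vt_k = U[:, :k], d[:k], Vt[:k, :]

# Split the singular values evenly between the two factor matrices.
P = U_k * np.sqrt(d_k)              # users x factors
Q = np.sqrt(d_k)[:, None] * Vt_k    # factors x items
X_k = P @ Q                         # rank-k reconstruction

# The Frobenius error equals the norm of the discarded singular values.
error = np.linalg.norm(X - X_k, "fro")
```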
Issue with truncated SVD for “explicit” feedback
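Truncated SVD is optimal in terms of Frobenius norm, so it takes the zeros (missing ratings) into account:
$$\mathrm{RMSE} = \sqrt{\frac{1}{|users| \times |items|} \sum_{u \in users,\, i \in items} (r_{ui} - \hat{r}_{ui})^2}$$
Overfits data
Objective = error only in “observed” ratings:
$$\mathrm{RMSE} = \sqrt{\frac{1}{|Observed|} \sum_{(u,i) \in Observed} (r_{ui} - \hat{r}_{ui})^2}$$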
SVD-like matrix factorization with ALS
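$$J = \sum_{(u,i) \in Observed} (r_{ui} - p_u q_i)^2 + \lambda (\|Q\|^2 + \|P\|^2)$$
Given Q fixed, solve for each $p_u$ over the items observed for user u:
$$\min_{p_u} \sum_{i \in Observed_u} (r_{ui} - p_u q_i)^2 + \lambda \sum_{j=1}^{k} p_{uj}^2$$
Given P fixed, solve for each $q_i$ over the users observed for item i:
$$\min_{q_i} \sum_{u \in Observed_i} (r_{ui} - p_u q_i)^2 + \lambda \sum_{j=1}^{k} q_{ji}^2$$
Each subproblem is a ridge regression: $P = (Q^T Q + \lambda I)^{-1} Q^T r$, $Q = (P^T P + \lambda I)^{-1} P^T r$, where the design matrix holds the factor vectors of the observed items (or users) as rows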
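The alternating ridge regressions can be sketched as below, on hypothetical toy ratings with a random observation mask. The helper names are illustrative, and Q is stored as items × factors (the transpose of the layout above) so that the same step function serves both sides.

```python
import numpy as np


def als_step(R, mask, fixed, lam):
    """Ridge-regress each row of R on `fixed`, using only observed entries.

    R: (n_rows, n_cols) ratings; mask: same shape, True where observed;
    fixed: (n_cols, k) factor vectors held constant. Returns (n_rows, k).
    """
    k = fixed.shape[1]
    out = np.zeros((R.shape[0], k))
    for row in range(R.shape[0]):
        obs = mask[row]
        F = fixed[obs]                       # factors of the observed columns
        A = F.T @ F + lam * np.eye(k)
        out[row] = np.linalg.solve(A, F.T @ R[row, obs])
    return out


rng = np.random.default_rng(0)
R = rng.integers(1, 6, size=(8, 6)).astype(float)   # hypothetical ratings 1..5
mask = rng.random(R.shape) < 0.6                     # hypothetical observed entries
k, lam = 2, 0.1
P = rng.normal(scale=0.1, size=(R.shape[0], k))      # user factors
Q = rng.normal(scale=0.1, size=(R.shape[1], k))      # item factors (items x k)


def objective(P, Q):
    """Squared error on observed entries plus the L2 penalty."""
    err = (R - P @ Q.T)[mask]
    return np.sum(err ** 2) + lam * (np.sum(P ** 2) + np.sum(Q ** 2))


losses = []
for _ in range(10):
    P = als_step(R, mask, Q, lam)        # Q fixed, solve for P
    Q = als_step(R.T, mask.T, P, lam)    # P fixed, solve for Q
    losses.append(objective(P, Q))
```

Because each half-step minimizes the objective exactly over one factor matrix, the recorded losses never increase.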
“Collaborative Filtering for Implicit Feedback Datasets”
WRMF - Weighted Regularized Matrix Factorization
“Default” approach
Proposed in 2008, but still widely used in industry (even at YouTube)
several high-quality open-source implementations
$$J = \sum_{u,i} c_{ui} (p_{ui} - x_u^T y_i)^2 + \lambda (\|X\|_F^2 + \|Y\|_F^2)$$
Preferences - binary
$$p_{ui} = \begin{cases} 1 & \text{if } r_{ui} > 0 \\ 0 & \text{otherwise} \end{cases}$$
Confidence - $c_{ui} = 1 + f(r_{ui})$
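To make the preference/confidence split concrete, here is a small sketch on hypothetical play counts. The slide leaves f unspecified; the linear choice f(r) = alpha * r and the value of alpha are assumptions made for illustration.

```python
ALPHA = 40.0  # assumed scaling constant; the slide does not fix f or alpha


def preference(r: float) -> int:
    """Binary preference: 1 if any interaction was observed, else 0."""
    return 1 if r > 0 else 0


def confidence(r: float, alpha: float = ALPHA) -> float:
    """Confidence 1 + f(r), with the assumed linear f(r) = alpha * r."""
    return 1.0 + alpha * r


# Hypothetical play counts for one user over five artists.
plays = [0, 3, 0, 1, 12]
p_u = [preference(r) for r in plays]
c_u = [confidence(r) for r in plays]
```

Unobserved pairs keep preference 0 with the minimal confidence 1, which is how missing entries are treated as weak negatives.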
Alternating Least Squares for implicit feedback
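For fixed Y:
$$\frac{\partial J}{\partial x_u} = -2 \sum_{i \in items} c_{ui} (p_{ui} - x_u^T y_i) y_i + 2 \lambda x_u$$
$$= -2 \sum_{i \in items} c_{ui} (p_{ui} - y_i^T x_u) y_i + 2 \lambda x_u$$
$$= -2 Y^T C^u p(u) + 2 Y^T C^u Y x_u + 2 \lambda x_u$$
where $C^u$ is the diagonal matrix with the confidences $c_{ui}$ on its diagonal and $p(u)$ is the vector of preferences $p_{ui}$ of user u.
Setting $\partial J / \partial x_u = 0$ for the optimal solution gives $(Y^T C^u Y + \lambda I) x_u = Y^T C^u p(u)$
$x_u$ can be obtained by solving a system of linear equations:
$$x_u = \mathrm{solve}(Y^T C^u Y + \lambda I,\; Y^T C^u p(u))$$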
Alternating Least Squares for implicit feedback
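Similarly for fixed X:
$$\frac{\partial J}{\partial y_i} = -2 X^T C^i p(i) + 2 X^T C^i X y_i + 2 \lambda y_i$$
$$y_i = \mathrm{solve}(X^T C^i X + \lambda I,\; X^T C^i p(i))$$
Another optimization:
$$X^T C^i X = X^T X + X^T (C^i - I) X$$
$$Y^T C^u Y = Y^T Y + Y^T (C^u - I) Y$$
$X^T X$ and $Y^T Y$ can be precomputed once per sweep, and $C^i - I$ and $C^u - I$ are nonzero only for observed interactions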
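A sketch of one user update that uses the precomputation trick, so that only the rows of Y for items the user interacted with are touched. The factor matrix, item ids and confidences are hypothetical, and `update_user` is an illustrative name.

```python
import numpy as np


def update_user(Y, YtY, item_ids, conf, lam):
    """Solve (Y^T C^u Y + lam I) x_u = Y^T C^u p(u) for one user.

    Y: (n_items, k) item factors; YtY: precomputed Y^T Y;
    item_ids: items the user interacted with; conf: their confidences c_ui.
    """
    k = Y.shape[1]
    Y_u = Y[item_ids]                       # only rows where c_ui differs from 1
    # Y^T C^u Y = Y^T Y + Y^T (C^u - I) Y, and C^u - I is zero outside item_ids.
    A = YtY + (Y_u.T * (conf - 1.0)) @ Y_u + lam * np.eye(k)
    # p(u) is 1 on item_ids and 0 elsewhere, so Y^T C^u p(u) sums c_ui * y_i.
    b = Y_u.T @ conf
    return np.linalg.solve(A, b)


rng = np.random.default_rng(0)
n_items, k, lam = 50, 4, 0.1
Y = rng.normal(size=(n_items, k))          # hypothetical item factors
item_ids = np.array([2, 7, 19])            # hypothetical interactions of one user
conf = np.array([5.0, 2.0, 11.0])          # hypothetical confidences c_ui

x_u = update_user(Y, Y.T @ Y, item_ids, conf, lam)
```

The item update is symmetric: swap the roles of X and Y and iterate over the users who interacted with each item.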
Accelerated Approximate Alternating Least Squares
$$y_i = \mathrm{solve}(X^T C^i X + \lambda I,\; X^T C^i p(i))$$
Instead of an exact solve, use iterative methods:
Conjugate Gradient
Coordinate Descent
A fixed number of steps is used (usually 3-4 is enough)
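For illustration, the textbook conjugate gradient iteration with a fixed step budget can be sketched in plain Python. The 3 × 3 system is a hypothetical stand-in for $(X^T C^i X + \lambda I) y_i = X^T C^i p(i)$; this is not the code of any particular library.

```python
def matvec(A, v):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def conjugate_gradient(A, b, x0, n_steps=3):
    """Run a fixed number of CG steps on A x = b (A symmetric positive definite)."""
    x = list(x0)
    r = [bi - ai for bi, ai in zip(b, matvec(A, x))]   # residual
    p = list(r)                                         # search direction
    rs_old = dot(r, r)
    for _ in range(n_steps):
        if rs_old < 1e-20:          # already converged
            break
        Ap = matvec(A, p)
        alpha = rs_old / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


# Hypothetical symmetric positive definite system.
A = [[4.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 2.0]]
b = [1.0, 2.0, 3.0]
y_i = conjugate_gradient(A, b, x0=[0.0, 0.0, 0.0], n_steps=3)
```

Inside ALS, starting from the previous value of the embedding rather than from zeros is a natural warm start, which is one reason a few steps can be enough.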
Inference time
How to make recommendations for new users?
There are no user embeddings since these users are not in the original matrix!
Make one ALS step with the item embeddings matrix fixed => get new user embeddings:
given Y fixed and $C^{u_{new}}$, the confidence of the new user-item interactions
$$x_{u_{new}} = \mathrm{solve}(Y^T C^{u_{new}} Y + \lambda I,\; Y^T C^{u_{new}} p(u_{new}))$$
$$scores = X_{new} Y^T$$
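A sketch of this fold-in step for a single new user, followed by top-N scoring. The trained item factors and the interactions are hypothetical, and excluding already seen items from the ranking is an added assumption rather than part of the slide.

```python
import numpy as np


def fold_in_user(Y, item_ids, conf, lam):
    """One ALS step for an unseen user with the item factors Y held fixed."""
    k = Y.shape[1]
    Y_u = Y[item_ids]
    A = Y.T @ Y + (Y_u.T * (conf - 1.0)) @ Y_u + lam * np.eye(k)
    return np.linalg.solve(A, Y_u.T @ conf)


def top_n(Y, x_new, seen, n):
    """Rank items by score x_new . y_i, skipping items already seen."""
    scores = Y @ x_new
    scores[seen] = -np.inf
    return np.argsort(-scores)[:n]


rng = np.random.default_rng(1)
Y = rng.normal(size=(30, 4))               # hypothetical trained item factors
seen = np.array([0, 5, 9])                 # hypothetical interactions of the new user
conf = np.array([3.0, 8.0, 2.0])           # hypothetical confidences
x_new = fold_in_user(Y, seen, conf, lam=0.1)
recommended = top_n(Y, x_new, seen, n=5)
```

No retraining is needed: the cost is one k × k solve plus one matrix-vector product per new user.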
WRMF Implementations
python implicit - implements Conjugate Gradient. With GPU support recently!
R reco - implements Conjugate Gradient
Spark ALS
Quora qmf
Google tensorflow
Linear-Flow
The idea is to learn an item-item similarity matrix W from the data.
First, minimize
$$J = \|X - X W_k\|_F^2 + \lambda \|W_k\|_F^2$$
with the constraint:
$$rank(W) \le k$$
Linear-Flow observations
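1. Without L2 regularization the optimal solution is $W_k = Q_k Q_k^T$, where $SVD_k(X) = P_k \Sigma_k Q_k^T$
2. Without the constraint $rank(W) \le k$ the optimal solution is just the ridge regression solution $W = (X^T X + \lambda I)^{-1} X^T X$ - infeasible, since it requires inverting a dense items × items matrix.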
Linear-Flow reparametrization
$$SVD_k(X) = P_k \Sigma_k Q_k^T$$
Let $W = Q_k Y$:
$$\arg\min_Y \|X - X Q_k Y\|_F^2 + \lambda \|Q_k Y\|_F^2$$
Motivation
$\lambda = 0 \Rightarrow W = Q_k Q_k^T$, and the solution of the current problem is $Y = Q_k^T$
Linear-Flow closed-form solution
Notice that if the columns of $Q_k$ are orthonormal then $\|Q_k Y\|_F = \|Y\|_F$
Solve $\|X - X Q_k Y\|_F^2 + \lambda \|Y\|_F^2$
Simple ridge regression with a closed-form solution
$$Y = (Q_k^T X^T X Q_k + \lambda I)^{-1} Q_k^T X^T X$$
Very cheap inversion of a k × k matrix!
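A minimal NumPy sketch of this closed form on a hypothetical binary interaction matrix. An exact `np.linalg.svd` stands in for the truncated or randomized SVD used at scale, and squared Frobenius norms are assumed in the objective.

```python
import numpy as np

rng = np.random.default_rng(0)
X = (rng.random((40, 12)) < 0.2).astype(float)   # hypothetical binary user x item matrix
k, lam = 4, 1.0

# Right singular vectors Q_k (items x k).
_, _, Vt = np.linalg.svd(X, full_matrices=False)
Q_k = Vt[:k].T

XQ = X @ Q_k                                # users x k
Z = XQ.T @ X                                # Q_k^T X^T X, computed without forming X^T X
Y = np.linalg.solve(XQ.T @ XQ + lam * np.eye(k), Z)   # only a k x k system
W = Q_k @ Y                                 # items x items, rank <= k
scores = X @ W                              # predicted preferences
```

In practice W would not be materialized for a large catalogue; scores can be computed as `(X @ Q_k) @ Y` instead.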
Linear-Flow hassle-free cross-validation
$$Y = (Q_k^T X^T X Q_k + \lambda I)^{-1} Q_k^T X^T X$$
How to find lambda with cross-validation?
pre-compute $Z = Q_k^T X^T X$, so $Y = (Z Q_k + \lambda I)^{-1} Z$
pre-compute $Z Q_k$
notice that the value of lambda affects only the diagonal of $Z Q_k + \lambda I$
generate a sequence of lambdas (say of length 50) based on the min/max diagonal values
solving 50 ridge regressions of small rank is super-fast
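The regularization path can be sketched as follows, reusing the precomputed Z and Z Q_k for every lambda. The data is hypothetical, and the geometric grid between the smallest and largest diagonal value is an assumed choice.

```python
import numpy as np

rng = np.random.default_rng(0)
X = (rng.random((40, 12)) < 0.2).astype(float)   # hypothetical binary user x item matrix
k = 4
_, _, Vt = np.linalg.svd(X, full_matrices=False)
Q_k = Vt[:k].T

# Computed once, independent of lambda.
Z = (X @ Q_k).T @ X          # Q_k^T X^T X, k x items
ZQ = Z @ Q_k                 # k x k

# Candidate lambdas spanning the range of the diagonal of Z Q_k.
diag = np.diag(ZQ)
lambdas = np.geomspace(diag.min(), diag.max(), num=50)

# Each lambda only shifts the diagonal, so every fit is one small k x k solve.
path = [np.linalg.solve(ZQ + lam * np.eye(k), Z) for lam in lambdas]
```

Each Y on the path would then be scored on held-out interactions with a ranking metric such as MAP@K or nDCG@K, and the best lambda kept.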
Linear-Flow hassle-free cross-validation
Figure 7:
Suggestions
start simple - SVD, WRMF
design proper cross-validation - both objective and data split
think about how to incorporate business logic (for example how to exclude something)
use single machine implementations
think about inference time
don’t waste time with libraries/articles/blogposts which demonstrate MF with dense matrices
Questions?
http://dsnotes.com/tags/recommender-systems/
https://github.com/dselivanov/reco
Contacts:
selivanov.dmitriy@gmail.com
https://github.com/dselivanov
https://www.linkedin.com/in/dselivanov1
