SlideShare a Scribd company logo
Factorization Machines
- Introduction
Bartłomiej Twardowski
18.10.2016
Warsaw Data Science Meetup
Polish English?
• Support Vector Machines
=> “maszyna wektorów
nośnych”
• Matrix Factorization =>
“faktoryzacja macierzy”
• Factorization Machines =>
“maszyna faktoryzująca”?
• LMGTFY:-) Let’s stick to the
English name then!
Motivation
• one of the most successful model with a great of
expressiveness
• great for begin with context-aware recommendations
• considered as base toolbox for advertisers/kagglers
• FFM presentation from many years ago was on
RecSys 2016 ( still, almost nothing new in it :-( )
• considered it as a fun and original subject for meetup
2015.10.6 - meetup about recommender systems
Not motivated enough?
Success stories.
2. more often appears in DS job offers
1. competitions
Factorization Machines
• S. Rendle 2010 [1]
• combines advantages os Support Vector
Machines(SVM) with factorization models
• generic (real-value features)
• incredible good for sparse data
• model expressiveness
MF - quick recap
Simplest problem formulation[3]:
• U - user set, I - item set
• matrix contains user ratings
• find the best representation in k dimensional latent space for
user P (|U| × k) and items Q (|I| × k) so the matrix Rˆ is defined as: 

• to predict rating:
R 2 R|U|⇥|I|
MF - quick recap
with regularization[4]:
Linear & Poly2 models
ˆy(x) = w0 +
nX
i=1
wixi +
nX
i=1
nX
j=i+1
vi,jxixj
ˆy(x) = w0 +
nX
i=1
wixi
simple linear regression model:
adding two-way interactions:
FM Model
for two-way interactions:
model parameters:
For each xi we have dedicated vector vi with k-features.
Then instead of weight wij for feature interactions we have
dot product:
Wait, it’s O(kn2
)! Not linear!
Making it O(kn)
2
6
6
6
6
6
4
x11 x12 x13 . . . x1n
x21 x22 x23 . . . x2n
x31 x32 x33 . . . x3n
...
...
...
...
...
xd1 xd2 xd3 . . . xdn
3
7
7
7
7
7
5
Simplified version
for k = 1, n =2 perspective
(a + b)2
= a2
+ 2ab + b2
ab =
1
2
(a + b)2
a2
b2
let:
v1x1 = a, v2x2 = b
then:
And now it looks very familiar :-)
FM vs SVM
• FM combines the advantages of SVM and factorization
models
• general prediction working on real-values (like SVM)
• good estimates interactions model with huge sparsity,
where SVM fail (e.g. recommender systems)
• model equation of FMs can be calculated in linear time
• comparable to a polynomial kernel in SVM, but works
for very spars data and works fast.
Use case: Context-Aware
Recommender Systems
• U = {Alice (A),Bob (B),Charlie (C), . . .}
• I = {Titanic (TI),Notting Hill (NH), Star Wars (SW),
Star Trek (ST), . . .}
• S = {(A,TI, 2010-1, 5), (A,NH, 2010-2, 3), (A, SW,
2010-4, 1),(B, SW, 2009-5, 4), (B, ST, 2009-8, 5),
(C,TI, 2009-9, 1), (C, SW, 2009-12, 5)}
• Example from [1]
Example of input data
preparation
Why us FM for this?
The drawback of tensor factorization models and
even more for specialized factorization models is
that [1]:
(1) they are not applicable to standard prediction
data (e.g. a real valued feature vector)
(2) that specialized models are usually derived
individually for a specific task requiring effort in
modeling and design of a learning algorithm.
How about ranking?
Go for pairwise approach!
http://www.tongji.edu.cn/~qiliu/lor_vs.html
Model expressiveness
FM ~ MF
given
the model will then mimic a biased MF:
MF ~ PITF
given user x item x tag interactions as:
FM will mimic a pairwise interaction
tensor factorization model (PITF) [7]:
And others
(e.g. factorized NN, KNN++, SVD++, …)
presented in [2].
Field-aware FM
• Have been used to win two CTR competitions [5].
• Introducing grouped features - fields, eg. user,
color, time.
• Learn a different set of latent factors for every pair
of fields
where f(i) is the field of a feature i.
ˆy(x) = w0 +
nX
i=1
wixi +
nX
i=1
nX
j=i+1
hvi,f(j), vj,f(i)ixixj
Available implementations
• libfm (http://www.libfm.org/), SGD/ALS/MCMC
• FM for Julia (https://github.com/btwardow/
FactorizationMachines.jl)
• fastFM (https://github.com/ibayer/fastFM)
• DiFacto (https://github.com/dmlc/difacto)
• lightfm
• spark-libFM, libffm
My experiments with FM on GPU
The same implementation moved from numpy to Theano was
~7x faster! Without using any special GPU tricks.
Going for click prediction?
• feature engineering (counting features, like hist. ctr)
• hashing trick
• L1, FTRL using e.g. vw
• making new features - e.g. decision tree encoding
How about now? :-)
References
[1] Rendle, Steffen. "Factorization machines." 2010 IEEE International
Conference on Data Mining. IEEE, 2010.
[2] Rendle, Steffen. "Factorization machines with libfm." ACM
Transactions on Intelligent Systems and Technology (TIST) 3.3 (2012): 57.
[3] Takács, Gábor, et al. "Matrix factorization and neighbor based
algorithms for the netflix prize problem." Proceedings of the 2008 ACM
conference on Recommender systems. ACM, 2008.
[4] Paterek, Arkadiusz. "Improving regularized singular value
decomposition for collaborative filtering." Proceedings of KDD cup and
workshop. Vol. 2007. 2007.
References
[5] http://www.csie.ntu.edu.tw/~r01922136/slides/ffm.pdf
[6] SREBRO,N., RENNIE,J. D. M., AND JAAKOLA, T. S. 2005.
Maximum-margin matrix factorization. In Advances in Neural
Information Processing Systems 17,MIT 1329–1336.
[7] RENDLE,S. AND SCHMIDT-THIEME, L. 2010. Pairwise interaction
tensor factorization for personalized tag recommendation. In
Proceedings of the third ACM International Conference on Web
Search and Data Mining (WSDM’10). ACM, New York, NY, 81–90.
Q&A
@btwardow, Bartłomiej Twardowski
B.Twardowski@ii.pw.edu.pl

More Related Content

What's hot

Udacity webinar on Recommendation Systems
Udacity webinar on Recommendation SystemsUdacity webinar on Recommendation Systems
Udacity webinar on Recommendation Systems
Axel de Romblay
 
Tutorial: Context In Recommender Systems
Tutorial: Context In Recommender SystemsTutorial: Context In Recommender Systems
Tutorial: Context In Recommender Systems
YONG ZHENG
 
Algorithmic Music Recommendations at Spotify
Algorithmic Music Recommendations at SpotifyAlgorithmic Music Recommendations at Spotify
Algorithmic Music Recommendations at Spotify
Chris Johnson
 
Recommender system introduction
Recommender system   introductionRecommender system   introduction
Recommender system introductionLiang Xiang
 
Past, present, and future of Recommender Systems: an industry perspective
Past, present, and future of Recommender Systems: an industry perspectivePast, present, and future of Recommender Systems: an industry perspective
Past, present, and future of Recommender Systems: an industry perspective
Xavier Amatriain
 
Learning a Personalized Homepage
Learning a Personalized HomepageLearning a Personalized Homepage
Learning a Personalized Homepage
Justin Basilico
 
Matrix Factorization In Recommender Systems
Matrix Factorization In Recommender SystemsMatrix Factorization In Recommender Systems
Matrix Factorization In Recommender Systems
YONG ZHENG
 
Calibrated Recommendations
Calibrated RecommendationsCalibrated Recommendations
Calibrated Recommendations
Harald Steck
 
Session-Based Recommender Systems
Session-Based Recommender SystemsSession-Based Recommender Systems
Session-Based Recommender Systems
Eötvös Loránd University
 
Context-aware Recommendation: A Quick View
Context-aware Recommendation: A Quick ViewContext-aware Recommendation: A Quick View
Context-aware Recommendation: A Quick View
YONG ZHENG
 
An introduction to Recommender Systems
An introduction to Recommender SystemsAn introduction to Recommender Systems
An introduction to Recommender Systems
David Zibriczky
 
Déjà Vu: The Importance of Time and Causality in Recommender Systems
Déjà Vu: The Importance of Time and Causality in Recommender SystemsDéjà Vu: The Importance of Time and Causality in Recommender Systems
Déjà Vu: The Importance of Time and Causality in Recommender Systems
Justin Basilico
 
Deep Learning for Recommender Systems
Deep Learning for Recommender SystemsDeep Learning for Recommender Systems
Deep Learning for Recommender Systems
Yves Raimond
 
Tutorial on Sequence Aware Recommender Systems - ACM RecSys 2018
Tutorial on Sequence Aware Recommender Systems - ACM RecSys 2018Tutorial on Sequence Aware Recommender Systems - ACM RecSys 2018
Tutorial on Sequence Aware Recommender Systems - ACM RecSys 2018
Massimo Quadrana
 
Sequential Decision Making in Recommendations
Sequential Decision Making in RecommendationsSequential Decision Making in Recommendations
Sequential Decision Making in Recommendations
Jaya Kawale
 
Homepage Personalization at Spotify
Homepage Personalization at SpotifyHomepage Personalization at Spotify
Homepage Personalization at Spotify
Oguz Semerci
 
Recommender system algorithm and architecture
Recommender system algorithm and architectureRecommender system algorithm and architecture
Recommender system algorithm and architectureLiang Xiang
 
Recommender Systems In Industry
Recommender Systems In IndustryRecommender Systems In Industry
Recommender Systems In Industry
Xavier Amatriain
 
Missing values in recommender models
Missing values in recommender modelsMissing values in recommender models
Missing values in recommender models
Parmeshwar Khurd
 
ML+Hadoop at NYC Predictive Analytics
ML+Hadoop at NYC Predictive AnalyticsML+Hadoop at NYC Predictive Analytics
ML+Hadoop at NYC Predictive Analytics
Erik Bernhardsson
 

What's hot (20)

Udacity webinar on Recommendation Systems
Udacity webinar on Recommendation SystemsUdacity webinar on Recommendation Systems
Udacity webinar on Recommendation Systems
 
Tutorial: Context In Recommender Systems
Tutorial: Context In Recommender SystemsTutorial: Context In Recommender Systems
Tutorial: Context In Recommender Systems
 
Algorithmic Music Recommendations at Spotify
Algorithmic Music Recommendations at SpotifyAlgorithmic Music Recommendations at Spotify
Algorithmic Music Recommendations at Spotify
 
Recommender system introduction
Recommender system   introductionRecommender system   introduction
Recommender system introduction
 
Past, present, and future of Recommender Systems: an industry perspective
Past, present, and future of Recommender Systems: an industry perspectivePast, present, and future of Recommender Systems: an industry perspective
Past, present, and future of Recommender Systems: an industry perspective
 
Learning a Personalized Homepage
Learning a Personalized HomepageLearning a Personalized Homepage
Learning a Personalized Homepage
 
Matrix Factorization In Recommender Systems
Matrix Factorization In Recommender SystemsMatrix Factorization In Recommender Systems
Matrix Factorization In Recommender Systems
 
Calibrated Recommendations
Calibrated RecommendationsCalibrated Recommendations
Calibrated Recommendations
 
Session-Based Recommender Systems
Session-Based Recommender SystemsSession-Based Recommender Systems
Session-Based Recommender Systems
 
Context-aware Recommendation: A Quick View
Context-aware Recommendation: A Quick ViewContext-aware Recommendation: A Quick View
Context-aware Recommendation: A Quick View
 
An introduction to Recommender Systems
An introduction to Recommender SystemsAn introduction to Recommender Systems
An introduction to Recommender Systems
 
Déjà Vu: The Importance of Time and Causality in Recommender Systems
Déjà Vu: The Importance of Time and Causality in Recommender SystemsDéjà Vu: The Importance of Time and Causality in Recommender Systems
Déjà Vu: The Importance of Time and Causality in Recommender Systems
 
Deep Learning for Recommender Systems
Deep Learning for Recommender SystemsDeep Learning for Recommender Systems
Deep Learning for Recommender Systems
 
Tutorial on Sequence Aware Recommender Systems - ACM RecSys 2018
Tutorial on Sequence Aware Recommender Systems - ACM RecSys 2018Tutorial on Sequence Aware Recommender Systems - ACM RecSys 2018
Tutorial on Sequence Aware Recommender Systems - ACM RecSys 2018
 
Sequential Decision Making in Recommendations
Sequential Decision Making in RecommendationsSequential Decision Making in Recommendations
Sequential Decision Making in Recommendations
 
Homepage Personalization at Spotify
Homepage Personalization at SpotifyHomepage Personalization at Spotify
Homepage Personalization at Spotify
 
Recommender system algorithm and architecture
Recommender system algorithm and architectureRecommender system algorithm and architecture
Recommender system algorithm and architecture
 
Recommender Systems In Industry
Recommender Systems In IndustryRecommender Systems In Industry
Recommender Systems In Industry
 
Missing values in recommender models
Missing values in recommender modelsMissing values in recommender models
Missing values in recommender models
 
ML+Hadoop at NYC Predictive Analytics
ML+Hadoop at NYC Predictive AnalyticsML+Hadoop at NYC Predictive Analytics
ML+Hadoop at NYC Predictive Analytics
 

Viewers also liked

Factorization Machines with libFM
Factorization Machines with libFMFactorization Machines with libFM
Factorization Machines with libFMLiangjie Hong
 
Steffen Rendle, Research Scientist, Google at MLconf SF
Steffen Rendle, Research Scientist, Google at MLconf SFSteffen Rendle, Research Scientist, Google at MLconf SF
Steffen Rendle, Research Scientist, Google at MLconf SF
MLconf
 
Warsaw Data Science - Recsys2016 Quick Review
Warsaw Data Science - Recsys2016 Quick ReviewWarsaw Data Science - Recsys2016 Quick Review
Warsaw Data Science - Recsys2016 Quick Review
Bartlomiej Twardowski
 
Rekomendujemy - Szybkie wprowadzenie do systemów rekomendacji oraz trochę wie...
Rekomendujemy - Szybkie wprowadzenie do systemów rekomendacji oraz trochę wie...Rekomendujemy - Szybkie wprowadzenie do systemów rekomendacji oraz trochę wie...
Rekomendujemy - Szybkie wprowadzenie do systemów rekomendacji oraz trochę wie...
Bartlomiej Twardowski
 
Recsys 2016: Modeling Contextual Information in Session-Aware Recommender Sys...
Recsys 2016: Modeling Contextual Information in Session-Aware Recommender Sys...Recsys 2016: Modeling Contextual Information in Session-Aware Recommender Sys...
Recsys 2016: Modeling Contextual Information in Session-Aware Recommender Sys...
Bartlomiej Twardowski
 
Факторизационные модели в рекомендательных системах
Факторизационные модели в рекомендательных системахФакторизационные модели в рекомендательных системах
Факторизационные модели в рекомендательных системах
romovpa
 
allegrotech - Data science meetup #1 Intro
allegrotech - Data science  meetup #1 Introallegrotech - Data science  meetup #1 Intro
allegrotech - Data science meetup #1 Intro
Bartlomiej Twardowski
 
Naive Bayesian Text Classifier Event Models
Naive Bayesian Text Classifier Event ModelsNaive Bayesian Text Classifier Event Models
Naive Bayesian Text Classifier Event Models
DKALab
 
RecSys Multi-Stack Ensemble for Job Recommendation, Pumpkin-Pie
RecSys Multi-Stack Ensemble for Job Recommendation, Pumpkin-PieRecSys Multi-Stack Ensemble for Job Recommendation, Pumpkin-Pie
RecSys Multi-Stack Ensemble for Job Recommendation, Pumpkin-Pie
Tommaso Carpi
 
Recommendation Engine Demystified
Recommendation Engine DemystifiedRecommendation Engine Demystified
Recommendation Engine Demystified
DKALab
 
Systemy rekomendacji, Algorytmy rankingu Top-N rekomendacji bazujące na nieja...
Systemy rekomendacji, Algorytmy rankingu Top-N rekomendacji bazujące na nieja...Systemy rekomendacji, Algorytmy rankingu Top-N rekomendacji bazujące na nieja...
Systemy rekomendacji, Algorytmy rankingu Top-N rekomendacji bazujące na nieja...
Bartlomiej Twardowski
 
A Combination of Simple Models by Forward Predictor Selection for Job Recomme...
A Combination of Simple Models by Forward Predictor Selection for Job Recomme...A Combination of Simple Models by Forward Predictor Selection for Job Recomme...
A Combination of Simple Models by Forward Predictor Selection for Job Recomme...
David Zibriczky
 
Avito recsys-challenge-2016RecSys Challenge 2016: Job Recommendation Based on...
Avito recsys-challenge-2016RecSys Challenge 2016: Job Recommendation Based on...Avito recsys-challenge-2016RecSys Challenge 2016: Job Recommendation Based on...
Avito recsys-challenge-2016RecSys Challenge 2016: Job Recommendation Based on...
Vasily Leksin
 
Prezentacja z Big Data Tech 2016: Machine Learning vs Big Data
Prezentacja z Big Data Tech 2016: Machine Learning vs Big DataPrezentacja z Big Data Tech 2016: Machine Learning vs Big Data
Prezentacja z Big Data Tech 2016: Machine Learning vs Big Data
Bartlomiej Twardowski
 
Building a Predictive Model
Building a Predictive ModelBuilding a Predictive Model
Building a Predictive Model
DKALab
 
Introduction to Matrix Factorization Methods Collaborative Filtering
Introduction to Matrix Factorization Methods Collaborative FilteringIntroduction to Matrix Factorization Methods Collaborative Filtering
Introduction to Matrix Factorization Methods Collaborative Filtering
DKALab
 
Collaborative Filtering Recommendation System
Collaborative Filtering Recommendation SystemCollaborative Filtering Recommendation System
Collaborative Filtering Recommendation System
Milind Gokhale
 
Building a Recommendation Engine - An example of a product recommendation engine
Building a Recommendation Engine - An example of a product recommendation engineBuilding a Recommendation Engine - An example of a product recommendation engine
Building a Recommendation Engine - An example of a product recommendation engineNYC Predictive Analytics
 
Recommendation system
Recommendation system Recommendation system
Recommendation system
Vikrant Arya
 
Recommender Systems (Machine Learning Summer School 2014 @ CMU)
Recommender Systems (Machine Learning Summer School 2014 @ CMU)Recommender Systems (Machine Learning Summer School 2014 @ CMU)
Recommender Systems (Machine Learning Summer School 2014 @ CMU)
Xavier Amatriain
 

Viewers also liked (20)

Factorization Machines with libFM
Factorization Machines with libFMFactorization Machines with libFM
Factorization Machines with libFM
 
Steffen Rendle, Research Scientist, Google at MLconf SF
Steffen Rendle, Research Scientist, Google at MLconf SFSteffen Rendle, Research Scientist, Google at MLconf SF
Steffen Rendle, Research Scientist, Google at MLconf SF
 
Warsaw Data Science - Recsys2016 Quick Review
Warsaw Data Science - Recsys2016 Quick ReviewWarsaw Data Science - Recsys2016 Quick Review
Warsaw Data Science - Recsys2016 Quick Review
 
Rekomendujemy - Szybkie wprowadzenie do systemów rekomendacji oraz trochę wie...
Rekomendujemy - Szybkie wprowadzenie do systemów rekomendacji oraz trochę wie...Rekomendujemy - Szybkie wprowadzenie do systemów rekomendacji oraz trochę wie...
Rekomendujemy - Szybkie wprowadzenie do systemów rekomendacji oraz trochę wie...
 
Recsys 2016: Modeling Contextual Information in Session-Aware Recommender Sys...
Recsys 2016: Modeling Contextual Information in Session-Aware Recommender Sys...Recsys 2016: Modeling Contextual Information in Session-Aware Recommender Sys...
Recsys 2016: Modeling Contextual Information in Session-Aware Recommender Sys...
 
Факторизационные модели в рекомендательных системах
Факторизационные модели в рекомендательных системахФакторизационные модели в рекомендательных системах
Факторизационные модели в рекомендательных системах
 
allegrotech - Data science meetup #1 Intro
allegrotech - Data science  meetup #1 Introallegrotech - Data science  meetup #1 Intro
allegrotech - Data science meetup #1 Intro
 
Naive Bayesian Text Classifier Event Models
Naive Bayesian Text Classifier Event ModelsNaive Bayesian Text Classifier Event Models
Naive Bayesian Text Classifier Event Models
 
RecSys Multi-Stack Ensemble for Job Recommendation, Pumpkin-Pie
RecSys Multi-Stack Ensemble for Job Recommendation, Pumpkin-PieRecSys Multi-Stack Ensemble for Job Recommendation, Pumpkin-Pie
RecSys Multi-Stack Ensemble for Job Recommendation, Pumpkin-Pie
 
Recommendation Engine Demystified
Recommendation Engine DemystifiedRecommendation Engine Demystified
Recommendation Engine Demystified
 
Systemy rekomendacji, Algorytmy rankingu Top-N rekomendacji bazujące na nieja...
Systemy rekomendacji, Algorytmy rankingu Top-N rekomendacji bazujące na nieja...Systemy rekomendacji, Algorytmy rankingu Top-N rekomendacji bazujące na nieja...
Systemy rekomendacji, Algorytmy rankingu Top-N rekomendacji bazujące na nieja...
 
A Combination of Simple Models by Forward Predictor Selection for Job Recomme...
A Combination of Simple Models by Forward Predictor Selection for Job Recomme...A Combination of Simple Models by Forward Predictor Selection for Job Recomme...
A Combination of Simple Models by Forward Predictor Selection for Job Recomme...
 
Avito recsys-challenge-2016RecSys Challenge 2016: Job Recommendation Based on...
Avito recsys-challenge-2016RecSys Challenge 2016: Job Recommendation Based on...Avito recsys-challenge-2016RecSys Challenge 2016: Job Recommendation Based on...
Avito recsys-challenge-2016RecSys Challenge 2016: Job Recommendation Based on...
 
Prezentacja z Big Data Tech 2016: Machine Learning vs Big Data
Prezentacja z Big Data Tech 2016: Machine Learning vs Big DataPrezentacja z Big Data Tech 2016: Machine Learning vs Big Data
Prezentacja z Big Data Tech 2016: Machine Learning vs Big Data
 
Building a Predictive Model
Building a Predictive ModelBuilding a Predictive Model
Building a Predictive Model
 
Introduction to Matrix Factorization Methods Collaborative Filtering
Introduction to Matrix Factorization Methods Collaborative FilteringIntroduction to Matrix Factorization Methods Collaborative Filtering
Introduction to Matrix Factorization Methods Collaborative Filtering
 
Collaborative Filtering Recommendation System
Collaborative Filtering Recommendation SystemCollaborative Filtering Recommendation System
Collaborative Filtering Recommendation System
 
Building a Recommendation Engine - An example of a product recommendation engine
Building a Recommendation Engine - An example of a product recommendation engineBuilding a Recommendation Engine - An example of a product recommendation engine
Building a Recommendation Engine - An example of a product recommendation engine
 
Recommendation system
Recommendation system Recommendation system
Recommendation system
 
Recommender Systems (Machine Learning Summer School 2014 @ CMU)
Recommender Systems (Machine Learning Summer School 2014 @ CMU)Recommender Systems (Machine Learning Summer School 2014 @ CMU)
Recommender Systems (Machine Learning Summer School 2014 @ CMU)
 

Similar to Warsaw Data Science - Factorization Machines Introduction

Lecture1_computer vision-2023.pdf
Lecture1_computer vision-2023.pdfLecture1_computer vision-2023.pdf
Lecture1_computer vision-2023.pdf
ssuserff72e4
 
Migrating from matlab to python
Migrating from matlab to pythonMigrating from matlab to python
Migrating from matlab to pythonActiveState
 
A Tale of Three Deep Learning Frameworks: TensorFlow, Keras, & PyTorch with B...
A Tale of Three Deep Learning Frameworks: TensorFlow, Keras, & PyTorch with B...A Tale of Three Deep Learning Frameworks: TensorFlow, Keras, & PyTorch with B...
A Tale of Three Deep Learning Frameworks: TensorFlow, Keras, & PyTorch with B...
Databricks
 
TIM: Large-scale Energy Forecasting in Julia
TIM: Large-scale Energy Forecasting in JuliaTIM: Large-scale Energy Forecasting in Julia
TIM: Large-scale Energy Forecasting in Julia
GapData Institute
 
(CMP305) Deep Learning on AWS Made EasyCmp305
(CMP305) Deep Learning on AWS Made EasyCmp305(CMP305) Deep Learning on AWS Made EasyCmp305
(CMP305) Deep Learning on AWS Made EasyCmp305
Amazon Web Services
 
Large-scale Recommendation Systems on Just a PC
Large-scale Recommendation Systems on Just a PCLarge-scale Recommendation Systems on Just a PC
Large-scale Recommendation Systems on Just a PC
Aapo Kyrölä
 
OpenPOWER Workshop in Silicon Valley
OpenPOWER Workshop in Silicon ValleyOpenPOWER Workshop in Silicon Valley
OpenPOWER Workshop in Silicon Valley
Ganesan Narayanasamy
 
Новый InterSystems: open-source, митапы, хакатоны
Новый InterSystems: open-source, митапы, хакатоныНовый InterSystems: open-source, митапы, хакатоны
Новый InterSystems: open-source, митапы, хакатоны
Timur Safin
 
Scaling Python to CPUs and GPUs
Scaling Python to CPUs and GPUsScaling Python to CPUs and GPUs
Scaling Python to CPUs and GPUs
Travis Oliphant
 
Travis Oliphant "Python for Speed, Scale, and Science"
Travis Oliphant "Python for Speed, Scale, and Science"Travis Oliphant "Python for Speed, Scale, and Science"
Travis Oliphant "Python for Speed, Scale, and Science"
Fwdays
 
Go from a PHP Perspective
Go from a PHP PerspectiveGo from a PHP Perspective
Go from a PHP Perspective
Barry Jones
 
Getting Started with Keras and TensorFlow - StampedeCon AI Summit 2017
Getting Started with Keras and TensorFlow - StampedeCon AI Summit 2017Getting Started with Keras and TensorFlow - StampedeCon AI Summit 2017
Getting Started with Keras and TensorFlow - StampedeCon AI Summit 2017
StampedeCon
 
Angular and Deep Learning
Angular and Deep LearningAngular and Deep Learning
Angular and Deep Learning
Oswald Campesato
 
Deep Dive on Deep Learning (June 2018)
Deep Dive on Deep Learning (June 2018)Deep Dive on Deep Learning (June 2018)
Deep Dive on Deep Learning (June 2018)
Julien SIMON
 
Natural Language Processing with CNTK and Apache Spark with Ali Zaidi
Natural Language Processing with CNTK and Apache Spark with Ali ZaidiNatural Language Processing with CNTK and Apache Spark with Ali Zaidi
Natural Language Processing with CNTK and Apache Spark with Ali Zaidi
Databricks
 
Making fitting in RooFit faster
Making fitting in RooFit fasterMaking fitting in RooFit faster
Making fitting in RooFit faster
Patrick Bos
 
Basic info on java intro
Basic info on java introBasic info on java intro
Basic info on java intro
kabirmahlotra
 
Basic info on java intro
Basic info on java introBasic info on java intro
Basic info on java intro
kabirmahlotra
 

Similar to Warsaw Data Science - Factorization Machines Introduction (20)

Lecture1_computer vision-2023.pdf
Lecture1_computer vision-2023.pdfLecture1_computer vision-2023.pdf
Lecture1_computer vision-2023.pdf
 
Migrating from matlab to python
Migrating from matlab to pythonMigrating from matlab to python
Migrating from matlab to python
 
A Tale of Three Deep Learning Frameworks: TensorFlow, Keras, & PyTorch with B...
A Tale of Three Deep Learning Frameworks: TensorFlow, Keras, & PyTorch with B...A Tale of Three Deep Learning Frameworks: TensorFlow, Keras, & PyTorch with B...
A Tale of Three Deep Learning Frameworks: TensorFlow, Keras, & PyTorch with B...
 
TIM: Large-scale Energy Forecasting in Julia
TIM: Large-scale Energy Forecasting in JuliaTIM: Large-scale Energy Forecasting in Julia
TIM: Large-scale Energy Forecasting in Julia
 
(CMP305) Deep Learning on AWS Made EasyCmp305
(CMP305) Deep Learning on AWS Made EasyCmp305(CMP305) Deep Learning on AWS Made EasyCmp305
(CMP305) Deep Learning on AWS Made EasyCmp305
 
Large-scale Recommendation Systems on Just a PC
Large-scale Recommendation Systems on Just a PCLarge-scale Recommendation Systems on Just a PC
Large-scale Recommendation Systems on Just a PC
 
OpenPOWER Workshop in Silicon Valley
OpenPOWER Workshop in Silicon ValleyOpenPOWER Workshop in Silicon Valley
OpenPOWER Workshop in Silicon Valley
 
Новый InterSystems: open-source, митапы, хакатоны
Новый InterSystems: open-source, митапы, хакатоныНовый InterSystems: open-source, митапы, хакатоны
Новый InterSystems: open-source, митапы, хакатоны
 
Scaling Python to CPUs and GPUs
Scaling Python to CPUs and GPUsScaling Python to CPUs and GPUs
Scaling Python to CPUs and GPUs
 
MATLAB & Image Processing
MATLAB & Image ProcessingMATLAB & Image Processing
MATLAB & Image Processing
 
Travis Oliphant "Python for Speed, Scale, and Science"
Travis Oliphant "Python for Speed, Scale, and Science"Travis Oliphant "Python for Speed, Scale, and Science"
Travis Oliphant "Python for Speed, Scale, and Science"
 
Go from a PHP Perspective
Go from a PHP PerspectiveGo from a PHP Perspective
Go from a PHP Perspective
 
Getting Started with Keras and TensorFlow - StampedeCon AI Summit 2017
Getting Started with Keras and TensorFlow - StampedeCon AI Summit 2017Getting Started with Keras and TensorFlow - StampedeCon AI Summit 2017
Getting Started with Keras and TensorFlow - StampedeCon AI Summit 2017
 
Clojure intro
Clojure introClojure intro
Clojure intro
 
Angular and Deep Learning
Angular and Deep LearningAngular and Deep Learning
Angular and Deep Learning
 
Deep Dive on Deep Learning (June 2018)
Deep Dive on Deep Learning (June 2018)Deep Dive on Deep Learning (June 2018)
Deep Dive on Deep Learning (June 2018)
 
Natural Language Processing with CNTK and Apache Spark with Ali Zaidi
Natural Language Processing with CNTK and Apache Spark with Ali ZaidiNatural Language Processing with CNTK and Apache Spark with Ali Zaidi
Natural Language Processing with CNTK and Apache Spark with Ali Zaidi
 
Making fitting in RooFit faster
Making fitting in RooFit fasterMaking fitting in RooFit faster
Making fitting in RooFit faster
 
Basic info on java intro
Basic info on java introBasic info on java intro
Basic info on java intro
 
Basic info on java intro
Basic info on java introBasic info on java intro
Basic info on java intro
 

Recently uploaded

Predicting property prices with machine learning algorithms.pdf
Predicting property prices with machine learning algorithms.pdfPredicting property prices with machine learning algorithms.pdf
Predicting property prices with machine learning algorithms.pdf
binhminhvu04
 
Viksit bharat till 2047 India@2047.pptx
Viksit bharat till 2047  India@2047.pptxViksit bharat till 2047  India@2047.pptx
Viksit bharat till 2047 India@2047.pptx
rakeshsharma20142015
 
Nutraceutical market, scope and growth: Herbal drug technology
Nutraceutical market, scope and growth: Herbal drug technologyNutraceutical market, scope and growth: Herbal drug technology
Nutraceutical market, scope and growth: Herbal drug technology
Lokesh Patil
 
platelets- lifespan -Clot retraction-disorders.pptx
platelets- lifespan -Clot retraction-disorders.pptxplatelets- lifespan -Clot retraction-disorders.pptx
platelets- lifespan -Clot retraction-disorders.pptx
muralinath2
 
Mammalian Pineal Body Structure and Also Functions
Mammalian Pineal Body Structure and Also FunctionsMammalian Pineal Body Structure and Also Functions
Mammalian Pineal Body Structure and Also Functions
YOGESH DOGRA
 
Richard's entangled aventures in wonderland
Richard's entangled aventures in wonderlandRichard's entangled aventures in wonderland
Richard's entangled aventures in wonderland
Richard Gill
 
EY - Supply Chain Services 2018_template.pptx
EY - Supply Chain Services 2018_template.pptxEY - Supply Chain Services 2018_template.pptx
EY - Supply Chain Services 2018_template.pptx
AlguinaldoKong
 
Cancer cell metabolism: special Reference to Lactate Pathway
Cancer cell metabolism: special Reference to Lactate PathwayCancer cell metabolism: special Reference to Lactate Pathway
Cancer cell metabolism: special Reference to Lactate Pathway
AADYARAJPANDEY1
 
FAIR & AI Ready KGs for Explainable Predictions
FAIR & AI Ready KGs for Explainable PredictionsFAIR & AI Ready KGs for Explainable Predictions
FAIR & AI Ready KGs for Explainable Predictions
Michel Dumontier
 
justice-and-fairness-ethics with example
justice-and-fairness-ethics with examplejustice-and-fairness-ethics with example
justice-and-fairness-ethics with example
azzyixes
 
SCHIZOPHRENIA Disorder/ Brain Disorder.pdf
SCHIZOPHRENIA Disorder/ Brain Disorder.pdfSCHIZOPHRENIA Disorder/ Brain Disorder.pdf
SCHIZOPHRENIA Disorder/ Brain Disorder.pdf
SELF-EXPLANATORY
 
Hemostasis_importance& clinical significance.pptx
Hemostasis_importance& clinical significance.pptxHemostasis_importance& clinical significance.pptx
Hemostasis_importance& clinical significance.pptx
muralinath2
 
plant biotechnology Lecture note ppt.pptx
plant biotechnology Lecture note ppt.pptxplant biotechnology Lecture note ppt.pptx
plant biotechnology Lecture note ppt.pptx
yusufzako14
 
filosofia boliviana introducción jsjdjd.pptx
filosofia boliviana introducción jsjdjd.pptxfilosofia boliviana introducción jsjdjd.pptx
filosofia boliviana introducción jsjdjd.pptx
IvanMallco1
 
GBSN - Biochemistry (Unit 5) Chemistry of Lipids
GBSN - Biochemistry (Unit 5) Chemistry of LipidsGBSN - Biochemistry (Unit 5) Chemistry of Lipids
GBSN - Biochemistry (Unit 5) Chemistry of Lipids
Areesha Ahmad
 
Large scale production of streptomycin.pptx
Large scale production of streptomycin.pptxLarge scale production of streptomycin.pptx
Large scale production of streptomycin.pptx
Cherry
 
The ASGCT Annual Meeting was packed with exciting progress in the field advan...
The ASGCT Annual Meeting was packed with exciting progress in the field advan...The ASGCT Annual Meeting was packed with exciting progress in the field advan...
The ASGCT Annual Meeting was packed with exciting progress in the field advan...
Health Advances
 
Seminar of U.V. Spectroscopy by SAMIR PANDA
 Seminar of U.V. Spectroscopy by SAMIR PANDA Seminar of U.V. Spectroscopy by SAMIR PANDA
Seminar of U.V. Spectroscopy by SAMIR PANDA
SAMIR PANDA
 
Richard's aventures in two entangled wonderlands
Richard's aventures in two entangled wonderlandsRichard's aventures in two entangled wonderlands
Richard's aventures in two entangled wonderlands
Richard Gill
 
Observation of Io’s Resurfacing via Plume Deposition Using Ground-based Adapt...
Observation of Io’s Resurfacing via Plume Deposition Using Ground-based Adapt...Observation of Io’s Resurfacing via Plume Deposition Using Ground-based Adapt...
Observation of Io’s Resurfacing via Plume Deposition Using Ground-based Adapt...
Sérgio Sacani
 

Recently uploaded (20)

Predicting property prices with machine learning algorithms.pdf
Predicting property prices with machine learning algorithms.pdfPredicting property prices with machine learning algorithms.pdf
Predicting property prices with machine learning algorithms.pdf
 
Viksit bharat till 2047 India@2047.pptx
Viksit bharat till 2047  India@2047.pptxViksit bharat till 2047  India@2047.pptx
Viksit bharat till 2047 India@2047.pptx
 
Nutraceutical market, scope and growth: Herbal drug technology
Nutraceutical market, scope and growth: Herbal drug technologyNutraceutical market, scope and growth: Herbal drug technology
Nutraceutical market, scope and growth: Herbal drug technology
 
platelets- lifespan -Clot retraction-disorders.pptx
platelets- lifespan -Clot retraction-disorders.pptxplatelets- lifespan -Clot retraction-disorders.pptx
platelets- lifespan -Clot retraction-disorders.pptx
 
Mammalian Pineal Body Structure and Also Functions
Mammalian Pineal Body Structure and Also FunctionsMammalian Pineal Body Structure and Also Functions
Mammalian Pineal Body Structure and Also Functions
 
Richard's entangled aventures in wonderland
Richard's entangled aventures in wonderlandRichard's entangled aventures in wonderland
Richard's entangled aventures in wonderland
 
EY - Supply Chain Services 2018_template.pptx
EY - Supply Chain Services 2018_template.pptxEY - Supply Chain Services 2018_template.pptx
EY - Supply Chain Services 2018_template.pptx
 
Cancer cell metabolism: special Reference to Lactate Pathway
Cancer cell metabolism: special Reference to Lactate PathwayCancer cell metabolism: special Reference to Lactate Pathway
Cancer cell metabolism: special Reference to Lactate Pathway
 
FAIR & AI Ready KGs for Explainable Predictions
FAIR & AI Ready KGs for Explainable PredictionsFAIR & AI Ready KGs for Explainable Predictions
FAIR & AI Ready KGs for Explainable Predictions
 
justice-and-fairness-ethics with example
justice-and-fairness-ethics with examplejustice-and-fairness-ethics with example
justice-and-fairness-ethics with example
 
SCHIZOPHRENIA Disorder/ Brain Disorder.pdf
SCHIZOPHRENIA Disorder/ Brain Disorder.pdfSCHIZOPHRENIA Disorder/ Brain Disorder.pdf
SCHIZOPHRENIA Disorder/ Brain Disorder.pdf
 
Hemostasis_importance& clinical significance.pptx
Hemostasis_importance& clinical significance.pptxHemostasis_importance& clinical significance.pptx
Hemostasis_importance& clinical significance.pptx
 
plant biotechnology Lecture note ppt.pptx
plant biotechnology Lecture note ppt.pptxplant biotechnology Lecture note ppt.pptx
plant biotechnology Lecture note ppt.pptx
 
filosofia boliviana introducción jsjdjd.pptx
filosofia boliviana introducción jsjdjd.pptxfilosofia boliviana introducción jsjdjd.pptx
filosofia boliviana introducción jsjdjd.pptx
 
GBSN - Biochemistry (Unit 5) Chemistry of Lipids
GBSN - Biochemistry (Unit 5) Chemistry of LipidsGBSN - Biochemistry (Unit 5) Chemistry of Lipids
GBSN - Biochemistry (Unit 5) Chemistry of Lipids
 
Large scale production of streptomycin.pptx
Large scale production of streptomycin.pptxLarge scale production of streptomycin.pptx
Large scale production of streptomycin.pptx
 
The ASGCT Annual Meeting was packed with exciting progress in the field advan...
The ASGCT Annual Meeting was packed with exciting progress in the field advan...The ASGCT Annual Meeting was packed with exciting progress in the field advan...
The ASGCT Annual Meeting was packed with exciting progress in the field advan...
 
Seminar of U.V. Spectroscopy by SAMIR PANDA
 Seminar of U.V. Spectroscopy by SAMIR PANDA Seminar of U.V. Spectroscopy by SAMIR PANDA
Seminar of U.V. Spectroscopy by SAMIR PANDA
 
Richard's aventures in two entangled wonderlands
Richard's aventures in two entangled wonderlandsRichard's aventures in two entangled wonderlands
Richard's aventures in two entangled wonderlands
 
Observation of Io’s Resurfacing via Plume Deposition Using Ground-based Adapt...
Observation of Io’s Resurfacing via Plume Deposition Using Ground-based Adapt...Observation of Io’s Resurfacing via Plume Deposition Using Ground-based Adapt...
Observation of Io’s Resurfacing via Plume Deposition Using Ground-based Adapt...
 

Warsaw Data Science - Factorization Machines Introduction

  • 1. Factorization Machines - Introduction Bartłomiej Twardowski 18.10.2016 Warsaw Data Science Meetup
  • 2. Polish English? • Support Vector Machines => “maszyna wektorów nośnych” • Matrix Factorization => “faktoryzacja macierzy” • Factorization Machines => “maszyna faktoryzująca”? • LMGTFY:-) Let’s stick to the English name then!
  • 3. Motivation • one of the most successful model with a great of expressiveness • great for begin with context-aware recommendations • considered as base toolbox for advertisers/kagglers • FFM presentation from many years ago was on RecSys 2016 ( still, almost nothing new in it :-( ) • considered it as a fun and original subject for meetup
  • 4. 2015.10.6 - meetup about recommender systems
  • 5. Not motivated enough? Success stories. 2. more often appears in DS job offers 1. competitions
  • 6. Factorization Machines • S. Rendle 2010 [1] • combines advantages os Support Vector Machines(SVM) with factorization models • generic (real-value features) • incredible good for sparse data • model expressiveness
  • 7. MF - quick recap Simplest problem formulation[3]: • U - user set, I - item set • matrix contains user ratings • find the best representation in k dimensional latent space for user P (|U| × k) and items Q (|I| × k) so the matrix Rˆ is defined as: 
 • to predict rating: R 2 R|U|⇥|I|
  • 8. MF - quick recap with regularization[4]:
  • 9. Linear & Poly2 models ˆy(x) = w0 + nX i=1 wixi + nX i=1 nX j=i+1 vi,jxixj ˆy(x) = w0 + nX i=1 wixi simple linear regression model: adding two-way interactions:
  • 10. FM Model for two-way interactions: model parameters: For each xi we have dedicated vector vi with k-features. Then instead of weight wij for feature interactions we have dot product:
  • 11. Wait, it’s O(kn2 )! Not linear!
  • 12. Making it O(kn) 2 6 6 6 6 6 4 x11 x12 x13 . . . x1n x21 x22 x23 . . . x2n x31 x32 x33 . . . x3n ... ... ... ... ... xd1 xd2 xd3 . . . xdn 3 7 7 7 7 7 5
  • 13. Simplified version for k = 1, n =2 perspective (a + b)2 = a2 + 2ab + b2 ab = 1 2 (a + b)2 a2 b2 let: v1x1 = a, v2x2 = b then: And now it looks very familiar :-)
  • 14. FM vs SVM • FM combines the advantages of SVM and factorization models • general prediction working on real-values (like SVM) • good estimates interactions model with huge sparsity, where SVM fail (e.g. recommender systems) • model equation of FMs can be calculated in linear time • comparable to a polynomial kernel in SVM, but works for very spars data and works fast.
  • 15. Use case: Context-Aware Recommender Systems • U = {Alice (A),Bob (B),Charlie (C), . . .} • I = {Titanic (TI),Notting Hill (NH), Star Wars (SW), Star Trek (ST), . . .} • S = {(A,TI, 2010-1, 5), (A,NH, 2010-2, 3), (A, SW, 2010-4, 1),(B, SW, 2009-5, 4), (B, ST, 2009-8, 5), (C,TI, 2009-9, 1), (C, SW, 2009-12, 5)} • Example from [1]
  • 16. Example of input data preparation
  • 17. Why us FM for this? The drawback of tensor factorization models and even more for specialized factorization models is that [1]: (1) they are not applicable to standard prediction data (e.g. a real valued feature vector) (2) that specialized models are usually derived individually for a specific task requiring effort in modeling and design of a learning algorithm.
  • 18. How about ranking? Go for pairwise approach! http://www.tongji.edu.cn/~qiliu/lor_vs.html
  • 20. FM ~ MF given the model will then mimic a biased MF:
  • 21. MF ~ PITF given user x item x tag interactions as: FM will mimic a pairwise interaction tensor factorization model (PITF) [7]:
  • 22. And others (e.g. factorized NN, KNN++, SVD++, …) presented in [2].
  • 23. Field-aware FM • Have been used to win two CTR competitions [5]. • Introducing grouped features - fields, eg. user, color, time. • Learn a different set of latent factors for every pair of fields where f(i) is the field of a feature i. ˆy(x) = w0 + nX i=1 wixi + nX i=1 nX j=i+1 hvi,f(j), vj,f(i)ixixj
  • 24. Available implementations • libfm (http://www.libfm.org/), SGD/ALS/MCMC • FM for Julia (https://github.com/btwardow/ FactorizationMachines.jl) • fastFM (https://github.com/ibayer/fastFM) • DiFacto (https://github.com/dmlc/difacto) • lightfm • spark-libFM, libffm
  • 25. My experiments with FM on GPU The same implementation moved from numpy to Theano was ~7x faster! Without using any special GPU tricks.
  • 26. Going for click prediction? • feature engineering (counting features, like hist. ctr) • hashing trick • L1, FTRL using e.g. vw • making new features - e.g. decision tree encoding
  • 28. References [1] Rendle, Steffen. "Factorization machines." 2010 IEEE International Conference on Data Mining. IEEE, 2010. [2] Rendle, Steffen. "Factorization machines with libfm." ACM Transactions on Intelligent Systems and Technology (TIST) 3.3 (2012): 57. [3] Takács, Gábor, et al. "Matrix factorization and neighbor based algorithms for the netflix prize problem." Proceedings of the 2008 ACM conference on Recommender systems. ACM, 2008. [4] Paterek, Arkadiusz. "Improving regularized singular value decomposition for collaborative filtering." Proceedings of KDD cup and workshop. Vol. 2007. 2007.
  • 29. References [5] http://www.csie.ntu.edu.tw/~r01922136/slides/ffm.pdf [6] SREBRO,N., RENNIE,J. D. M., AND JAAKOLA, T. S. 2005. Maximum-margin matrix factorization. In Advances in Neural Information Processing Systems 17,MIT 1329–1336. [7] RENDLE,S. AND SCHMIDT-THIEME, L. 2010. Pairwise interaction tensor factorization for personalized tag recommendation. In Proceedings of the third ACM International Conference on Web Search and Data Mining (WSDM’10). ACM, New York, NY, 81–90.