Large scale recommendation: a view from the trenches
Anne-Marie Tousch
Senior Research Scientist
51èmes Journées de Statistiques de la SFdS
Outline
1. Context & problem setting,
2. One large-scale solution,
3. Open problems.
Context
What Criteo does: online personalized advertising.
Personalized advertising
We buy ad placements.
We recommend products.
We sell clicks that lead to sales.
Context
Daily: 300B bid requests; 4B displays.
Worldwide: 3 billion shoppers; 1 billion products.
Recommendation
A user = timeline of products browsed.
Task: find products she wants to buy
Recommendation
A user = timeline of products browsed in catalog A.
Task: find products she wants to buy, in catalog B.
Large-scale high-speed recommendation
4B times a day, recommend products in less than 100ms.
Large-scale recommender systems
Co-event counters, nearest neighbors: easy, strong baselines.
Matrix factorization (MF): now scales.
Neural networks: state-of-the-art, but how do they scale?
Matrix factorization
Classical recommender system setting:
Product set P, m = |P|; a product: v_j, j ∈ [m]
User u_i = {v_{j_1}, …, v_{j_i}}
Interaction matrix A_{i,j} = δ[v_j ∈ u_i], or ratings, or counts; A ∈ R^{n×m}
Factorize A with truncated SVD to obtain user and product embeddings of dimension k ≪ min(m, n):
A ≈ U · Σ · V*
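A minimal sketch of this classical setting, assuming toy dimensions and scipy's sparse truncated SVD (illustrative only, not the production pipeline):

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.linalg import svds

# Toy interaction matrix: A[i, j] = number of times user i browsed product j.
n_users, n_items, k = 1_000, 500, 32          # real m, n are several orders of magnitude larger
rng = np.random.default_rng(0)
rows = rng.integers(0, n_users, size=20_000)  # user index of each browsing event
cols = rng.integers(0, n_items, size=20_000)  # product index of each event
A = csr_matrix((np.ones_like(rows, dtype=np.float32), (rows, cols)),
               shape=(n_users, n_items))      # duplicate events are summed into counts

U, s, Vt = svds(A, k=k)                       # truncated SVD, k << min(m, n)
user_emb = U * s                              # user embeddings, shape (n, k)
item_emb = Vt.T                               # product embeddings, shape (m, k)
```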
Large-scale MF
What if m ≈ n ≈ 10^7–10^9?
Idea: use sketching.
Johnson-Lindenstrauss lemma, 1984
Let ϵ ∈ (0, 1) and A be a set of n points in R^d. Let k be an integer with k = O(ϵ^{-2} log n). Then there exists a mapping f : R^d → R^k such that for any a, b ∈ A:
(1 − ϵ) ∥a − b∥² ≤ ∥f(a) − f(b)∥² ≤ (1 + ϵ) ∥a − b∥²
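A small numpy illustration of the sketching idea behind the lemma: projecting onto a random Gaussian subspace roughly preserves pairwise distances (toy sizes, illustrative only):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, k = 200, 10_000, 512                    # k grows like eps^-2 log n; chosen loosely here
X = rng.normal(size=(n, d))                   # n points in R^d
Omega = rng.normal(size=(d, k)) / np.sqrt(k)  # random projection, scaled to preserve norms
Y = X @ Omega                                 # sketched points in R^k

i, j = 3, 17
print(np.linalg.norm(X[i] - X[j]))            # original distance
print(np.linalg.norm(Y[i] - Y[j]))            # projected distance, close up to a (1 ± eps) factor
```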
Randomized SVD [1]
Stage A: Compute an approximate basis for the range of the input matrix A. In other words, we require a matrix Q for which Q has orthonormal columns and A ≈ QQ*A.
Stage B: Use Q to help compute a standard factorization (QR, SVD, etc.) of A.
Form the matrix B = Q*A.
Compute an SVD of the small matrix: B = ŨΣV*.
Form the orthonormal matrix U = QŨ.
[1] Halko, Martinsson, and Tropp, “Finding structure with randomness: Probabilistic algorithms for constructing approximate matrix decompositions”.
Randomized SVD
Draw an n × ℓ standard Gaussian matrix Ω.
Form Y_0 = AΩ and compute its QR factorization Y_0 = Q_0 R_0.
for j = 1, 2, …, q:
    Form Ỹ_j = A* Q_{j−1}; compute its QR factorization Ỹ_j = Q̃_j R̃_j.
    Form Y_j = A Q̃_j; compute its QR factorization Y_j = Q_j R_j.
Q = Q_q
Apply stage B:
B := Qᵀ A;  Bᵀ = Q̃ R = Q̃ (V̂ S Ûᵀ);  U := Q Û
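A compact numpy version of the two stages above (range finding with power iterations, then the small SVD), following Halko, Martinsson and Tropp; a toy, dense, single-machine sketch rather than the distributed sparse implementation such a system needs:

```python
import numpy as np

def randomized_svd(A, k, n_oversample=10, q=2, seed=0):
    rng = np.random.default_rng(seed)
    n, m = A.shape
    # Stage A: approximate orthonormal basis Q for the range of A.
    Y = A @ rng.normal(size=(m, k + n_oversample))
    Q, _ = np.linalg.qr(Y)
    for _ in range(q):                       # power (subspace) iterations
        Q_tilde, _ = np.linalg.qr(A.T @ Q)
        Q, _ = np.linalg.qr(A @ Q_tilde)
    # Stage B: SVD of the small matrix B = Q* A.
    B = Q.T @ A
    U_hat, s, Vt = np.linalg.svd(B, full_matrices=False)
    U = Q @ U_hat
    return U[:, :k], s[:k], Vt[:k, :]

rng = np.random.default_rng(1)
A = rng.normal(size=(2_000, 30)) @ rng.normal(size=(30, 500))   # rank-30 test matrix
U, s, Vt = randomized_svd(A, k=30)
err = np.linalg.norm(A - U @ np.diag(s) @ Vt) / np.linalg.norm(A)
print(f"relative reconstruction error: {err:.2e}")               # tiny, since rank(A) = 30
```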
Randomized decomposition
Draw an n × ℓ standard Gaussian matrix Ω.
Form Y_0 = AΩ and compute its QR factorization Y_0 = Q_0 R_0.
for j = 1, 2, …, q:
    Normalize the rows of Q_{j−1}.
    Form Ỹ_j = A* Q_{j−1}; compute its QR factorization Ỹ_j = Q̃_j R̃_j.
    Normalize the rows of Q̃_j.
    Form Y_j = A Q̃_j; compute its QR factorization Y_j = Q_j R_j.
Q = Q_q
Skip stage B.
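The modification is small in code. Here is my reading of it as a sketch (row-normalize the intermediate orthonormal factors, skip stage B, and use the last factors directly as embeddings); the exact production variant may differ:

```python
import numpy as np

def randomized_embeddings(A, k, n_oversample=10, q=2, seed=0):
    rng = np.random.default_rng(seed)
    n, m = A.shape
    Y = A @ rng.normal(size=(m, k + n_oversample))
    Q, _ = np.linalg.qr(Y)                                          # user-side factor
    for _ in range(q):
        Q /= np.linalg.norm(Q, axis=1, keepdims=True) + 1e-12       # normalize rows of Q_{j-1}
        Q_items, _ = np.linalg.qr(A.T @ Q)                          # item-side factor
        Q_items /= np.linalg.norm(Q_items, axis=1, keepdims=True) + 1e-12
        Q, _ = np.linalg.qr(A @ Q_items)
    return Q[:, :k], Q_items[:, :k]          # user and product embeddings, no stage B
```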
Matrix factorization vs. Word2Vec
“For a negative-sampling value of k = 1, the Skip-Gram objective is factorizing a word-context matrix in which the association between a word and its context is measured by f(w, c) = PMI(w, c).” [2]
We approximate Skip-Gram by factorizing a PMI matrix with:
P = A*A ∈ R^{m×m}
PMI_{i,j} := log [ ( P_{i,j} · Σ_{i′,j′} P_{i′,j′} ) / ( Σ_{j′} P_{i,j′} · Σ_{i′} P_{j,i′} ) ]
[2] Omer Levy and Yoav Goldberg. “Neural word embedding as implicit matrix factorization”. In: Advances in Neural Information Processing Systems. 2014, pp. 2177–2185.
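A sketch of that construction: co-occurrence counts P = A*A from the interaction matrix, then the PMI transform above (the positive-PMI clipping at the end is a common variant, added here as an assumption, not something stated on the slide):

```python
import numpy as np
from scipy.sparse import csr_matrix

rng = np.random.default_rng(0)
n_users, n_items = 1_000, 200
rows = rng.integers(0, n_users, size=10_000)
cols = rng.integers(0, n_items, size=10_000)
A = csr_matrix((np.ones_like(rows, dtype=np.float64), (rows, cols)),
               shape=(n_users, n_items))

P = (A.T @ A).toarray()                        # product-product co-occurrence counts, m x m
total = P.sum()                                # sum_{i',j'} P_{i',j'}
row_sums = P.sum(axis=1, keepdims=True)        # sum_{j'} P_{i,j'}
col_sums = P.sum(axis=0, keepdims=True)        # sum_{i'} P_{j,i'} (P is symmetric)
with np.errstate(divide="ignore", invalid="ignore"):
    pmi = np.log(P * total / (row_sums * col_sums))
ppmi = np.maximum(pmi, 0.0)                    # positive-PMI: drop pairs that never co-occur
```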
Approximate nearest neighbors
Project the user into the embedding space,
Recommend the top-k products nearest to the user in product space.
Problem: if the different catalogs are not aligned, the nearest neighbors are almost always the same.
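The retrieval step in its brute-force form (at 10^7–10^9 products this is replaced by an approximate nearest-neighbor index, but the interface is the same; sizes below are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
item_emb = rng.normal(size=(100_000, 64))     # product embeddings
user_emb = rng.normal(size=(64,))             # one user, projected into the same space
k = 10

scores = item_emb @ user_emb                  # inner-product similarity to every product
top_k = np.argpartition(-scores, k)[:k]       # unordered top-k candidates in O(m)
top_k = top_k[np.argsort(-scores[top_k])]     # order the k candidates by score
print(top_k)                                  # product indices to recommend
```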
Open questions
Pb1: popularity biases
E.g.: recommending high-frequency items is a strong baseline strategy.
=> fairness and diversity issues.
Also: high-frequency users, big vs. small advertisers, ...
Open questions
Pb2: the organic traffic bias
Metric: predict next item?
But: we want to predict incremental sales. Had we not recommended this product, would the user still have bought it?
Idea: learn embeddings to optimize individual treatment effects [3].
[3] Bonner and Vasile, “Causal embeddings for recommendation”.
Open questions
Pb2: the organic traffic bias
Simulation environment [4]: https://github.com/criteo-research/reco-gym
[4] David Rohde et al. “RecoGym: A Reinforcement Learning Environment for the problem of Product Recommendation in Online Advertising”. In: arXiv preprint arXiv:1808.00720 (2018).
Open questions
Pb3: the unbounded number of products
Large-scale neural networks: the variational auto-encoder example
“[Use...] function f_θ(·) ∈ R^I to produce a probability distribution over m items π(z_u) ...” [a]
What if I = 10^7, 10^9?
[a] Dawen Liang et al. “Variational autoencoders for collaborative filtering”. In: Proceedings of the 2018 World Wide Web Conference on World Wide Web. International World Wide Web Conferences Steering Committee. 2018, pp. 689–698.
Open questions
Pb3: the unbounded number of products
Idea: use a group testing scheme with a binary p × m matrix H:
h(y) = H ∨ y
=> work with p pseudo-items instead of m products.
“Theorem: Suppose we wish to recover a k-sparse binary vector y ∈ R^m. A random binary {0, 1} matrix A where each entry is 1 with probability ρ = 1/k recovers 1 − ε proportion of the support of y correctly with high probability, for any ε > 0, with p = O(k log m). This matrix will also detect e = Ω(p) errors.” [5]
Question: Can we do better knowing that the item frequency follows a power law?
[5] Ubaru and Mazumdar, “Multilabel classification with group testing and codes”.
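A small illustration of the reduction, using the simplest possible decoder (keep every item consistent with all fired tests); the cited paper uses more refined decoding and error-correcting codes:

```python
import numpy as np

rng = np.random.default_rng(0)
m, k = 10_000, 10                         # catalogue size, sparsity of the label vector y
p = int(4 * k * np.log(m))                # p = O(k log m) pseudo-items
H = rng.random((p, m)) < 1.0 / k          # random binary test matrix, P(H_ij = 1) = 1/k

y = np.zeros(m, dtype=bool)
y[rng.choice(m, size=k, replace=False)] = True       # the k products actually of interest

z = (H.astype(np.int32) @ y.astype(np.int32)) > 0    # boolean OR measurement h(y) = H ∨ y

# Decode: keep item j iff every test that contains j fired.
candidates = ~np.any(H & ~z[:, None], axis=0)
print("true items recovered:", int(np.sum(candidates & y)), "of", k)
print("false positives:", int(np.sum(candidates & ~y)))
```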
Thanks! Questions?
Reach out to me at:
am.tousch@criteo.com or on Twitter @amy8492
Bonner, Stephen and Flavian Vasile. “Causal embeddings for recommendation”. In:
Proceedings of the 12th ACM Conference on Recommender Systems. ACM. 2018,
pp. 104–112.
Halko, Nathan, Per-Gunnar Martinsson, and Joel A Tropp. “Finding structure with
randomness: Probabilistic algorithms for constructing approximate matrix
decompositions”. In: SIAM review 53.2 (2011), pp. 217–288.
Levy, Omer and Yoav Goldberg. “Neural word embedding as implicit matrix
factorization”. In: Advances in neural information processing systems. 2014,
pp. 2177–2185.
Liang, Dawen et al. “Variational autoencoders for collaborative filtering”. In:
Proceedings of the 2018 World Wide Web Conference on World Wide Web.
International World Wide Web Conferences Steering Committee. 2018, pp. 689–698.
Rohde, David et al. “RecoGym: A Reinforcement Learning Environment for the
problem of Product Recommendation in Online Advertising”. In: arXiv preprint
arXiv:1808.00720 (2018).
Ubaru, Shashanka and Arya Mazumdar. “Multilabel classification with group testing
and codes”. In: Proceedings of the 34th International Conference on Machine
Learning-Volume 70. JMLR. org. 2017, pp. 3492–3501.