Tensorized DPP for Basket Completion
September 27th, 2018
Romain WARLOP
With Jérémie Mary (Criteo) and Mike Gartrell (Criteo)
The objective of basket completion is to suggest to a user one or several items
according to items already in her cart
Associative Classifier DPP
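Let $A$, $B$ be sets of items.
Definition (Confidence rule)
$$\mathrm{conf}(A \to B) = \frac{\mathrm{supp}(A \cup B)}{\mathrm{supp}(A)}$$
Definition (Lift rule)
$$\mathrm{lift}(A \to B) = \frac{\mathrm{supp}(A \cup B)}{\mathrm{supp}(A)\,\mathrm{supp}(B)}$$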
Past baskets are then analyzed to compute the confidence and lift of all possible rules.
A minimum support threshold, a confidence threshold and a lift threshold are then defined. All rules $A \to B$ that satisfy the three conditions are selected for recommendation.
Very heavy computation
Not scalable to large catalogs
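As a minimal sketch of the confidence and lift rules, the snippet below computes both from a toy list of baskets. It assumes that supp(A) is the fraction of past baskets containing every item of A (the slides do not spell out the definition), and the function names and example items are hypothetical.

```python
def support(itemset, baskets):
    """Fraction of baskets containing every item of `itemset` (assumed definition)."""
    itemset = frozenset(itemset)
    return sum(itemset <= basket for basket in baskets) / len(baskets)


def confidence(a, b, baskets):
    """conf(A -> B) = supp(A u B) / supp(A); undefined when supp(A) is 0."""
    return support(set(a) | set(b), baskets) / support(a, baskets)


def lift(a, b, baskets):
    """lift(A -> B) = supp(A u B) / (supp(A) * supp(B))."""
    joint = support(set(a) | set(b), baskets)
    return joint / (support(a, baskets) * support(b, baskets))


# Toy data, for illustration only.
baskets = [
    frozenset({"diapers", "wipes"}),
    frozenset({"diapers", "wipes", "bottle"}),
    frozenset({"diapers", "bottle"}),
    frozenset({"wipes"}),
]

print(confidence({"diapers"}, {"wipes"}, baskets))
print(lift({"diapers"}, {"wipes"}, baskets))
```

Scoring every candidate rule this way requires one pass over the baskets per pair of itemsets, which is where the heavy computation comes from.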
[Figure: kernel matrix $L$ containing item-item similarities; rows and columns are indexed by the item catalog]
Let $L$ be the kernel matrix associated with the DPP.
$L$ defines a discrete DPP such that the probability of observing the set $A$ is proportional to $\det(L_A)$, with $L_A$ the principal submatrix of $L$ indexed by the items in $A$:
$$p(A) = \frac{\det(L_A)}{\det(L + I)}$$
DPPs are suitable for modeling co-purchase probabilities.
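The formula for $p(A)$ can be sketched with NumPy as follows. The 3-item kernel is a made-up example, and `dpp_prob` is a hypothetical helper name. The check that the probabilities of all subsets sum to 1 relies on the standard identity that the sum of $\det(L_A)$ over all subsets $A$ equals $\det(L + I)$.

```python
from itertools import combinations

import numpy as np


def dpp_prob(L, A):
    """p(A) = det(L_A) / det(L + I) for the DPP with kernel L."""
    idx = list(A)
    # By convention the determinant of the empty submatrix is 1.
    num = np.linalg.det(L[np.ix_(idx, idx)]) if idx else 1.0
    return num / np.linalg.det(L + np.eye(L.shape[0]))


# Hypothetical symmetric positive definite kernel over 3 items.
L = np.array([[2.0, 1.0, 0.0],
              [1.0, 2.0, 1.0],
              [0.0, 1.0, 2.0]])

subsets = [s for r in range(4) for s in combinations(range(3), r)]
total = sum(dpp_prob(L, s) for s in subsets)
print(total)                # sums to 1 up to floating-point error
print(dpp_prob(L, (0, 2)))  # items 0 and 2 have zero similarity
print(dpp_prob(L, (0, 1)))  # items 0 and 1 are similar: lower probability
```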
For a long time, associative classifiers were the state of the art for basket completion, until Determinantal Point Processes (DPPs) showed significant improvements. One can also add constraints (e.g. lower price, different category) to classic collaborative filtering (CF) solutions.
Multiple reasons make DPPs relevant for basket completion
• The number of parameters is quadratic in the number of items $p$, while the number of possible sets grows exponentially with $p$
• Entries of the matrix measure similarity between items
• DPPs enforce diversity in the sampled set
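$$\det L_{\{1,2\}} = \det \begin{pmatrix} L_{11} & L_{12} \\ L_{21} & L_{22} \end{pmatrix} = L_{11} L_{22} - L_{12}^2$$
where $L_{11}$ captures the popularity of item 1 and $L_{12}$ (equal to $L_{21}$, since $L$ is symmetric) captures the correlation between items 1 and 2.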
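As a small worked example of the diversity effect, take two items with equal popularity $L_{11} = L_{22} = 1$ (values chosen purely for illustration) and compare a weakly correlated pair with a strongly correlated one:

```latex
\det \begin{pmatrix} 1 & 0.1 \\ 0.1 & 1 \end{pmatrix} = 1 - 0.1^2 = 0.99,
\qquad
\det \begin{pmatrix} 1 & 0.9 \\ 0.9 & 1 \end{pmatrix} = 1 - 0.9^2 = 0.19 .
```

The more similar the two items, the smaller the determinant, and hence the less likely the pair is to be observed together under the DPP.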
Assuming a low-rank constraint on the kernel matrix allows fast training
[Gartrell et al., 2017]
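Let $p$ be the number of items in the catalog. We assume that the kernel matrix associated with the DPP is low-rank, of rank $K$.
Thus there exists a matrix $V \in \mathbb{R}^{p \times K}$ such that
$$L = V V^T$$
Learning
Let $\mathcal{B} = \{\mathcal{B}_1, \dots, \mathcal{B}_M\}$ be a collection of observed baskets, that is, subsets of items.
Maximizing the regularized log-likelihood by gradient ascent yields an estimate of the matrix $V$:
$$f(V) = \sum_{m=1}^{M} \log p(\mathcal{B}_m \mid V) - \frac{\alpha}{2} \sum_{i=1}^{p} \lambda_i \|V_i\|^2$$
$$\phantom{f(V)} = \sum_{m=1}^{M} \log \det(L_{[m]}) - M \log \det(L + I) - \frac{\alpha}{2} \sum_{i=1}^{p} \lambda_i \|V_i\|^2$$
where $L_{[m]}$ is the principal submatrix of $L$ indexed by the items in $\mathcal{B}_m$, $V_i$ is the row of $V$ associated with item $i$, and the weight $\lambda_i$ is inversely proportional to the popularity of item $i$.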
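The regularized objective can be sketched in NumPy as below. This is an illustration under stated assumptions, not the authors' implementation: the function name is hypothetical, $\lambda_i$ is taken to be exactly 1 / (number of baskets containing item $i$), and the normalizer uses the standard identity $\det(I_p + V V^T) = \det(I_K + V^T V)$ so that only a $K \times K$ determinant is needed.

```python
import numpy as np


def regularized_log_likelihood(V, baskets, alpha, lam):
    """f(V) for the low-rank kernel L = V V^T (sketch)."""
    K = V.shape[1]
    # log det(L + I_p) computed on a K x K matrix instead of p x p.
    _, log_norm = np.linalg.slogdet(np.eye(K) + V.T @ V)
    total = 0.0
    for basket in baskets:
        V_b = V[list(basket)]                 # rows of V for the basket items
        sign, logdet = np.linalg.slogdet(V_b @ V_b.T)  # L_[m] = V_b V_b^T
        if sign <= 0:                         # singular submatrix: p(basket) = 0
            return -np.inf
        total += logdet
    penalty = 0.5 * alpha * np.sum(lam * np.sum(V ** 2, axis=1))
    return total - len(baskets) * log_norm - penalty


# Toy setup, for illustration only.
rng = np.random.default_rng(0)
p, K, alpha = 6, 3, 0.1
V = rng.normal(size=(p, K))
baskets = [[0, 1], [2, 3, 4], [5, 0]]
counts = np.bincount(np.concatenate(baskets), minlength=p)
lam = 1.0 / counts                            # assumed form of the popularity weights
value = regularized_log_likelihood(V, baskets, alpha, lam)
print(value)
```

In practice this function would be handed to an automatic-differentiation framework or paired with a hand-derived gradient. The sketch only evaluates the objective.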
Pros
• Efficient learning
• Low memory ($p \times K$ coefficients to store)
• Fast prediction
• Scalable to large datasets
Cons
• Baskets larger than $K$ have probability 0 by construction
• Models the probability of buying a set of products together instead of the relevance of the additional item
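The first limitation follows from linear algebra: any principal submatrix of $L = V V^T$ has rank at most $K$, so its determinant vanishes as soon as the basket holds more than $K$ items. A short numerical check, with arbitrary toy dimensions:

```python
import numpy as np

rng = np.random.default_rng(0)
p, K = 6, 2
V = rng.normal(size=(p, K))
L = V @ V.T                      # low-rank kernel of rank at most K

basket = [0, 1, 2]               # 3 items, more than K = 2
L_A = L[np.ix_(basket, basket)]
rank = np.linalg.matrix_rank(L_A)
det = np.linalg.det(L_A)
print(rank)                      # at most K
print(det)                       # zero up to floating-point error
```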
We introduce a logistic tensorized extension to low-rank DPP
Goal: directly model the relevance of buying an additional product instead of the global coherence of the set
1. Each target item, denoted $\tau$, is modeled by its own kernel $L_\tau \in \mathbb{R}^{p \times p}$
2. Item bias is captured in a separate diagonal matrix
3. All those kernels form a cubic tensor $L \in \mathbb{R}^{p \times p \times p}$, which is assumed to be low rank
4. Conversion probability is obtained by applying a logistic-like function
$$L_\tau = V R_\tau^2 V^T + D^2$$
where $L_\tau$ is the DPP kernel of target $\tau$, $V$ holds the latent factors of the basket items (common to all tasks), $R_\tau$ holds the latent factors of target $\tau$, and $D$ is the basket items bias. The squares ensure that $L_\tau$ is a valid kernel.
$$p(y_\tau \mid \mathcal{B}) = \phi(\mathcal{B})^{y_\tau} \left(1 - \phi(\mathcal{B})\right)^{1 - y_\tau}$$
$$\phi(\mathcal{B}) = 1 - e^{-w \det(L_{\mathcal{B}})} = \sigma(w \det(L_{\mathcal{B}}))$$
where $w$ is a scaling parameter, $\sigma(x) = 1 - e^{-x}$ is the logistic-like function, and $L_{\mathcal{B}}$ is the principal submatrix of the target kernel indexed by the items in $\mathcal{B}$.
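A minimal NumPy sketch of the target kernel and of the logistic-like link is given below. It assumes, beyond what the slides state, that $R_\tau$ is a $K \times K$ diagonal matrix stored as a vector `r_tau`, that $D$ is stored as a vector `d` of length $p$, and that $w > 0$. All function and variable names are hypothetical.

```python
import numpy as np


def target_kernel(V, r_tau, d):
    """L_tau = V R_tau^2 V^T + D^2, with R_tau = diag(r_tau) and D = diag(d)."""
    return (V * r_tau ** 2) @ V.T + np.diag(d ** 2)


def phi(L_tau, basket, w):
    """phi(B) = 1 - exp(-w * det(L_B)), using the target kernel L_tau."""
    idx = list(basket)
    return 1.0 - np.exp(-w * np.linalg.det(L_tau[np.ix_(idx, idx)]))


def likelihood(y, L_tau, basket, w):
    """Bernoulli likelihood p(y_tau | B) = phi^y * (1 - phi)^(1 - y)."""
    q = phi(L_tau, basket, w)
    return q ** y * (1.0 - q) ** (1 - y)


# Toy parameters, for illustration only.
rng = np.random.default_rng(0)
p, K, w = 5, 3, 0.5
V = rng.normal(size=(p, K))
r_tau = rng.normal(size=K)
d = rng.normal(size=p)

L_tau = target_kernel(V, r_tau, d)
basket = [0, 2]
print(phi(L_tau, basket, w))
print(likelihood(1, L_tau, basket, w) + likelihood(0, L_tau, basket, w))
```

Because both terms of the kernel are squared, `L_tau` is positive semi-definite whatever the sign of the learned parameters, so the determinant is non-negative and `phi` stays in [0, 1).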
fifty-five confidential and proprietary 8
We validated our approach on four real-world datasets
Unordered baskets
• Amazon Baby Registries
  • Diaper category: 100 products, 10k baskets, 2.4 products/basket
  • Diaper+Apparel+Feedings: 3 disjoint categories, 300 products, 17k baskets, 2.6 products/basket
• Belgian Retail Supermarket
  • 16,470 products, 88k baskets, 9.6 products/basket
• UK Retail
  • 4,071 products, 22k baskets, 18.5 products/basket
  • Some baskets contain more than 100 products
Ordered baskets
• Instacart
  • Online grocery shopping dataset
  • More than 200k users, 50k products, 3M baskets split over three datasets: train, test, prior
  • Filter out the test and prior datasets, baskets with fewer than 2 products, and products that appear fewer than 15 times
  • Result: 10,531 products, 700k baskets
We adopt different testing protocols according to the type of baskets
Unordered baskets
• Training set: 70% of baskets
• Test set: remove one item at random, apply the model to the remaining items, and compute performance on the removed item.
Ordered baskets: three protocols
1. Remove one item at random. For tensorized DPP, the removed item is the target and is removed at random in both the training and test sets.
2. Remove the last added item. For tensorized DPP, the removed item is the target and is removed in both the training and test sets.
3. Only tensorized DPP. Training set: target chosen at random. Test set: target is the last added item.
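The hold-out steps of these protocols can be sketched with the standard library alone. The helper names, the fixed seed and the shuffling before the 70% split are assumptions made for illustration. Baskets are represented as lists so that "last added item" is meaningful for ordered data.

```python
import random


def split_baskets(baskets, train_frac=0.7, seed=0):
    """Shuffle and split baskets into (train, test)."""
    rng = random.Random(seed)
    shuffled = list(baskets)
    rng.shuffle(shuffled)
    cut = int(train_frac * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def hold_out_random(basket, rng):
    """Remove one item at random; return (remaining items, removed item)."""
    items = list(basket)
    i = rng.randrange(len(items))
    return items[:i] + items[i + 1:], items[i]


def hold_out_last(basket):
    """Remove the last added item; return (remaining items, removed item)."""
    items = list(basket)
    return items[:-1], items[-1]


# Toy baskets, for illustration only.
baskets = [[i, i + 1, i + 2] for i in range(10)]
train, test = split_baskets(baskets)
rng = random.Random(1)
remaining, target = hold_out_random(test[0], rng)
print(len(train), len(test))
print(remaining, target)
print(hold_out_last([3, 7, 5]))
```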
And compared it with several baselines
• Our models
• Logistic DPP
• Multi Task DPP without bias (𝐷 ≡ 0)
• Multi Task DPP
• Baselines
• Poisson Factorization (PF) [Gopalan et al., 2013]: a probabilistic matrix factorization model generally used for recommendation applications with implicit feedback. Here, one basket = one user.
• Recurrent Neural Network (RNN) [Hidasi et al., 2016]: adapted for session-based recommendations.
• Factorization Machines (FM) [Rendle, 2010]: a general approach that models $d$th-order interactions using low-rank assumptions. Usually $d = 2$. Here, one basket = one user.
• Low-Rank DPP [Gartrell et al., 2017].
• Bayesian Low-Rank DPP [Gartrell et al., 2016]: Bayesian learning of the low-rank DPP model.
• Associative Classifier
Model performance is evaluated according to Mean Percentile Rank (MPR) and Precision@k
For each test basket, all items in the catalog that are not in the basket are sorted from the most likely to the least likely.
• Mean Percentile Rank: the percentile rank of the left-out item, averaged over the whole test set. The higher the better.
• Precision@k: 1 if the left-out item is in the top $k$, 0 otherwise, averaged over the whole test set. The higher the better.
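Both metrics can be sketched in plain Python as follows. The percentile rank is assumed here to be the share of candidate items scored no higher than the left-out item, expressed in percent, and Precision@k is also reported in percent. The exact tie-handling used in the experiments is not given in the slides, and the function names are hypothetical.

```python
def percentile_rank(scores, held_out):
    """Percent of candidates scored no higher than the held-out item.

    `scores` maps each candidate item (catalog minus basket) to its predicted score.
    """
    s = scores[held_out]
    return 100.0 * sum(v <= s for v in scores.values()) / len(scores)


def hit_at_k(scores, held_out, k):
    """1 if the held-out item is among the k highest-scored candidates, else 0."""
    top_k = sorted(scores, key=scores.get, reverse=True)[:k]
    return 1.0 if held_out in top_k else 0.0


def evaluate(test_cases, k):
    """Return (MPR, Precision@k in percent) over (scores, held_out) pairs."""
    n = len(test_cases)
    mpr = sum(percentile_rank(s, h) for s, h in test_cases) / n
    precision = 100.0 * sum(hit_at_k(s, h, k) for s, h in test_cases) / n
    return mpr, precision


# Toy scores, for illustration only.
scores = {"a": 0.5, "b": 0.3, "c": 0.1, "d": 0.05}
print(percentile_rank(scores, "b"))
print(hit_at_k(scores, "b", 1), hit_at_k(scores, "b", 2))
print(evaluate([(scores, "b"), (scores, "a")], k=1))
```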
Unordered baskets | Performance results on the Amazon Diaper dataset

model                    r    MPR     Precision@5   Precision@10   Precision@20
Associative Classifier   -    -       4.16          4.16           4.16
Poisson Factorization    40   50.3    4.78          10.03          19.9
Factorization Machines   60   67.92   24.01         32.62          46.25
Low Rank DPP             30   71.65   25.48         35.80          49.98
Bayesian Low Rank DPP    30   72.38   26.31         36.21          51.51
Logistic DPP             50   71.08   23.7          34.01          48.44
Multi Task DPP no bias   50   77.5    32.7          45.77          61.0
Multi Task DPP           50   78.41   34.73         47.42          62.58

Relative improvement             MPR     Precision@5   Precision@10   Precision@20
Multi Task DPP vs Low Rank DPP   9.43%   36.28%        32.47%         25.2%
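The comparison row appears to be the relative gain of Multi Task DPP over Low Rank DPP, in percent. The short check below recomputes it from the rounded table entries. The helper name is hypothetical, and small differences in the last digit are expected because the reported figures were presumably computed from unrounded scores.

```python
def relative_gain(new, base):
    """Relative improvement of `new` over `base`, in percent."""
    return 100.0 * (new - base) / base


# Values copied from the Amazon Diaper table.
low_rank = {"MPR": 71.65, "P@5": 25.48, "P@10": 35.80, "P@20": 49.98}
multi_task = {"MPR": 78.41, "P@5": 34.73, "P@10": 47.42, "P@20": 62.58}

gains = {m: relative_gain(multi_task[m], low_rank[m]) for m in low_rank}
for metric, gain in gains.items():
    print(f"{metric}: {gain:.2f}%")
```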
Unordered baskets | Performance results on the Amazon Diaper + Apparel + Feedings dataset

model                    r    MPR     Precision@5   Precision@10   Precision@20
Associative Classifier   -    -       16.66         16.66          16.66
Poisson Factorization    40   51.36   4.16          5.88           9.08
Factorization Machines   5    65.21   10.62         16.71          24.2
Low Rank DPP             30   70.10   13.10         18.59          26.92
Bayesian Low Rank DPP    30   70.55   13.59         19.51          27.83
Logistic DPP             60   69.61   12.65         19.8           27.86
Multi Task DPP no bias   60   88.77   18.33         28.0           43.57
Multi Task DPP           60   89.80   20.53         30.86          45.79

Relative improvement             MPR      Precision@5   Precision@10   Precision@20
Multi Task DPP vs Low Rank DPP   28.11%   56.71%        66.01%         70.11%
Ordered baskets | Performance results on Instacart

model                    Protocol   MPR     Precision@5   Precision@10   Precision@20
Factorization Machines   (1)        61.10   4.55          6.3            7.67
Low Rank DPP             (1)        76.46   7.37          8.07           9.23
Multi Task DPP           (1)        80.46   4.62          7.23           10.51
Factorization Machines   (2)        62.47   9.35          10.66          11.92
Low Rank DPP             (2)        61.16   7.49          8.05           8.8
RNN                      (2)        73.31   1.08          1.99           3.2
Multi Task DPP           (2)        90.07   9.91          13.67          19.97
Multi Task DPP           (3)        80.65   5.23          6.05           9.72

$r = 80$ except for FM, for which $r = 5$
Protocols
(1) Remove one item at random. For multi-task DPP, the removed item is the target and is removed at random in both the training and test sets.
(2) Remove the last added item. For multi-task DPP, the removed item is the target and is removed in both the training and test sets.
(3) Only multi-task DPP. Training set: target chosen at random. Test set: target is the last added item.
Contributions summary of Multi Task Logistic DPP
• Extension of low-rank DPP to effectively model classification problems on discrete data
• Showed effectiveness on the basket completion task
• The model can scale to large catalogs thanks to the low-rank tensor formulation
• Training can be parallelized using mini-batch gradient descent
Paris • London • Hong Kong • New York • Shanghai
Thank you for your attention
Do you have any questions?
www.fifty-five.com | romain@fifty-five.com
Unordered baskets | Performance results on the Belgian Retail dataset

model                    r    MPR     Precision@5   Precision@10   Precision@20
Associative Classifier   -    -       X             X              X
Poisson Factorization    40   87.02   21.46         23.06          23.90
Factorization Machines   10   65.08   20.85         21.10          21.37
Low Rank DPP             76   88.52   21.48         23.29          25.19
Bayesian Low Rank DPP    76   89.08   21.43         23.10          25.12
Logistic DPP             76   87.35   21.17         23.11          25.77
Multi Task DPP no bias   76   87.42   21.02         23.35          25.13
Multi Task DPP           76   87.72   21.46         23.37          25.57

Relative improvement             MPR     Precision@5   Precision@10   Precision@20
Multi Task DPP vs Low Rank DPP   -0.9%   -0.1%         0.34%          1.52%
Unordered baskets | Performance results on the UK Retail dataset

model                    r     MPR     Precision@5   Precision@10   Precision@20
Associative Classifier   -     -       X             X              X
Poisson Factorization    100   73.12   1.77          2.31           3.01
Factorization Machines   5     56.91   0.47          0.83           1.5
Low Rank DPP             100   82.74   3.07          4.75           7.6
Bayesian Low Rank DPP    100   61.31   1.07          1.91           3.25
Logistic DPP             100   75.23   3.18          4.99           7.83
Multi Task DPP no bias   100   77.67   3.82          5.98           9.11
Multi Task DPP           100   78.25   4.0           6.2            9.4

Relative improvement             MPR      Precision@5   Precision@10   Precision@20
Multi Task DPP vs Low Rank DPP   -5.43%   30.29%        30.53%         23.68%

GFP in rDNA Technology (Biotechnology).pptx
 
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |
 
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
 
Work, Energy and Power for class 10 ICSE Physics
Work, Energy and Power for class 10 ICSE PhysicsWork, Energy and Power for class 10 ICSE Physics
Work, Energy and Power for class 10 ICSE Physics
 
Animal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptxAnimal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptx
 
Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )
 

Multi Task DPP for Basket Completion by Romain WARLOP, Fifty Five

  • 1. Tensorized DPP for Basket Completion. September 27th, 2018. Romain WARLOP, with Jérémie Mary (Criteo) and Mike Gartrell (Criteo)
  • 2. The objective of basket completion is to suggest to a user one or several items according to the items already in her cart.
      Associative classifier: let $A, B$ be sets of items.
      Definition (confidence rule): $\mathrm{conf}(A \to B) = \frac{\mathrm{supp}(A \cup B)}{\mathrm{supp}(A)}$
      Definition (lift rule): $\mathrm{lift}(A \to B) = \frac{\mathrm{supp}(A \cup B)}{\mathrm{supp}(A)\,\mathrm{supp}(B)}$
      Past baskets are analyzed to compute all possible confidences and lifts. A minimum support threshold, a confidence threshold and a lift threshold are then defined, and all rules $A \to B$ that satisfy the three conditions are selected for recommendation. Drawbacks: very heavy computation; not scalable to large catalogs.
      DPP: let $L$ be the kernel matrix associated with the DPP, an (item catalog) × (item catalog) matrix containing item-item similarities. $L$ defines a discrete DPP such that the probability of observing the set $A$ is proportional to $\det(L_A)$, with $L_A$ the principal submatrix of $L$ indexed by the items in $A$: $p(A) = \frac{\det(L_A)}{\det(L + I)}$. DPPs are suitable for modeling co-purchase probability.
      For a long time, associative classifiers were the state of the art for basket completion, until Determinantal Point Processes (DPPs) showed significant improvement. One can also add constraints (e.g. lower price, different category) to classic CF solutions.
  • 3. Multiple reasons make DPPs relevant for basket completion: (i) the number of parameters is quadratic in the number of items $p$, while the number of sets grows exponentially with $p$; (ii) the entries of the matrix measure similarity between items; (iii) diversity is enforced in the sampled set. For two items, $\det(L_{\{1,2\}}) = \det\begin{pmatrix} L_{11} & L_{12} \\ L_{21} & L_{22} \end{pmatrix} = L_{11} L_{22} - L_{12}^2$, where $L_{11}$ is the popularity of item 1 and $L_{12}$ is the correlation between items 1 and 2.
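  • 4. Assuming a low-rank constraint on the kernel matrix allows fast training [Gartrell et al., 2017]. Let $p$ be the number of items in the catalog. We assume that the kernel matrix associated with the DPP is low-rank, of rank $K$. Thus there exists a matrix $V \in \mathbb{R}^{p \times K}$ such that $L = V V^T$.
      Learning: let $\mathcal{B} = \{\mathcal{B}_1, \dots, \mathcal{B}_M\}$ be a collection of observed baskets, that is, subsets of items. Maximizing the regularized log-likelihood by gradient ascent yields an estimate of the matrix $V$:
      $f(V) = \sum_{m=1}^{M} \log p(\mathcal{B}_m \mid V) - \frac{\alpha}{2} \sum_{i=1}^{p} \lambda_i \|V_i\|^2 = \sum_{m=1}^{M} \log\det(L_{[m]}) - M \log\det(L + I) - \frac{\alpha}{2} \sum_{i=1}^{p} \lambda_i \|V_i\|^2$
      where $L_{[m]}$ is the principal submatrix of $L$ indexed by the items of basket $\mathcal{B}_m$ and $\lambda_i$ is inversely proportional to the popularity of item $i$.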
  • 5. Pros: efficient learning; low memory ($p \times K$ coefficients to store); fast prediction; scalable to large datasets. Cons: baskets larger than $K$ have probability 0 by construction; the model captures the probability of buying a set of products together instead of the relevance of the additional item.
  • 6. We introduce a logistic tensorized extension to low-rank DPP. Goal: directly model the relevance of buying an additional product instead of the global coherence of the set. (1) Each target item, denoted $\tau$, is modeled by its own kernel $L_\tau \in \mathbb{R}^{p \times p}$. (2) Item bias is captured in a separate diagonal matrix. (3) All those kernels form a cubic tensor $L \in \mathbb{R}^{p \times p \times p}$, which is assumed to be low rank. (4) The conversion probability is obtained by applying a logistic-like function.
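  • 7. We introduce a logistic tensorized extension to low-rank DPP. Goal: directly model the relevance of buying an additional product instead of the global coherence of the set.
      $L_\tau = V R_\tau^2 V^T + D^2$, where $L_\tau$ is the DPP kernel of target $\tau$, $V$ holds the latent factors of the basket items (common to all tasks), $R_\tau$ holds the latent factors of target $\tau$, and $D$ is the bias of the basket items. This form ensures that $L_\tau$ is a valid kernel.
      $p(y_\tau \mid \mathcal{B}) = \phi(\mathcal{B})^{y_\tau} \, (1 - \phi(\mathcal{B}))^{1 - y_\tau}$, with $\phi(\mathcal{B}) = 1 - e^{-w \det(L_{\mathcal{B}})} = \sigma(w \det(L_{\mathcal{B}}))$, where $w$ is a scaling parameter.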
  • 8. fifty-five confidential and proprietary. We validated our approach on four real-world datasets.
      Unordered baskets:
      - Amazon Baby Registries. Diaper category: 100 products, 10k baskets, 2.4 products/basket. Diaper+Apparel+Feedings: 3 disjoint categories, 300 products, 17k baskets, 2.6 products/basket.
      - Belgian Retail Supermarket: 16,470 products, 88k baskets, 9.6 products/basket.
      - UK Retail: 4,071 products, 22k baskets, 18.5 products/basket. Some baskets contain more than 100 products.
      Ordered baskets:
      - Instacart: online grocery shopping dataset with more than 200k users, 50k products and 3M baskets split over three datasets (train, test, prior). We filter out the test and prior datasets, baskets with fewer than 2 products, and products that appear fewer than 15 times. Result: 10,531 products, 700k baskets.
  • 9. We adopt different testing protocols according to the type of baskets.
      Unordered baskets: one item is removed at random. Training set: 70% of baskets. Test set: remove one item at random, apply the model to the remaining items, and compute performance on the removed item.
      Ordered baskets: three protocols.
      (1) Remove one item at random. For tensorized DPP, the removed item is the target and is removed at random in both the training and test sets.
      (2) Remove the last added item. For tensorized DPP, the removed item is the target and is removed in both the training and test sets.
      (3) Tensorized DPP only. Training set: target chosen at random. Test set: target is the last added item.
  • 10. We compared it with several baselines.
      Our models:
      - Logistic DPP
      - Multi Task DPP without bias ($D \equiv 0$)
      - Multi Task DPP
      Baselines:
      - Poisson Factorization (PF) [Gopalan et al., 2013]: a probabilistic matrix factorization model generally used for recommendation applications with implicit feedback. One basket = one user.
      - Recurrent Neural Network (RNN) [Hidasi et al., 2016]: adapted for session-based recommendations.
      - Factorization Machines (FM) [Rendle, 2010]: a general approach that models $d$th-order interactions using low-rank assumptions. Usually $d = 2$. One basket = one user.
      - Low-Rank DPP [Gartrell et al., 2017].
      - Bayesian Low-Rank DPP [Gartrell et al., 2016]: Bayesian learning of the low-rank DPP model.
      - Associative Classifier
  • 11. Model performance is evaluated according to Mean Percentile Rank (MPR) and Precision@k. For each test basket, all items in the catalog that are not in the basket are sorted from the most likely to the least likely. Mean Percentile Rank: the percentile rank of the left-out item, averaged over the whole test set; the higher the better. Precision@k: 1 if the left-out item is in the top $k$, 0 otherwise, averaged over the whole test set; the higher the better.
  • 12. Unordered baskets | Performance results on the Amazon Diaper dataset
      model                   r    MPR    Precision@5  Precision@10  Precision@20
      Associative Classifier  -    -      4.16         4.16          4.16
      Poisson Factorization   40   50.3   4.78         10.03         19.9
      Factorization Machines  60   67.92  24.01        32.62         46.25
      Low Rank DPP            30   71.65  25.48        35.80         49.98
      Bayesian Low Rank DPP   30   72.38  26.31        36.21         51.51
      Logistic DPP            50   71.08  23.7         34.01         48.44
      Multi Task DPP no bias  50   77.5   32.7         45.77         61.0
      Multi Task DPP          50   78.41  34.73        47.42         62.58
      Multi Task DPP vs Low Rank DPP: MPR 9.43%, Precision@5 36.28%, Precision@10 32.47%, Precision@20 25.2%
  • 13. Unordered baskets | Performance results on the Amazon Diaper + Apparel + Feedings dataset
      model                   r    MPR    Precision@5  Precision@10  Precision@20
      Associative Classifier  -    -      16.66        16.66         16.66
      Poisson Factorization   40   51.36  4.16         5.88          9.08
      Factorization Machines  5    65.21  10.62        16.71         24.2
      Low Rank DPP            30   70.10  13.10        18.59         26.92
      Bayesian Low Rank DPP   30   70.55  13.59        19.51         27.83
      Logistic DPP            60   69.61  12.65        19.8          27.86
      Multi Task DPP no bias  60   88.77  18.33        28.0          43.57
      Multi Task DPP          60   89.80  20.53        30.86         45.79
      Multi Task DPP vs Low Rank DPP: MPR 28.11%, Precision@5 56.71%, Precision@10 66.01%, Precision@20 70.11%
  • 14. Ordered baskets | Performance results on Instacart
      model                   Protocol  MPR    Precision@5  Precision@10  Precision@20
      Factorization Machines  (1)       61.10  4.55         6.3           7.67
      Low Rank DPP            (1)       76.46  7.37         8.07          9.23
      Multi Task DPP          (1)       80.46  4.62         7.23          10.51
      Factorization Machines  (2)       62.47  9.35         10.66         11.92
      Low Rank DPP            (2)       61.16  7.49         8.05          8.8
      RNN                     (2)       73.31  1.08         1.99          3.2
      Multi Task DPP          (2)       90.07  9.91         13.67         19.97
      Multi Task DPP          (3)       80.65  5.23         6.05          9.72
      $r = 80$ except for FM, for which $r = 5$.
      (1) Remove one item at random. For multi-task DPP, the removed item is the target and is removed at random in both the training and test sets.
      (2) Remove the last added item. For multi-task DPP, the removed item is the target and is removed in both the training and test sets.
      (3) Multi-task DPP only. Training set: target chosen at random. Test set: target is the last added item.
  • 15. Summary of the contributions of Multi Task Logistic DPP: an extension of low-rank DPP that effectively models classification problems on discrete data; demonstrated effectiveness on the basket completion task; the model can scale to large catalogs thanks to the low-rank tensor formulation; training can be parallelized using mini-batch gradient descent.
  • 16. Paris • London • Hong Kong • New York • Shanghai. Thank you for your attention. Do you have any questions? www.fifty-five.com | romain@fifty-five.com
  • 17. Unordered baskets | Performance results on the Belgian Retail dataset
      model                   r    MPR    Precision@5  Precision@10  Precision@20
      Associative Classifier  -    -      X            X             X
      Poisson Factorization   40   87.02  21.46        23.06         23.90
      Factorization Machines  10   65.08  20.85        21.10         21.37
      Low Rank DPP            76   88.52  21.48        23.29         25.19
      Bayesian Low Rank DPP   76   89.08  21.43        23.10         25.12
      Logistic DPP            76   87.35  21.17        23.11         25.77
      Multi Task DPP no bias  76   87.42  21.02        23.35         25.13
      Multi Task DPP          76   87.72  21.46        23.37         25.57
      Multi Task DPP vs Low Rank DPP: MPR -0.9%, Precision@5 -0.1%, Precision@10 0.34%, Precision@20 1.52%
  • 18. Unordered baskets | Performance results on the UK Retail dataset
      model                   r    MPR    Precision@5  Precision@10  Precision@20
      Associative Classifier  -    -      X            X             X
      Poisson Factorization   100  73.12  1.77         2.31          3.01
      Factorization Machines  5    56.91  0.47         0.83          1.5
      Low Rank DPP            100  82.74  3.07         4.75          7.6
      Bayesian Low Rank DPP   100  61.31  1.07         1.91          3.25
      Logistic DPP            100  75.23  3.18         4.99          7.83
      Multi Task DPP no bias  100  77.67  3.82         5.98          9.11
      Multi Task DPP          100  78.25  4.0          6.2           9.4
      Multi Task DPP vs Low Rank DPP: MPR -5.43%, Precision@5 30.29%, Precision@10 30.53%, Precision@20 23.68%
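
The confidence and lift rules of slide 2 can be made concrete with a few lines of Python. This is only a sketch: the slides do not define supp, so it is assumed here to be the fraction of past baskets containing the itemset, and the toy baskets and item names are invented for illustration.

```python
from typing import FrozenSet, Iterable, List


def support(itemset: Iterable[str], baskets: List[FrozenSet[str]]) -> float:
    """Assumed definition: fraction of baskets containing every item of `itemset`."""
    items = frozenset(itemset)
    return sum(items <= basket for basket in baskets) / len(baskets)


def confidence(a, b, baskets) -> float:
    """conf(A -> B) = supp(A ∪ B) / supp(A)."""
    return support(set(a) | set(b), baskets) / support(a, baskets)


def lift(a, b, baskets) -> float:
    """lift(A -> B) = supp(A ∪ B) / (supp(A) * supp(B))."""
    return support(set(a) | set(b), baskets) / (
        support(a, baskets) * support(b, baskets)
    )


# Hypothetical past baskets.
baskets = [
    frozenset({"diapers", "wipes"}),
    frozenset({"diapers", "wipes", "bottle"}),
    frozenset({"diapers", "bottle"}),
    frozenset({"wipes"}),
]

conf = confidence({"diapers"}, {"wipes"}, baskets)  # (2/4) / (3/4)
lft = lift({"diapers"}, {"wipes"}, baskets)         # (2/4) / ((3/4) * (3/4))
print(f"conf = {conf:.3f}, lift = {lft:.3f}")
```

Enumerating every candidate pair $(A, B)$ this way is what makes the approach heavy on large catalogs, since each rule needs its own pass over the baskets.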
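
Slides 2 to 5 describe the low-rank DPP: $p(A) = \det(L_A) / \det(L + I)$ with $L = V V^T$, trained by maximizing a regularized log-likelihood. The sketch below evaluates these quantities on a hypothetical catalog of $p = 3$ items with rank $K = 2$. The matrix `V`, the helper names and the naive determinant routine are assumptions made for illustration; it is not the authors' implementation, and the regularizer takes $\|V_i\|^2$ to be the squared Euclidean norm of row $i$.

```python
import math
from itertools import combinations


def det(m):
    """Determinant by Laplace expansion along the first row (tiny matrices only)."""
    if not m:
        return 1.0  # empty matrix, i.e. the empty basket
    if len(m) == 1:
        return m[0][0]
    return sum(
        (-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
        for j in range(len(m))
    )


def kernel(V):
    """Low-rank kernel L = V V^T."""
    return [[sum(a * b for a, b in zip(vi, vj)) for vj in V] for vi in V]


def submatrix(L, items):
    """Principal submatrix of L indexed by `items`."""
    return [[L[i][j] for j in items] for i in items]


def normalizer(L):
    """det(L + I), the sum of det(L_A) over all subsets A."""
    n = len(L)
    return det([[L[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)])


def dpp_prob(L, items):
    return det(submatrix(L, items)) / normalizer(L)


def objective(V, baskets, alpha, lam):
    """Regularized log-likelihood f(V) from slide 4."""
    L = kernel(V)
    log_lik = sum(math.log(det(submatrix(L, b))) for b in baskets)
    log_lik -= len(baskets) * math.log(normalizer(L))
    penalty = 0.5 * alpha * sum(l * sum(x * x for x in row) for l, row in zip(lam, V))
    return log_lik - penalty


# Hypothetical factors: items 0 and 1 are near-duplicates, item 2 is different.
V = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
L = kernel(V)

p_similar = dpp_prob(L, [0, 1])   # small: similar items repel each other
p_diverse = dpp_prob(L, [0, 2])   # larger: diverse pair
p_too_big = dpp_prob(L, [0, 1, 2])  # basket larger than K: zero up to rounding
total = sum(dpp_prob(L, list(s)) for r in range(4) for s in combinations(range(3), r))
print(p_similar, p_diverse, p_too_big, total)
```

The last two quantities illustrate two points made on the slides: the probabilities of all subsets sum to one thanks to the $\det(L + I)$ normalizer, and any basket with more than $K$ items gets probability 0 by construction.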
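
The per-target kernel of slide 7, $L_\tau = V R_\tau^2 V^T + D^2$, and the logistic-like link $\phi(\mathcal{B}) = 1 - e^{-w \det(L_{\mathcal{B}})}$ can be sketched in the same style. Several simplifications are assumed here and are not taken from the slides: $R_\tau$ and $D$ are treated as diagonal matrices stored as vectors, $L_{\mathcal{B}}$ is read as the principal submatrix of $L_\tau$ indexed by the basket, and all numeric values are invented.

```python
import math


def det(m):
    """Determinant by Laplace expansion along the first row (tiny matrices only)."""
    if not m:
        return 1.0
    if len(m) == 1:
        return m[0][0]
    return sum(
        (-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
        for j in range(len(m))
    )


def target_kernel(V, r_tau, d):
    """L_tau = V diag(r_tau)^2 V^T + diag(d)^2 (diagonal R_tau and D are assumptions)."""
    p, K = len(V), len(r_tau)
    return [
        [
            sum(V[i][k] * r_tau[k] ** 2 * V[j][k] for k in range(K))
            + (d[i] ** 2 if i == j else 0.0)
            for j in range(p)
        ]
        for i in range(p)
    ]


def phi(L_tau, basket, w):
    """phi(B) = 1 - exp(-w * det(L_B)), a value in [0, 1)."""
    sub = [[L_tau[i][j] for j in basket] for i in basket]
    return 1.0 - math.exp(-w * det(sub))


def likelihood(y, phi_b):
    """Bernoulli likelihood p(y | B) = phi^y * (1 - phi)^(1 - y)."""
    return phi_b ** y * (1.0 - phi_b) ** (1 - y)


# Hypothetical parameters: p = 3 items, K = 2 latent dimensions.
V = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
r_tau = [1.0, 0.5]     # latent factors of one target item
d = [0.1, 0.1, 0.1]    # item biases
w = 1.0                # scaling parameter

L_tau = target_kernel(V, r_tau, d)
score = phi(L_tau, [0, 2], w)
det_full_with_bias = det(L_tau)
det_full_no_bias = det(target_kernel(V, r_tau, [0.0, 0.0, 0.0]))
print(score, det_full_with_bias, det_full_no_bias)
```

In this toy setting the three-item determinant is strictly positive once the bias is nonzero, whereas with $D \equiv 0$ it vanishes because the kernel has rank at most $K = 2$.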
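
The two metrics of slide 11 can be computed as follows. The slides only say that candidate items are sorted from most to least likely, so the exact percentile-rank formula used here (share of candidates scored no higher than the left-out item) is an assumption, and the scores are made-up values for two test baskets.

```python
from typing import Dict, List, Tuple

# One test case = (scores of all catalog items not in the basket, left-out item).
TestCase = Tuple[Dict[str, float], str]


def percentile_rank(scores: Dict[str, float], held_out: str) -> float:
    """Assumed formula: 100 * share of candidates scored <= the left-out item."""
    target = scores[held_out]
    return 100.0 * sum(target >= s for s in scores.values()) / len(scores)


def hit_at_k(scores: Dict[str, float], held_out: str, k: int) -> float:
    """1 if the left-out item is among the k best-scored candidates, else 0."""
    top_k = sorted(scores, key=scores.get, reverse=True)[:k]
    return 1.0 if held_out in top_k else 0.0


def mean_percentile_rank(cases: List[TestCase]) -> float:
    return sum(percentile_rank(s, h) for s, h in cases) / len(cases)


def precision_at_k(cases: List[TestCase], k: int) -> float:
    """Average hit rate over the test set, in percent."""
    return 100.0 * sum(hit_at_k(s, h, k) for s, h in cases) / len(cases)


# Hypothetical model scores.
cases = [
    ({"a": 0.5, "b": 0.3, "c": 0.15, "d": 0.05}, "a"),  # left-out item ranked first
    ({"a": 0.4, "b": 0.3, "c": 0.2, "d": 0.1}, "c"),    # left-out item ranked third
]

mpr = mean_percentile_rank(cases)
p_at_1 = precision_at_k(cases, 1)
p_at_3 = precision_at_k(cases, 3)
print(mpr, p_at_1, p_at_3)
```

With this convention a model that scores items at random would land near an MPR of 50, which is roughly where Poisson Factorization sits on the two Amazon datasets.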
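
The "Multi Task DPP vs Low Rank DPP" rows of the result tables are consistent with a relative gain, (new - base) / base, although the slides do not state the formula. A quick check on the MPR column, using values copied from the Amazon Diaper and UK Retail tables:

```python
def relative_gain(new: float, base: float) -> float:
    """Relative change of `new` with respect to `base`, in percent."""
    return 100.0 * (new - base) / base


# MPR of Multi Task DPP (new) vs Low Rank DPP (base), from the tables above.
diaper_gain = relative_gain(78.41, 71.65)  # Amazon Diaper
uk_gain = relative_gain(78.25, 82.74)      # UK Retail
print(f"{diaper_gain:.2f}% {uk_gain:.2f}%")
```

This reproduces the reported 9.43% and -5.43%. Some Precision@k gains differ from this formula by a few hundredths of a point, presumably because the published scores are rounded.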