Olivier Koch, Criteo
RecSys London Meetup - Nov 8th, 2018
Large-scale recommendation for new users
2 •
Joint work with Ivan Lobov, Mohamed Amine
Benhalloum, Dmitry Parfenchik, Alexandre Gillotte, Alois
Bissuel, Vincent Grosbois, Sergei Lebedev, Flavian Vasile
3 •
1. Context
2. Large-scale matrix factorization with randomized SVD
3. Offline evaluation methods
4. What's next?
Outline
4 •
Buy ad space on publishers’ websites.
Build banners showing products that users will like / want to buy.
Get paid if users click / buy the product.
What / Who is Criteo again?
5 •
What / Who is Criteo again?
3 billion ads/day
5 billion products
100 ms
6 •
Retargeting
~ a few hours
7 •
Acquisition
?
~ a few days/weeks
8 •
2B users
20K partners
~1M products/partner
Hundreds of possible campaigns per user
In 50 ms!
At scale
9 •
The Acquisition pipeline
Campaign selection
Product selection
(Recommendation)
Bidding
11 •
The Acquisition pipeline
Campaign selection
Product selection
(Recommendation)
Bidding
The Recommendation problem
12 •
Instead of letting a different model do the
bidding/campaign selection, how about we do
recommendation for all user-partner pairs?
200B recommendations anyone?
Large-scale MF
with R-SVD
14 •
Singular value decomposition
$A_{m \times n} = U_{m \times m} \, S_{m \times n} \, V^{T}_{n \times n}$
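As a small illustration of the factor shapes above, here is a minimal NumPy sketch on a toy matrix. The sizes (6 x 4) and the random input are assumptions made for the example, not figures from the talk.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 6, 4                      # toy sizes, chosen for illustration only
A = rng.standard_normal((m, n))

# full_matrices=True returns the square factors U (m x m) and V^T (n x n).
U, s, Vt = np.linalg.svd(A, full_matrices=True)

# Build the m x n matrix S with the singular values on its diagonal.
S = np.zeros((m, n))
S[:len(s), :len(s)] = np.diag(s)

print(U.shape, S.shape, Vt.shape)    # (6, 6) (6, 4) (4, 4)
print(np.allclose(A, U @ S @ Vt))    # True
```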
15 •
The catch
m = n = hundreds of millions of items
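A back-of-the-envelope calculation shows why this is a problem. The numbers below are assumptions for illustration: 100 million items (the low end of "hundreds of millions"), float32 storage, and a hypothetical embedding size of 100.

```python
n_items = 100_000_000        # assumed: low end of "hundreds of millions"
bytes_per_float = 4          # float32
k = 100                      # hypothetical embedding dimension

# A dense m x m factor such as U, with m = n_items.
dense_bytes = n_items * n_items * bytes_per_float
# A tall-and-thin m x k factor, as kept by a rank-k factorization.
factor_bytes = n_items * k * bytes_per_float

print(dense_bytes / 1e15, "PB")   # 40.0 PB
print(factor_bytes / 1e9, "GB")   # 40.0 GB
```

Under these assumptions the full factor is about a million times larger than the rank-k one, which is the motivation for a low-rank method.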
16 •
Randomized SVD
Trick: approximate the range of A with a tall-and-skinny matrix Q
18 •
Randomized SVD
How do we find Q?
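One common answer, following the randomized range finder described by Halko, Martinsson and Tropp, is to multiply A by a random Gaussian matrix and orthonormalize the result. The sketch below is a single-machine NumPy illustration of that idea, not the Spark-RSVD implementation; the function name and the parameters `oversample` and `n_power_iter` are hypothetical choices for this example.

```python
import numpy as np

def randomized_svd(A, k, oversample=10, n_power_iter=2, seed=0):
    """Rank-k approximate SVD via a randomized range finder (illustrative)."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    ell = min(n, k + oversample)          # sketch width, slightly above k

    # 1. Sample the range of A with a random Gaussian test matrix.
    omega = rng.standard_normal((n, ell))
    Q, _ = np.linalg.qr(A @ omega)        # Q is m x ell with orthonormal columns

    # 2. Optional power iterations sharpen Q when singular values decay slowly.
    for _ in range(n_power_iter):
        Q, _ = np.linalg.qr(A.T @ Q)
        Q, _ = np.linalg.qr(A @ Q)

    # 3. Project A onto the small subspace and run an exact SVD there.
    B = Q.T @ A                           # ell x n, cheap to decompose
    U_small, s, Vt = np.linalg.svd(B, full_matrices=False)
    U = Q @ U_small                       # lift back to m dimensions
    return U[:, :k], s[:k], Vt[:k, :]

# Toy check on an exactly rank-5 matrix (sizes are illustrative only).
rng = np.random.default_rng(1)
A = rng.standard_normal((200, 5)) @ rng.standard_normal((5, 100))
U, s, Vt = randomized_svd(A, k=5)
rel_err = np.linalg.norm(A - (U * s) @ Vt) / np.linalg.norm(A)
print(f"relative error: {rel_err:.2e}")
```

The expensive operations are products of A with thin matrices, which is what makes the approach a reasonable fit for distributed execution.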
21 •
Randomized SVD
[Plot: singular values]
22 •
Nathan Halko, Per-Gunnar Martinsson, Joel A. Tropp, "Finding structure with
randomness: Probabilistic algorithms for constructing approximate matrix
decompositions", SIAM Review, May 2011
Randomized SVD
23 •
spark-rsvd
https://github.com/criteo/Spark-RSVD
24 •
spark-rsvd (blog post)
https://medium.com/@alois.bissuel/6695b649f519
25 •
Point-wise mutual information
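For reference, PMI(a, b) = log( p(a, b) / (p(a) p(b)) ). The sketch below computes it from user timelines in plain Python. It assumes that a "co-event" means two products appearing in the same timeline and that probabilities are estimated as fractions of timelines; the talk does not spell out either choice, and the function name is hypothetical.

```python
import math
from collections import Counter
from itertools import combinations

def pmi_from_timelines(timelines):
    """Return {(a, b): PMI} for every product pair seen together at least once."""
    item_counts = Counter()   # number of timelines containing each product
    pair_counts = Counter()   # number of timelines containing both products
    for timeline in timelines:
        items = sorted(set(timeline))          # de-duplicate, fix pair order
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    n = len(timelines)
    pmi = {}
    for (a, b), n_ab in pair_counts.items():
        # log( (n_ab / n) / ((n_a / n) * (n_b / n)) ), simplified.
        pmi[(a, b)] = math.log(n_ab * n / (item_counts[a] * item_counts[b]))
    return pmi

timelines = [
    ["shoes", "socks"],
    ["shoes", "socks"],
    ["shoes", "hat"],
    ["tv"],
]
pmi = pmi_from_timelines(timelines)
print(pmi[("shoes", "socks")])   # log(4/3): seen together more than chance
```

Pairs that never co-occur are simply absent from the result, which keeps the matrix sparse.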
26 •
Approximate nearest neighbors with Annoy
https://erikbern.com/2015/10/01/nearest-neighbors-and-vector-models-part-2-how-to-search-in-high-dimensional-spaces.html
Credits: Erik Bernhardsson
27 •
Putting it all together
Training: User timelines -> CoEvent matrix -> PMI matrix -> R-SVD -> Product vectors -> KNN Indexing -> KNN Indices
Inference: User timelines -> User embedding -> KNN Search -> Recommendations
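The inference side of this pipeline can be sketched in a few lines. Two assumptions are made here that the slides do not confirm: the user embedding is taken as the mean of the product vectors in the user's timeline, and the nearest-neighbor search is an exact brute-force cosine search in NumPy standing in for an approximate index such as Annoy. Function names are hypothetical.

```python
import numpy as np

def user_embedding(timeline, product_vectors):
    """Assumed embedding: average of the vectors of products in the timeline."""
    return product_vectors[timeline].mean(axis=0)

def knn_search(query, product_vectors, k, exclude=()):
    """Exact cosine KNN; a production system would query an ANN index instead."""
    norms = np.linalg.norm(product_vectors, axis=1) * np.linalg.norm(query)
    scores = product_vectors @ query / np.maximum(norms, 1e-12)
    if exclude:
        scores[list(exclude)] = -np.inf    # do not recommend already-seen items
    return np.argsort(-scores)[:k].tolist()

# Toy product vectors (4 products, 2 dimensions), for illustration only.
product_vectors = np.array([
    [1.0, 0.0],
    [0.9, 0.1],
    [0.0, 1.0],
    [0.1, 0.9],
])
timeline = [0]                              # the user has seen product 0
query = user_embedding(timeline, product_vectors)
recommendations = knn_search(query, product_vectors, k=2, exclude=timeline)
print(recommendations)                      # [1, 3]
```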
29 •
Putting it all together
memcache
Recommendations
HDFS
All users x partners
RecoService
Campaign
selection
users x ~50 partners
Simpler
(« no model »)
Evolvable
(reco-based)
30 •
Offline pipeline runs at scale in 5-10 hours with 100 Spark
executors on ~300M timelines
Spark, Scala, Python
Scheduled every day
The best is the enemy of the good (good enough for an AB test)
Putting it all together
31 •
Good vs Best trade-off
Not scalable
Not prod-grade
A few weeks
Scalable
Prod-grade
Many months
Scalable
Not-quite-prod-grade
Several months
Offline
evaluation
33 •
• Global best-of (per partner)
• Mixture of « sources » (best-of-by-X) merged into a pClick
model
Baselines
34 •
Precision @ k over pairs of partners
Offline metrics
train validation
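For readers unfamiliar with the metric, here is a minimal sketch of precision @ k in plain Python. It shows the generic definition only; how the talk splits events between train and validation across partner pairs is not detailed in the slides, so the inputs below are hypothetical.

```python
def precision_at_k(recommended, relevant, k):
    """Fraction of the top-k recommended items that the user interacted with."""
    top_k = recommended[:k]
    hits = sum(1 for item in top_k if item in relevant)
    return hits / k

def mean_precision_at_k(recs_by_user, relevant_by_user, k):
    """Average precision @ k over all users that have recommendations."""
    scores = [
        precision_at_k(recs, relevant_by_user.get(user, set()), k)
        for user, recs in recs_by_user.items()
    ]
    return sum(scores) / len(scores)

# Hypothetical data: recommendations built from the train period,
# relevant items taken from the validation period.
recs_by_user = {"u1": ["a", "b", "c", "d"], "u2": ["x", "y", "z", "w"]}
relevant_by_user = {"u1": {"b", "d", "q"}, "u2": {"x", "y"}}

print(precision_at_k(recs_by_user["u1"], relevant_by_user["u1"], 3))
print(mean_precision_at_k(recs_by_user, relevant_by_user, 3))
```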
35 •
Qualitative evaluation
What’s next?
40 •
Fusing CF and metadata (content2vec)
Deeper representations of users and products (graph
convolutions, recurrent neural nets)
Train at scale with TF
41 •
tf-yarn: train TensorFlow models on YARN in just a few lines of code!
https://github.com/criteo/tf-yarn
42 •
Acquisition provides new challenges for Recommendation algorithms
MF (via R-SVD) is an attractive approach to try
We built a pipeline leveraging R-SVD and KNN at scale (~300M users, hundreds of
partners) with promising offline results
Qualitative evaluation matters (on top of the quantitative one)
There are many things coming up next!
Summary
43 •
Thank you!
o.koch@criteo.com
ailab.criteo.com
