Recommendation for new users at Criteo

1) The document discusses Criteo's use of large-scale matrix factorization with randomized SVD and approximate nearest neighbors to provide recommendations for new users, targeting on the order of 200 billion user-partner recommendations; the pipeline described covers ~300M users and hundreds of partners.
2) Criteo built a pipeline that uses user timelines, a co-event matrix, point-wise mutual information, randomized SVD, and KNN indexing to train user and product embeddings and serve recommendations from pre-computed indices.
3) Offline evaluation against baseline approaches showed promising results, and qualitative evaluation also gave positive feedback; deeper models and larger-scale training remain open directions.

Olivier Koch, Criteo
RecSys London Meetup - Nov 8th, 2018
Large-scale recommendation for new users

2 •
Joint work with Ivan Lobov, Mohamed Amine Benhalloum, Dmitry Parfenchik, Alexandre Gillotte, Alois Bissuel, Vincent Grosbois, Sergei Lebedev, Flavian Vasile

3 •
Outline
1. Context
2. Large-scale matrix factorization with randomized SVD
3. Offline evaluation methods
4. What's next?

4 •
What / Who is Criteo again?
Buy ad space on publishers’ websites.
Build banners showing products that users will like / want to buy.
Get paid if users click / buy the product.

5 •
What / Who is Criteo again?
3 billion ads/day
5 billion products
100 ms

8 •
At scale
2B users
20K partners
~1M products/partner
Hundreds of possible campaigns per user
In 50 ms!

9 •
The Acquisition pipeline
Campaign selection
Product selection
(Recommendation)
Bidding

11 •
The Acquisition pipeline
Campaign selection
Product selection
(Recommendation)
Bidding
The Recommendation problem

12 •
Instead of letting a different model do the bidding/campaign selection, how about we do recommendation for all user-partner pairs?
200B recommendations, anyone?

14 •
Singular value decomposition
$$A_{m \times n} = U_{m \times m} \, S_{m \times n} \, V^{T}_{n \times n}$$

15 •
The catch
m = n = hundreds of millions of items

16 •
Randomized SVD
Trick: Approximate A with a tall-and-tiny matrix Q
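
The trick of replacing A with a thin orthonormal matrix Q can be sketched on a single machine with NumPy. This is a minimal illustration of the two-stage scheme from the Halko, Martinsson and Tropp paper cited in these slides, not the Spark-RSVD implementation; the function name `randomized_svd` and the parameters `oversample` and `power_iters` are assumptions chosen for the example.

```python
import numpy as np

def randomized_svd(A, rank, oversample=10, power_iters=2, seed=0):
    """Approximate the top-`rank` SVD of A via a thin basis Q (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    n = A.shape[1]
    k = min(rank + oversample, n)

    # Stage A: sample the range of A with a random Gaussian test matrix.
    omega = rng.standard_normal((n, k))
    Y = A @ omega

    # Optional power iterations sharpen the basis when singular values decay slowly.
    for _ in range(power_iters):
        Y = np.linalg.qr(Y)[0]
        Z = np.linalg.qr(A.T @ Y)[0]
        Y = A @ Z

    # Q is m x k with orthonormal columns, so that A is approximately Q @ Q.T @ A.
    Q = np.linalg.qr(Y)[0]

    # Stage B: run an exact SVD on the small k x n matrix B = Q.T @ A.
    B = Q.T @ A
    U_small, S, Vt = np.linalg.svd(B, full_matrices=False)
    U = Q @ U_small
    return U[:, :rank], S[:rank], Vt[:rank]

# Toy check on a matrix of exact rank 5 (hypothetical data).
rng = np.random.default_rng(1)
A = rng.standard_normal((200, 5)) @ rng.standard_normal((5, 100))
U, S, Vt = randomized_svd(A, rank=5)
exact_S = np.linalg.svd(A, compute_uv=False)[:5]
```

The expensive operations touch A only through matrix products, which is what makes the approach attractive when A is too large for a full decomposition.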

21 •
Randomized SVD
[Figure: plot of singular values]

22 •
Randomized SVD
Nathan Halko, Per-Gunnar Martinsson, Joel A. Tropp, "Finding structure with randomness: Probabilistic algorithms for constructing approximate matrix decompositions", SIAM Review, May 2011

23 •
spark-rsvd
https://github.com/criteo/Spark-RSVD

24 •
spark-rsvd (blog post)
https://medium.com/@alois.bissuel/6695b649f519

26 •
Approximate nearest neighbors with Annoy
https://erikbern.com/2015/10/01/nearest-neighbors-and-vector-models-part-2-how-to-search-in-high-dimensional-spaces.html
Credits: Erik Bernhardsson
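
The linked post describes searching with trees that recursively split the space by random hyperplanes. The toy NumPy sketch below illustrates that idea only; it does not use the Annoy API, and the names `build_tree`, `query_tree`, `approx_knn` and the `leaf_size` parameter are hypothetical. Unlike Annoy, it simply takes the union of one leaf per tree as the candidate set.

```python
import numpy as np

def build_tree(vectors, ids, rng, leaf_size=8):
    """Recursively split `ids` by the hyperplane equidistant from two random points."""
    if len(ids) <= leaf_size:
        return ("leaf", ids)
    a, b = vectors[rng.choice(ids, size=2, replace=False)]
    normal = a - b
    offset = normal @ (a + b) / 2.0
    side = vectors[ids] @ normal > offset
    left, right = ids[side], ids[~side]
    if len(left) == 0 or len(right) == 0:
        return ("leaf", ids)
    return ("split", normal, offset,
            build_tree(vectors, left, rng, leaf_size),
            build_tree(vectors, right, rng, leaf_size))

def query_tree(tree, q):
    """Walk down one tree and return the ids stored in the leaf that q falls into."""
    while tree[0] == "split":
        _, normal, offset, left, right = tree
        tree = left if q @ normal > offset else right
    return tree[1]

def approx_knn(vectors, trees, q, k):
    """Gather candidates from every tree, then rank them by exact distance."""
    candidates = np.unique(np.concatenate([query_tree(t, q) for t in trees]))
    dists = np.linalg.norm(vectors[candidates] - q, axis=1)
    return candidates[np.argsort(dists)[:k]]

# Toy data (hypothetical): 500 random 16-dimensional product vectors.
rng = np.random.default_rng(0)
vectors = rng.standard_normal((500, 16))
ids = np.arange(len(vectors))
trees = [build_tree(vectors, ids, rng) for _ in range(10)]
neighbors = approx_knn(vectors, trees, vectors[42], k=5)
```

More trees enlarge the candidate set and improve recall at the cost of memory and query time, which is the usual tuning knob for this family of indices.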

27 •
Putting it all together
Training: User timelines → CoEvent matrix → PMI matrix → R-SVD → Product vectors → KNN Indexing → KNN Indices
Inference: User timelines → User embedding → KNN Search → Recommendations
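
The training and inference flow on this slide can be sketched end to end on toy data. Everything specific here is an assumption made for illustration: the timelines are invented, co-events are counted over whole timelines, negative PMI values are clipped to zero, a plain SVD stands in for R-SVD, the user embedding is the mean of the product vectors in the timeline, and an exact dot-product search stands in for the KNN index.

```python
import numpy as np

# Hypothetical toy data: each timeline lists the product ids one user interacted with.
timelines = [[0, 1, 2], [0, 1], [2, 3], [3, 4], [0, 1, 4]]
n_products = 5

# Co-event matrix: C[i, j] counts the timelines in which products i and j both appear.
C = np.zeros((n_products, n_products))
for t in timelines:
    items = sorted(set(t))
    for i in items:
        for j in items:
            if i != j:
                C[i, j] += 1

# PMI(i, j) = log(p(i, j) / (p(i) * p(j))); undefined or negative entries are set to 0.
total = C.sum()
row = C.sum(axis=1, keepdims=True)
col = C.sum(axis=0, keepdims=True)
with np.errstate(divide="ignore", invalid="ignore"):
    pmi = np.log(C * total / (row * col))
ppmi = np.where(np.isfinite(pmi) & (pmi > 0), pmi, 0.0)

# Factorize the PMI matrix (plain SVD here; a randomized SVD would be used at scale).
dim = 2
U, S, _ = np.linalg.svd(ppmi)
product_vectors = U[:, :dim] * np.sqrt(S[:dim])

def recommend(timeline, k=2):
    """Embed the user as the mean of their product vectors, then return the top-k new products."""
    user = product_vectors[timeline].mean(axis=0)
    scores = product_vectors @ user
    scores[timeline] = -np.inf  # do not recommend products already in the timeline
    return np.argsort(-scores)[:k]

recs = recommend([0, 1])
```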

28 •
Putting it all together
memcache
Recommendations
HDFS
All users x partners
RecoService
Campaign selection
users x ~50 partners

29 •
Putting it all together
memcache
Recommendations
HDFS
All users x partners
RecoService
Campaign selection
users x ~50 partners
Simpler (« no model »)
Evolutive (reco-based)

30 •
Putting it all together
Offline pipeline runs at scale in 5-10 hours with 100 Spark executors on ~300M timelines
Spark, Scala, Python
Scheduled every day
The best is the enemy of the good (good enough for an AB test)

31 •
Good vs Best trade-off
Not scalable, Not prod-grade: A few weeks
Scalable, Prod-grade: Many months
Scalable, Not-quite-prod-grade: Several months

33 •
Baselines
• Global best-of (per partner)
• Mixture of « sources » (best-of-by-X) merged into a pClick model

34 •
Offline metrics
Precision @ k over pairs of partners
train / validation
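
For reference, precision at k can be computed as below. This is the generic definition only; how the slides pair partners and split train from validation is not spelled out, so the product ids and the "relevant" set in this example are hypothetical.

```python
def precision_at_k(recommended, relevant, k):
    """Fraction of the top-k recommended items that appear in the relevant set."""
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = recommended[:k]
    hits = sum(1 for item in top_k if item in relevant)
    return hits / k

# Hypothetical example: ranked recommendations versus products seen in validation.
recommended = ["p3", "p7", "p1", "p9"]
relevant = {"p1", "p3"}
p_at_2 = precision_at_k(recommended, relevant, k=2)
p_at_4 = precision_at_k(recommended, relevant, k=4)
```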

40 •
Fusing CF and metadata (content2vec)
Deeper representations of users and products (graph convolutions, recurrent neural nets)
Train at scale with TF

41 •
tf-yarn: train TensorFlow models on YARN in just a few lines of code!
https://github.com/criteo/tf-yarn

42 •
Summary
Acquisition provides new challenges for Recommendation algorithms
MF (via R-SVD) is an attractive approach to try
We built a pipeline leveraging R-SVD and KNN at scale (~300M users, hundreds of partners) with promising offline results
Qualitative evaluation matters (on top of the quantitative one)
There are many things coming up next!

43 •
Thank you!
o.koch@criteo.com
ailab.criteo.com

Recommendation for new users at Criteo

1. Olivier Koch, Criteo RecSys London Meetup - Nov 8th, 2018 Large-scale recommendation for new users

2. 2 • Joint work with Ivan Lobov, Mohamed Amine Benhalloum, Dmitry Parfenchik, Alexandre Gillotte, Alois Bissuel, Vincent Grosbois, Sergei Lebedev, Flavian Vasile

3. 3 • 1. Context 2. Large-scale matrix factorization with randomized SVD 3. Offline evaluation methods 4. What's next? Outline

4. 4 • Buy ad space on publishers’ websites. Build banners showing products that users will like / want to buy. Get paid if users click / buy the product. What / Who is Criteo again?

5. 5 • What / Who is Criteo again? 3 billion ads/day 5 billion products 100 ms

6. 6 • Retargeting ~ a few hours

7. 7 • Acquisition ? ~ a few days/weeks

8. 8 • 2B users 20K partners ~1M products/partner Hundreds of possible campaigns per user In 50 ms! At scale

9. 9 • The Acquisition pipeline Campaign selection Product selection (Recommendation) Bidding

10. 10 • The Acquisition pipeline Campaign selection Product selection (Recommendation) Bidding

11. 11 • The Acquisition pipeline Campaign selection Product selection (Recommendation) Bidding The Recommendation problem

12. 12 • Instead of letting a different model do the bidding/campaign selection, how about we do recommendation for all user - partner pairs? 200B recommendations anyone?

13. Large-scale MF with R-SVD

14. 14 • Singular value decomposition $A_{m \times n} = U_{m \times m} \, S_{m \times n} \, V^{T}_{n \times n}$

15. 15 • The catch m = n = hundreds of millions of items

16. 16 • Randomized SVD Trick: Approximate A with a tall-and-tiny matrix Q

17. 17 • Randomized SVD

18. 18 • Randomized SVD How do we find Q?

19. 19 • Randomized SVD

20. 20 • Randomized SVD

21. 21 • Randomized SVD [Figure: plot of singular values]

22. 22 • Finding structure with randomness: Probabilistic algorithms for constructing approximate matrix decompositions, Nathan Halko, Per-Gunnar Martinsson, Joel A. Tropp, Journal SIAM, May 2011 Randomized SVD

23. 23 • spark-rsvd https://github.com/criteo/Spark-RSVD

24. 24 • spark-rsvd (blog post) https://medium.com/@alois.bissuel/6695b649f519

25. 25 • Point-wise mutual information

26. 26 • Approximate nearest neighbors with Annoy https://erikbern.com/2015/10/01/nearest-neighbors-and-vector-models-part-2-how-to-search-in-high-dimensional-spaces.html Credits: Erik Bernhardsson

27. 27 • Putting it all together Training: User timelines → CoEvent matrix → PMI matrix → R-SVD → Product vectors → KNN Indexing → KNN Indices. Inference: User timelines → User embedding → KNN Search → Recommendations

28. 28 • Putting it all together memcache Recommendations HDFS All users x partners RecoService Campaign selection users x ~50 partners

29. 29 • Putting it all together memcache Recommendations HDFS All users x partners RecoService Campaign selection users x ~50 partners Simpler (« no model ») Evolutive (reco-based)

30. 30 • Offline pipeline runs at scale in 5-10 hours with 100 Spark executors on ~300M timelines Spark, scala, python Scheduled every day The best is the enemy of the good (good enough for an AB test) Putting it all together

31. 31 • Good vs Best trade-off Not scalable Not prod-grade A few weeks Scalable Prod-grade Many months Scalable Not-quite-prod-grade Several months

32. Offline evaluation

33. 33 • • Global best-of (per partner) • Mixture of « sources » (best-of-by-X) merged into a pClick model Baselines

34. 34 • Precision @ k over pairs of partners Offline metrics train validation

35. 35 • Qualitative evaluation

36. 36 • Qualitative evaluation

37. 37 • Qualitative evaluation

38. 38 • Qualitative evaluation

39. What’s next?

40. 40 • Fusing CF and metadata (content2vec) Deeper representations of users and products (graph convolutions, recurrent neural nets) Train at scale with TF

41. 41 • tf-yarn: train TensorFlow models on YARN in just a few lines of code! https://github.com/criteo/tf-yarn

42. 42 • Acquisition provides new challenges for Recommendation algorithms MF (via R-SVD) is an attractive approach to try We built a pipeline leveraging R-SVD and KNN at scale (~300M users, hundreds of partners) with promising offline results Qualitative evaluation matters (on top of the quantitative one) There are many things coming up next! Summary

43. 43 • Thank you! o.koch@criteo.com ailab.criteo.com
