Deep Learning for Recommender Systems
Marcel Kurovski O‘REILLY AI, New York, April 18th 2019
2
Marcel Kurovski
Data Scientist
Recommender Systems
Deep Learning
Reinforcement Learning
Data Science to Production
3
1. Motivation
2. Basics and Overview
3. Deep Learning for Vehicle Recommendations
4. Scalability and Production
Agenda
4
Annual Data Sphere increases exponentially
International Data Corporation: Data Age 2025 study, April 2017
Information Load
→ Humans
Human Processing Capacity
5
Information and Choice Overload
https://www.linkedin.com/pulse/its-information-overload-filter-failure-productivity-industry-zayats/
https://en.wikipedia.org/wiki/Clay_Shirky
"It's not information overload. It's filter failure." - Clay Shirky
6
[Figure: Google Trends for "Deep Learning", 2004 to 2019]
"Deep Learning becomes a general-purpose solution for nearly all learning problems." - Covington et al.
Recommendations are everywhere
7
8
http://fortune.com/2012/07/30/amazons-recommendation-secret/
"The company reported a 29% sales increase to $12.83 billion [...] Amazon has integrated recommendations into nearly every part of the purchasing process from product discovery to checkout."
9
Gomez-Uribe, Carlos A. and Hunt, Neil: The Netflix Recommender System: Algorithms, Business Value, and Innovation (2015)
"Our recommender system [...] in total influences choice for about 80% of hours streamed at Netflix. The remaining 20% comes from search [...]"
Recommendations
Search
10
Gomez-Uribe, Carlos A. and Hunt, Neil: The Netflix Recommender System: Algorithms, Business Value, and Innovation (2015)
"Reduction of monthly churn both increases the lifetime value of an existing subscriber, and reduces the number of new subscribers we need to acquire to replace cancelled members. We think the combined effect of personalization and recommendations save us more than $1B per year."
Search
Recommendations
11
1. Motivation
2. Basics and Overview
3. Deep Learning for Vehicle Recommendations
4. Scalability and Production
Agenda
Interactions
12
m
users
1 1 1
? 1 ? ? 1 ?
1 1 1
1 1 1
n items
Collaborative Filtering
13
Muse
Arctic Monkeys
The Killers
Coldplay
Bloc Party
Check out
Bloc Party
Check out
Muse
https://buildingrecommenders.wordpress.com/2015/11/18/overview-of-recommender-algorithms-part-2/
Matrix Factorization
14
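A minimal sketch of matrix factorization on implicit feedback, assuming squared-error loss, plain SGD, and a toy set of labeled interactions. The hyperparameters, the toy data, and the function names `factorize` and `predict` are illustrative assumptions, not taken from the talk.

```python
import random


def factorize(interactions, n_users, n_items, dim=2, epochs=500,
              lr=0.05, reg=0.01, seed=0):
    """Learn factors P (users) and Q (items) so that dot(P[u], Q[i]) ~ label."""
    rng = random.Random(seed)
    P = [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(n_users)]
    Q = [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, y in interactions:
            err = y - sum(pu * qi for pu, qi in zip(P[u], Q[i]))
            for f in range(dim):
                pu, qi = P[u][f], Q[i][f]
                # gradient step on squared error with L2 regularization
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q


def predict(P, Q, u, i):
    """Predicted affinity of user u for item i."""
    return sum(pu * qi for pu, qi in zip(P[u], Q[i]))


# Toy data (assumed): (user, item, label); 1 = interaction, 0 = sampled negative.
data = [(0, 0, 1), (0, 1, 1), (0, 2, 0), (1, 0, 1),
        (1, 2, 0), (2, 2, 1), (2, 0, 0), (2, 1, 0)]
P, Q = factorize(data, n_users=3, n_items=3)
score_unseen = predict(P, Q, 1, 1)  # user 1 has no recorded label for item 1
```

The unseen pair (user 1, item 1) receives a score purely from the learned factors, which is the mechanism collaborative filtering relies on.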
15
Recommender Systems for IF
SPARSITY
16
Cold Start
http://www.yusp.com/wp-content/uploads/2015/07/cold-start-problem-recommender-systems-1.jpg
17
Types of Content
Item Information, User Information, Contextual Information
Content-based Filtering
18
1 1 1
? 1 ? ? 1 ?
1 1 1
1 1 1
model
color
mileage
age
gender
income
19
Capture Nonlinear
Relationships
Reduce Feature
Engineering Effort
Flexible and Holistic
Approach
Improve Predictive
Capability
Deep Learning for Recommender Systems (DLRS)
see Slide on References, Details: https://bit.ly/2WuS4Zq
Domains and Types for DLRS
20
DNNs
CNNs
RNNs
AEs
Other
Other
[Figure: publication years of the surveyed DLRS approaches, ranging from 2009 to 2018]
Cheng, Heng-Tze et al.: Wide and Deep Learning for Recommender Systems (2016)
Wide and Deep Learning for App-Recos
Combine Memorization and Generalization
21
Cheng, Heng-Tze et al.: Wide and Deep Learning for Recommender Systems (2016)
Wide and Deep Learning for App-Recos
Combine Memorization and Generalization
22
Deep
Component
Embeddings
Wide
Component
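A small sketch of how a wide component and a deep component can be combined into one prediction, assuming the joint form from Cheng et al. where both logits are summed before a sigmoid. The feature names, weights, and layer sizes below are hypothetical and chosen only for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def wide_logit(active_features, weights):
    """Linear model over sparse binary (cross-product) features: memorization."""
    return sum(weights.get(f, 0.0) for f in active_features)


def deep_logit(feature_ids, embeddings, hidden_w, out_w):
    """Embeddings -> one ReLU layer -> scalar logit: generalization."""
    x = [v for f in feature_ids for v in embeddings[f]]  # concatenate embeddings
    h = [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in hidden_w]
    return sum(w * hi for w, hi in zip(out_w, h))


def wide_and_deep(active_features, feature_ids, params):
    z = (wide_logit(active_features, params["wide"])
         + deep_logit(feature_ids, params["emb"], params["hidden"], params["out"])
         + params["bias"])
    return sigmoid(z)


# Hypothetical parameters for a single app-recommendation example.
params = {
    "wide": {"installed=netflix AND impression=pandora": 1.5},
    "emb": {"installed=netflix": [0.2, -0.1], "impression=pandora": [0.4, 0.3]},
    "hidden": [[0.5, -0.2, 0.1, 0.3], [-0.4, 0.6, 0.2, -0.1]],
    "out": [0.7, -0.3],
    "bias": -0.5,
}
ids = ["installed=netflix", "impression=pandora"]
p_with_cross = wide_and_deep(["installed=netflix AND impression=pandora"], ids, params)
p_without_cross = wide_and_deep([], ids, params)
```

With a positive weight on the memorized cross-product feature, `p_with_cross` is higher than `p_without_cross`, while the deep part contributes the same generalized signal in both cases.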
Session-based Recommendations
Leverage Sequential Information to Improve Relevance
23
www.netflix.com
[Figure: timeline t of Netflix titles: Designated Survivor, Dark, House of Cards, Stranger Things]
Session-based Recommendations
Leverage Sequential Information to Improve Relevance
Quadrana et al.: Personalizing Session-based Recommendations with Hierarchical Recurrent Neural Networks (2017)
24
25
1. Motivation
2. Basics and Overview
3. Deep Learning for Vehicle Recommendations
4. Scalability and Production
Agenda
Vehicle Recommendations: End-to-End Approach
26
Data → Preprocessing → Classifier Training → Candidate Generation → Ranking → Serving (1., 2., 3.)
Vehicle Recommendations: Technologies
Frameworks and Hardware for Training and Inference
27
Vehicle Recommendations: Data
28
Users & Interactions
Registered Users
Sample Size: 100,000 Users
Events: View, Bookmark, Contact
Time-based Train-Test Split
Calendar weeks (CW) 14 to 18, April to May 2017
Training : Test = 85 : 15
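A minimal sketch of a time-based split, assuming the 85 : 15 ratio is applied to events ordered by timestamp. The slide does not say whether the ratio refers to events or to weeks, so this is one possible reading; the event structure and the name `time_based_split` are assumptions.

```python
def time_based_split(events, train_percent=85):
    """Split events chronologically: the oldest train_percent % go to training."""
    ordered = sorted(events, key=lambda e: e["ts"])
    cut = len(ordered) * train_percent // 100  # integer arithmetic avoids float rounding
    return ordered[:cut], ordered[cut:]


# Hypothetical events: user, item, event type, and an integer timestamp.
events = [{"user": n % 4, "item": n, "event": "view", "ts": 1000 + n}
          for n in range(20)]
train, test = time_based_split(events)
```

Every training event precedes every test event, so the model is evaluated only on interactions that happened after the data it was trained on.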
adapted from http://www.kdnuggets.com/2016/02/nine-datasets-investigating-recommender-systems.html
Sparsity Comparison
29
MovieLens 1M: 4.26% MovieLens 20M: 0.53%
Last.fm: 0.28% Vehicles All: 0.0046%
~8M interactions between 100k users and 1.7M items
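The sparsity figure follows directly from the counts on the slide: the density of the interaction matrix is the number of interactions divided by the number of possible user-item pairs. A short check, using the rounded values quoted above (the function name is an assumption):

```python
def density_percent(n_interactions, n_users, n_items):
    """Share of filled cells in the user-item interaction matrix, in percent."""
    return 100.0 * n_interactions / (n_users * n_items)


# Rounded counts from the slide: ~8M interactions, 100k users, 1.7M items.
vehicles_density = density_percent(8_000_000, 100_000, 1_700_000)
```

With these rounded inputs the result is about 0.0047%, in line with the 0.0046% on the slide, which was presumably computed from the exact interaction count.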
Approach: Preprocessing (1)
30
Technical
- Data Extraction (SQL, HDFS)
- Data Type Conversions
- User and Item ID Contiguation
- Weekly Profile Overlap
- User Set Sampling
Content-related
- Category-based Negative Sampling
- Assign Binary Labels {0, 1}
- Outlier Removal and Feature Normalization
- User Profile Feature Conversion
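Two of the preprocessing steps above can be sketched briefly: mapping raw IDs to contiguous indices, and sampling negatives with binary labels. The sampling rule here (draw negatives from the same category as each positive item) is an assumption about what "category-based" means; the talk does not spell it out, and all names and data are hypothetical.

```python
import random


def contiguous_ids(raw_ids):
    """Map arbitrary raw IDs to 0..n-1 in order of first appearance."""
    mapping = {}
    for raw in raw_ids:
        mapping.setdefault(raw, len(mapping))
    return mapping


def sample_negatives(user_items, items_by_category, item_category,
                     n_per_positive=1, seed=0):
    """Return (user, item, label) triples: label 1 for observed interactions,
    label 0 for sampled items of the same category the user did not touch."""
    rng = random.Random(seed)
    samples = []
    for user, positives in user_items.items():
        for item in sorted(positives):
            samples.append((user, item, 1))
            pool = [c for c in items_by_category[item_category[item]]
                    if c not in positives]
            for neg in rng.sample(pool, min(n_per_positive, len(pool))):
                samples.append((user, neg, 0))
    return samples


id_map = contiguous_ids([1007, 42, 1007, 9])  # {1007: 0, 42: 1, 9: 2}

# Hypothetical toy catalogue with two categories.
item_category = {"a": "suv", "b": "suv", "c": "suv", "d": "van", "e": "van"}
items_by_category = {"suv": ["a", "b", "c"], "van": ["d", "e"]}
user_items = {"u1": {"a"}, "u2": {"d", "e"}}
samples = sample_negatives(user_items, items_by_category, item_category)
```

User `u2` has interacted with every item in the `van` category, so no negative can be drawn there; the sketch simply skips such cases.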
Approach: Preprocessing (2)
31
u_color: 0.4 0.4 0 0 0.2 0 0 0
u_price: μ = 9,000€, σ = 1,817€ (from 8,500€, 7,000€, 10,000€, 7,500€, 12,000€)
deterministic stochastic
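The numbers on this slide can be reproduced from the five example prices: the user's price profile is their mean and (population) standard deviation, and the color profile is the relative frequency of each color among the viewed vehicles. The color names and the palette order are assumptions made so that the vector matches the slide.

```python
from collections import Counter
from statistics import mean, pstdev

# Prices of the five vehicles from the slide.
prices = [8500, 7000, 10000, 7500, 12000]
u_price_mean = mean(prices)   # 9000
u_price_std = pstdev(prices)  # population standard deviation, about 1816.6

# Hypothetical palette of 8 colors and the colors of the same five vehicles.
palette = ["black", "white", "blue", "green", "red", "silver", "grey", "yellow"]
viewed_colors = ["black", "black", "white", "white", "red"]

counts = Counter(viewed_colors)
u_color = [counts[c] / len(viewed_colors) for c in palette]
```

The resulting `u_color` is `[0.4, 0.4, 0.0, 0.0, 0.2, 0.0, 0.0, 0.0]`, and the rounded standard deviation is 1,817€, matching the slide.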
32
33
[Figure: network architecture in three stages: Preprocessing, Embedding, Deep Component]
Preprocessing
- categorical features: u_cat (many-hot-encoding), i_cat (one-hot-encoding), e.g. climatisation, color, transmission
- continuous item features i_cont: consumption, first_reg, price, ... (outlier removal, z-normalisation)
- continuous user features u_cont: mean_consumption, mean_price, stddev_consumption, stddev_price, ... (outlier removal, z-normalisation)
Embedding
- categorical embeddings: e_climatisation, e_color, e_transmission, ...
- continuous embeddings: embedding_i,cont and embedding_u,cont
- concatenate into embedding_user and embedding_item
Deep Component
- ELU (256), ELU (128), ELU (64)
- output: probability that user u likes vehicle i
34
[Figure: the same network architecture (Preprocessing, Embedding, Deep Component), annotated with its three sub-networks: UserNet, ItemNet, RankNet]
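A forward-pass sketch of the deep component, assuming the user and item embeddings are concatenated and passed through ELU layers of sizes 256, 128, and 64 before a sigmoid output. Only the layer sizes and the ELU activation come from the slide; the embedding size of 16, the random untrained weights, and all function names are assumptions.

```python
import math
import random


def elu(x, alpha=1.0):
    """Exponential linear unit."""
    return x if x > 0 else alpha * (math.exp(x) - 1.0)


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dense(inputs, weights, biases, activation):
    """Fully connected layer: activation(W x + b)."""
    return [activation(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


def init_layer(rng, n_in, n_out):
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def build_ranknet(rng, n_in, hidden=(256, 128, 64)):
    sizes = [n_in, *hidden]
    layers = [init_layer(rng, a, b) for a, b in zip(sizes, sizes[1:])]
    return layers, init_layer(rng, hidden[-1], 1)


def ranknet_forward(params, e_user, e_item):
    """Concatenate both embeddings and return p(i | u)."""
    layers, (w_out, b_out) = params
    h = e_user + e_item  # list concatenation
    for weights, biases in layers:
        h = dense(h, weights, biases, elu)
    return dense(h, w_out, b_out, sigmoid)[0]


rng = random.Random(0)
e_user = [rng.uniform(-1, 1) for _ in range(16)]  # assumed embedding size
e_item = [rng.uniform(-1, 1) for _ in range(16)]
params = build_ranknet(rng, n_in=32)
p = ranknet_forward(params, e_user, e_item)
```

With untrained weights the output carries no meaning beyond being a valid probability; training would adjust the weights against the binary interaction labels.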
Approach: Classifier Training
35
UserNet: u → e_u
ItemNet: i → e_i
RankNet: (e_u, e_i) → p(i | u)
minimize class_loss (Adam Optimizer)
minimize sim_loss (Adam Optimizer)
Adam Optimizer: Stochastic Gradient Descent with adaptive learning rate and adaptive momentum
Approach: Cost Functions
36
1
2
sim_loss
https://erikbern.com/2016/06/02/approximate-nearest-news.html
Candidate Generation
Apply Approximate Nearest Neighbor Search to Embeddings
37
[Figure: approximate nearest neighbor search in the embedding space (axes x1, x2): the user embedding and its 5 approximate nearest item neighbors]
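To illustrate what the candidate search returns, the sketch below uses an exact brute-force search by cosine similarity. This is a stand-in for the approximate nearest neighbor index, not the approximate method itself: it yields the same kind of result (the k closest item embeddings to a user embedding) but scans every item. The embeddings and names are hypothetical.

```python
import heapq
import math


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def nearest_items(e_user, item_embeddings, k=5):
    """Exact top-k items by cosine similarity to the user embedding."""
    return heapq.nlargest(
        k, item_embeddings,
        key=lambda item_id: cosine(e_user, item_embeddings[item_id]))


# Hypothetical 2-d item embeddings.
items = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.0, 1.0],
         "d": [-1.0, 0.0], "e": [0.5, 0.5], "f": [0.1, 0.9]}
candidates = nearest_items([1.0, 0.0], items, k=5)
```

An approximate index trades a small loss in recall for sub-linear query time, which matters at the scale of roughly 1.7M vehicles.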
Intuition: Embedding Similarity Regularization
38
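[Figure: user u and item i are mapped into a shared embedding space (axes x1, x2, x3 and x1, x2); their embeddings e_u and e_i are compared via the angle α between them; ✓ and ✘ mark the two cases shown]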
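The two training objectives can be sketched under explicit assumptions, since the exact formulas are not recoverable from the slide text. Here `class_loss` is assumed to be binary cross-entropy on p(i | u), and `sim_loss` is assumed to penalize the angle between e_u and e_i through cosine similarity: positives are pulled together, negatives are pushed towards orthogonality. Both forms are illustrative guesses consistent with the binary labels and the angle α shown above.

```python
import math


def class_loss(p, y, eps=1e-7):
    """Binary cross-entropy for predicted probability p and label y (assumed form)."""
    p = min(max(p, eps), 1.0 - eps)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def sim_loss(e_u, e_i, y):
    """Cosine-based embedding regularizer (assumed form):
    positives should have a small angle, negatives a non-positive cosine."""
    c = cosine(e_u, e_i)
    return 1.0 - c if y == 1 else max(0.0, c)


loss_good = class_loss(0.9, 1)  # confident and correct: small loss
loss_bad = class_loss(0.1, 1)   # confident and wrong: large loss
aligned_positive = sim_loss([1.0, 0.0], [1.0, 0.0], 1)    # 0.0
aligned_negative = sim_loss([1.0, 0.0], [1.0, 0.0], 0)    # 1.0
orthogonal_negative = sim_loss([1.0, 0.0], [0.0, 1.0], 0) # 0.0
```

A regularizer of this kind makes closeness in the embedding space meaningful, which is what allows nearest neighbor search on embeddings to produce sensible candidates.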
Vehicle Recommendations: Ranking
Rank Candidates by Descending Interaction Probability p(i|u)
39
… ~ 1.7 M Vehicles
1.
2.
3.
1.
2.
3.
RankNet
Vehicle Recommendations: Serving
Present Top-k Recommendations to the User
40
1.
2.
3.
41
Recommendation Channels
Main Page Favorites Similar Vehicles
Vehicle Recommendations: End-to-End Approach
42
Data → Preprocessing ✓ → Classifier Training ✓ → Candidate Generation ✓ → Ranking ✓ → Serving ✓ (1., 2., 3.)
Results: DLRS Recommendation Relevance
43
MAP: mean average precision, comparative results after optimization of hyperparameters
[Figure: MAP@k (y-axis from 0.20% to 1.10%) for k = 1, 5, 10, 30, 100; series: Deep Learning, Hybrid CF-CBF (d=700), CF (d=100); annotated improvements: +73% and +143%]
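For reference, a minimal sketch of how MAP@k can be computed. Several normalizations of average precision are in use; this one divides by min(number of relevant items, k), which is an assumption and not necessarily the variant behind the reported numbers. The toy recommendations are hypothetical.

```python
def average_precision_at_k(recommended, relevant, k):
    """Average precision over the top-k recommendations of a single user."""
    hits, score = 0, 0.0
    for rank, item in enumerate(recommended[:k], start=1):
        if item in relevant:
            hits += 1
            score += hits / rank  # precision at this rank
    denom = min(len(relevant), k)
    return score / denom if denom else 0.0


def map_at_k(recs_by_user, relevant_by_user, k):
    """Mean of the per-user average precision values."""
    users = list(relevant_by_user)
    total = sum(average_precision_at_k(recs_by_user.get(u, []),
                                       relevant_by_user[u], k)
                for u in users)
    return total / len(users)


recs = {"u1": ["a", "b", "c"], "u2": ["x", "y"]}
relevant = {"u1": {"a", "c"}, "u2": {"y"}}
result = map_at_k(recs, relevant, k=3)
```

For `u1` the average precision is (1/1 + 2/3) / 2 = 5/6, for `u2` it is 1/2, so the mean is 2/3.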
44
1. Motivation
2. Basics and Overview
3. Deep Learning for Vehicle Recommendations
4. Scalability and Production
Agenda
Deploying Vehicle Recommendations at Scale
45
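[Figure: serving architecture]
- Webservice: receives k recommendations from the Recommendation Service
- Recommendation Service: get u (User Profile API), get e_u (UserNet), get T candidates {e_i} (Candidate Service), rank candidates {e_i} for e_u (Ranking Service)
- Candidate Service: ANN search on the ANNOY ANN index
- Ranking Service: RankNet
- Indexing: get i (item storage), get e_i (ItemNet), index all e_i (embeddings)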
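The request flow can be sketched as a thin orchestration layer, assuming each backing service is injected as a plain callable. The class name, the callables, and the stub data are hypothetical; only the steps (get u, get e_u, get T candidates, rank, return k recommendations) follow the diagram.

```python
class RecommendationService:
    """Orchestrates profile lookup, candidate generation, and ranking."""

    def __init__(self, get_user_profile, user_net, candidate_search, rank_net):
        self.get_user_profile = get_user_profile  # user_id -> profile u
        self.user_net = user_net                  # u -> e_u
        self.candidate_search = candidate_search  # (e_u, t) -> [(item_id, e_i)]
        self.rank_net = rank_net                  # (e_u, e_i) -> p(i | u)

    def recommend(self, user_id, t=100, k=3):
        u = self.get_user_profile(user_id)
        e_u = self.user_net(u)
        candidates = self.candidate_search(e_u, t)
        ranked = sorted(candidates,
                        key=lambda c: self.rank_net(e_u, c[1]),
                        reverse=True)
        return [item_id for item_id, _ in ranked[:k]]


# Stub services for illustration only.
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


profiles = {"u1": [1.0, 0.0]}
items = {"a": [1.0, 0.0], "b": [0.5, 0.5], "c": [0.0, 1.0], "d": [0.8, 0.1]}


def stub_search(e_u, t):
    ordered = sorted(items.items(), key=lambda kv: dot(e_u, kv[1]), reverse=True)
    return ordered[:t]


service = RecommendationService(profiles.get, lambda u: u, stub_search, dot)
top = service.recommend("u1", t=3, k=2)
```

Keeping candidate generation and ranking behind separate interfaces lets the expensive ranking model score only T candidates instead of the full catalogue.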
46
Deep Learning Solved – What’s next?
http://dlrs-workshop.org/wp-content/uploads/2018/10/dlrs2018_welcome.pdf
47
Sequence-based and
Sequence-aware
Causal Inference
(Deep) Reinforcement
Learning
Current Trends in Recommender Systems Research
48
"We can only see a short distance ahead,
but we can see plenty there that needs to
be done."
- Alan Turing
Thank You
Marcel Kurovski
Data Scientist
inovex GmbH
Kupferhütte 1.13,
Schanzenstr. 6-20
51063 Cologne
marcel.kurovski@inovex.de
+49 173 3181 088
Dr. Florian Wilhelm
Principal Data Scientist
Julian Hatzky
Data Science Working Student
References
50
[1] Quadrana, Massimo, Karatzoglou, Alexandros, Hidasi, Balázs, Cremonesi, Paolo. "Personalizing Session-based Recommendations with Hierarchical Recurrent Neural Networks." Proceedings of the 11th ACM Conference on Recommender Systems. 2017.
[2] Cheng, Heng-Tze, et al. "Wide & deep learning for recommender systems." Proceedings of the 1st Workshop on Deep Learning for Recommender Systems. ACM, 2016.
[3] Covington, Paul, Jay Adams, and Emre Sargin. "Deep neural networks for YouTube recommendations." Proceedings of the 10th ACM Conference on Recommender Systems. ACM, 2016.
[4] Goodfellow, Ian, Yoshua Bengio, and Aaron Courville. Deep learning. MIT Press, 2016.
[5] Heaton, Jeff. Artificial Intelligence for Humans: Deep Learning and Neural Networks. 2015.
[6] Ricci, Francesco and Rokach, Lior and Shapira, Bracha. Recommender Systems Handbook. Springer-Verlag. 2015.
[7] Reinsel, David, Gantz, John, Rydning, John. "Data Age 2025: The Evolution of Data to Life-Critical. Don't Focus on Big Data; Focus on the Data That's Big." International Data Corporation (IDC). 2017.
[8] Gomez-Uribe, Carlos A. and Hunt, Neil: The Netflix Recommender System: Algorithms, Business Value, and Innovation. 2015.
[9] JP Mangalindan: Amazon's recommendation secret. 2012.
[10] Chris Johnson: Algorithmic Music Discovery at Spotify. 2014.
[11] Maya Hristakeva: Overview of Recommender Algorithms - Part 2. 2015.
[12] Alex Gude: The Nine Must-Have Datasets for Investigating Recommender Systems. 2016.
[13] Erik Bernhardsson: Approximate nearest news. 2016.
[14] Balázs Hidasi: 3rd Workshop on Deep Learning for Recommender Systems. 2018.
[15] CartStack LLC: Comparison could be killing your online business. 2017.
[16] Marina Zayats: "It's not information overload; it's filter failure." Productivity in the Industry 4.0. 2016.
References – Want to read more?
51
https://bit.ly/2WuS4Zq
52
Thank You! Questions or Comments?
53
