Deep Learning Recommender Systems: from Prototype to Production
DataTalks.Club, 23.03.2021
Outline
● About us
● Modern Recommender Systems
● Deep Learning Recommender Systems
● Neural Item Embeddings
● Similarity Search
● Proving value through Experimentation
● From POC to PRD
● Summary
About us
ILIA IVANOV
Data Scientist
ilia.ivanov@olx.com
CRISTIAN MARTINEZ
Lead Data Scientist
cristian.martinez@olx.com
Modern Recommender Systems
Almost all modern platforms that we use daily have some kind of recommender system.
Amazon
Amazon researchers found that using neural networks to generate movie recommendations worked much better when they sorted the input data chronologically and used it to predict future movie preferences over a short (one- to two-week) period.
Source: https://www.amazon.science/the-history-of-amazons-recommendation-algorithm
Deep Learning Recommender Systems (DLRS)
Content-based
Deep content-based music recommendation (Oord et al, 2013)
● Convolutional neural nets
● Key ideas:
○ Extract latent representations of songs from audio signals
○ Train a CNN to generate embeddings of songs
○ The input of the CNN is a time-frequency representation of the audio (a sketch follows below)
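To make this concrete, here is a minimal PyTorch sketch (not the authors' exact model; the input shape and layer sizes are illustrative assumptions) of a CNN that maps a time-frequency representation of a song to a latent embedding:

import torch
import torch.nn as nn

# Toy input: one song as a spectrogram-like tensor
# (batch, channels, frequency bins, time frames); the shape is an assumption.
song = torch.randn(1, 1, 128, 600)

cnn = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),   # pool over frequency and time
    nn.Flatten(),
    nn.Linear(32, 64),         # 64-dim latent song embedding
)
print(cnn(song).shape)         # torch.Size([1, 64])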
Neural Item Embeddings
E-commerce in Your Inbox (Grbovic et al, 2015)
● Inspired by the word2vec architecture
● Key ideas:
○ Generate embeddings of products in a word2vec fashion
○ “Words” are products and “sentences” are the purchase sequences of a user
○ Learn product embeddings using skip-gram (see the sketch below)
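A minimal sketch of this idea with gensim's Word2Vec (the product IDs and purchase sequences below are made up for illustration):

from gensim.models import Word2Vec

# "Sentences" are per-user purchase sequences; "words" are product IDs.
purchase_sequences = [
    ["p42", "p7", "p7", "p105"],
    ["p7", "p105", "p13"],
    ["p42", "p13", "p105", "p7"],
]

model = Word2Vec(
    sentences=purchase_sequences,
    vector_size=64,  # embedding dimensionality
    window=5,        # context window within a sequence
    min_count=1,     # keep rare products in this toy example
    sg=1,            # 1 = skip-gram
    epochs=10,
)

# Nearest products in the learned embedding space.
print(model.wv.most_similar("p7", topn=3))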
Wide & Deep
Wide & Deep Learning for Recommender Systems (Cheng et al, 2016)
● Wide and deep feed-forward architectures
● Key ideas:
○ Jointly train two neural networks (see the sketch after this list)
○ Deep: a neural network trained with embeddings
○ Wide: a linear model with feature transformations, for generic recommender systems with sparse inputs
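A minimal joint-training sketch in PyTorch (feature sizes and layer widths are illustrative assumptions, not the paper's exact setup):

import torch
import torch.nn as nn

class WideAndDeep(nn.Module):
    def __init__(self, n_wide=10_000, n_ids=5_000, emb_dim=32):
        super().__init__()
        self.wide = nn.Linear(n_wide, 1)       # linear model on sparse cross features
        self.deep_emb = nn.Embedding(n_ids, emb_dim)
        self.deep = nn.Sequential(             # MLP on learned embeddings
            nn.Linear(emb_dim, 64), nn.ReLU(),
            nn.Linear(64, 1),
        )

    def forward(self, wide_x, item_ids):
        # One output, one loss: both parts are trained jointly.
        return torch.sigmoid(self.wide(wide_x) + self.deep(self.deep_emb(item_ids)))

model = WideAndDeep()
wide_x = torch.zeros(4, 10_000)                # sparse one-hot/cross features
wide_x[:, [3, 17]] = 1.0
item_ids = torch.randint(0, 5_000, (4,))
print(model(wide_x, item_ids).shape)           # torch.Size([4, 1])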
Session-based
Session-based recommendations with RNNs (Hidasi et al, 2016)
● Recurrent neural network architectures
● Key ideas:
○ Train using the sequence of items in a session
○ Predict the next item in the session
○ Pairwise (ranking) loss functions
■ Bayesian personalised ranking (BPR); a sketch follows below
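For instance, a BPR loss fits in a few lines of PyTorch (the scores below stand in for what the RNN would produce for the true next item vs. sampled negatives):

import torch
import torch.nn.functional as F

def bpr_loss(pos_scores, neg_scores):
    # Push the score of the observed next item above sampled negatives.
    return -F.logsigmoid(pos_scores - neg_scores).mean()

pos = torch.tensor([2.3, 0.5, 1.1])   # scores of true next items
neg = torch.tensor([1.0, 0.7, -0.2])  # scores of sampled negatives
print(bpr_loss(pos, neg))             # scalar loss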
Graph-based
Graph Convolutional Neural Networks for Web-Scale Recommender Systems (Ying et al, 2018)
● GCN architectures
● Key ideas:
○ Combine efficient random walks and graph convolutions to generate embeddings of nodes (e.g. items)
○ Improve the quality of an item embedding by extracting information from its neighbor nodes (a toy sketch follows below)
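A toy sketch of one graph-convolution step with mean neighbor aggregation (the graph, weights, and sizes are made up; production systems like PinSage add random-walk-based neighborhood sampling on top):

import torch

embeddings = torch.randn(5, 8)  # 5 nodes (items), 8-dim features
neighbors = {0: [1, 2], 1: [0], 2: [0, 3], 3: [2, 4], 4: [3]}

W_self, W_neigh = torch.randn(8, 8), torch.randn(8, 8)

def conv_step(h):
    out = torch.empty_like(h)
    for node, nbrs in neighbors.items():
        agg = h[nbrs].mean(dim=0)  # aggregate information from neighbor nodes
        out[node] = torch.relu(h[node] @ W_self + agg @ W_neigh)
    return out

print(conv_step(embeddings).shape)  # torch.Size([5, 8])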
Benefits of using DNNs
● Extract deep features directly from content
○ e.g. CNNs for images, LSTMs for text
● Allow us to combine information extracted from different input sources
● Generate more accurate representations of users and items
● Allow us to use custom loss functions
Neural Item Embeddings
A timeline of item embeddings:
2015
E-commerce in Your Inbox: Product Recommendations at Scale
Use the word2vec algorithm to learn embedding representations of products.
2016
Meta-Prod2Vec: Product Embeddings Using Side-Information for Recommendation
Prod2vec + metadata about the item.
2018
Deep neural network marketplace recommenders in online experiments
Combine content-based features with user behaviors to solve the cold-start challenge of collaborative filtering.
2019
How we use item2vec to recommend similar products
Avito’s item2vec implementation.
+30% contacts from i2i recommendations.
+20% contacts from personal recommendations.
[Diagram: item neural network. The text input is passed through word embeddings and an LSTM layer; the location, params, and category inputs each go through an embedding layer; the branches are concatenated and fed through dense layers (Dense, Dropout, Dense with ReLU, Dense with Tanh) to produce the item's representation in the embedding space.]
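A sketch of this multi-input network in PyTorch (vocabulary sizes, layer widths, and treating "params" as a single categorical input are simplifying assumptions):

import torch
import torch.nn as nn

class ItemNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.word_emb = nn.Embedding(30_000, 64)             # word embeddings for text
        self.text_lstm = nn.LSTM(64, 64, batch_first=True)   # LSTM layer
        self.loc_emb = nn.Embedding(1_000, 16)               # location embedding layer
        self.par_emb = nn.Embedding(5_000, 16)               # params embedding layer
        self.cat_emb = nn.Embedding(500, 16)                 # category embedding layer
        self.head = nn.Sequential(                           # dense layers from the diagram
            nn.Linear(64 + 3 * 16, 128),
            nn.Dropout(0.3),
            nn.Linear(128, 128), nn.ReLU(),
            nn.Linear(128, 64), nn.Tanh(),                   # final item embedding
        )

    def forward(self, text, location, params, category):
        t, _ = self.text_lstm(self.word_emb(text))
        t = t[:, -1]                                         # last LSTM state
        x = torch.cat([t, self.loc_emb(location),
                       self.par_emb(params), self.cat_emb(category)], dim=1)
        return self.head(x)

net = ItemNet()
emb = net(torch.randint(0, 30_000, (2, 50)),                 # token IDs
          torch.randint(0, 1_000, (2,)),
          torch.randint(0, 5_000, (2,)),
          torch.randint(0, 500, (2,)))
print(emb.shape)  # torch.Size([2, 64])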
Similarity Assumption
[Diagram: items a user interacts with close together in time form training pairs (Pair 1, Pair 2, Pair 3).]
Dot Product Loss Function
[Diagram: for an anchor item A, a co-occurring item P is a positive example and a sampled item N is a negative one; the dot products of their embeddings are trained with a cross-entropy loss,
cross_entropy([<A,P>, <A,N>, ...], [1, 0, ...]),
over the embedding space of the items dataset.]
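Expressed in PyTorch, this loss might look as follows (the batch size, embedding dimension, and random embeddings are placeholders):

import torch
import torch.nn.functional as F

# Hypothetical embeddings from the item network: a batch of anchors,
# one positive and one sampled negative per anchor.
anchor   = torch.randn(32, 64)
positive = torch.randn(32, 64)
negative = torch.randn(32, 64)

# Dot products act as similarity logits.
pos_logits = (anchor * positive).sum(dim=1)  # label 1
neg_logits = (anchor * negative).sum(dim=1)  # label 0

logits = torch.cat([pos_logits, neg_logits])
labels = torch.cat([torch.ones(32), torch.zeros(32)])

# cross_entropy([<A,P>, <A,N>, ...], [1, 0, ...]) from the slide,
# written as binary cross-entropy over dot-product logits.
print(F.binary_cross_entropy_with_logits(logits, labels))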
References
● E-commerce in Your Inbox: Product Recommendations at Scale (Grbovic et al, 2015)
● Meta-Prod2Vec: Product Embeddings Using Side-Information for Recommendation (Vasile et al, 2016)
● Deep neural network marketplace recommenders in online experiments (Eide et al, 2018)
● How we use item2vec to recommend similar products (Avito’s item2vec, in Russian)
Similarity Search
Search
Problem: find the nearest neighbors among millions of item embeddings.
Solution: approximate nearest neighbor (ANN) search (a sketch follows the list below).
Implementations:
● FAISS (Facebook)
● Annoy (Spotify)
● NMSLIB
● Open Distro for Elasticsearch
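As an example, a minimal FAISS sketch (random vectors stand in for real item embeddings; the index parameters are illustrative):

import numpy as np
import faiss

dim = 64
item_embeddings = np.random.rand(100_000, dim).astype("float32")

# Normalize so that inner product equals cosine similarity.
faiss.normalize_L2(item_embeddings)

# IVF index: clusters the space so a query scans only a few cells (approximate).
nlist = 100
quantizer = faiss.IndexFlatIP(dim)
index = faiss.IndexIVFFlat(quantizer, dim, nlist, faiss.METRIC_INNER_PRODUCT)
index.train(item_embeddings)
index.add(item_embeddings)

# Top-10 approximate nearest neighbors for one query item.
scores, ids = index.search(item_embeddings[:1], 10)
print(ids[0])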
Experimentation
Off-line
[Diagram: off-line experimentation balances model quality against development speed; the levers shown are the loss function, generic feature encoders, automated feature selection, domain-specific data (& model), and data cleaning.]
Pre-online
Internal testing framework to
● compare recommendations
● debug
On-line
[Chart: contacts effect over time across on-line experiments, ranging from -18% to +0.6%, +9%, and +16%.]
Iterate over one model in the corresponding domain.
Roll out to more domains & improve incrementally.
From POC to PRD
Automation
● AWS SageMaker
● AWS Batch
● Airflow (a hypothetical DAG sketch follows below)
Monitoring
● Training
● Inference
● Business
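As a sketch of how these pieces might be orchestrated, here is a hypothetical Airflow DAG that retrains daily and then runs batch inference (the DAG id, task names, and task bodies are assumptions):

from datetime import datetime
from airflow import DAG
from airflow.operators.python import PythonOperator

def train_model():
    ...  # e.g. launch an AWS SageMaker training job

def batch_inference():
    ...  # e.g. run inference on AWS Batch and publish recommendations

with DAG(
    dag_id="recsys_daily_retrain",
    start_date=datetime(2021, 3, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    train = PythonOperator(task_id="train", python_callable=train_model)
    infer = PythonOperator(task_id="batch_inference", python_callable=batch_inference)
    train >> infer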
Summary
Lessons Learned
● DL + RS = ❤
● Start simple + iterate
● Learn from your users (A/B-testing)
Q&A
Thank you!
