Deep Recommender Systems
Gabriel Moreira • @gspmoreira
Lead Data Scientist, PhD Candidate
PAPIs.io LatAm 2018
Introduction
"We are leaving the Information Age and entering the Recommendation Age."
Chris Anderson, "The Long Tail"
Introduction
Recommender Systems are responsible for:
● 37% of sales
● 2/3 of watched movies
● 38% of top news visualizations
Recommender System Methods
Collaborative Filtering based on Matrix Factorization
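In matrix factorization CF, a predicted rating is the dot product of a user's and an item's latent factor vectors. A minimal NumPy sketch with toy factor matrices (the values are illustrative and assumed already learned, e.g. by ALS or SGD):

```python
import numpy as np

# Toy latent factors (2 users x 3 items, k=2 latent dimensions)
P = np.array([[1.0, 0.2],
              [0.1, 0.9]])          # user factors
Q = np.array([[1.0, 0.0],
              [0.8, 0.3],
              [0.0, 1.0]])          # item factors

# Predicted rating matrix: each entry is the dot product p_u . q_i
R_hat = P @ Q.T
print(R_hat.shape)   # (2, 3)
print(R_hat[0, 0])   # prediction for user 0, item 0 -> 1.0
```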
Going deeper...
Why does Deep Learning have potential for RecSys?
● Feature extraction directly from the content (e.g., image, text, audio)
● Heterogeneous data handled easily
● Dynamic behaviour modeling with RNNs
● More accurate representation learning of users and items
○ Natural extensions of CF
● RecSys is a complex domain
○ Deep learning worked well in other complex domains
The Deep Learning era of RecSys
● 2007 - Restricted Boltzmann Machines for rating prediction
● 2015 - calm before the storm; a few seminal papers
● 2016 - First DLRS workshop and papers at RecSys, KDD, SIGIR
● 2017-2018 - Continued increase
Research directions in DL-RecSys
● Learning item embeddings
● Deep Collaborative Filtering
● Feature extraction directly from the content
● Session-based recommendations with RNNs
...and their combinations
Learning Item Embeddings
Item embeddings
● Embedding: a (learned) real value vector representing an entity
● Also known as Latent feature vector / (Latent) representation
● Similar entities’ embeddings are similar
● Use in recommenders:
○ Initialization of item representation in more advanced
algorithms
● Item-to-item recommendations
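Item-to-item recommendation with embeddings reduces to a nearest-neighbour search in the embedding space. A minimal NumPy sketch with toy vectors (the item names and embedding values are illustrative only):

```python
import numpy as np

# Toy item embeddings (one row per item), e.g. learned by a recommender
items = ["shoe", "sneaker", "laptop"]
E = np.array([
    [0.9, 0.1, 0.0],   # shoe
    [0.8, 0.2, 0.1],   # sneaker: close to shoe
    [0.0, 0.1, 0.9],   # laptop: far from both
])

def most_similar(item_idx, E, k=1):
    """Return indices of the k nearest items by cosine similarity."""
    norms = np.linalg.norm(E, axis=1)
    sims = E @ E[item_idx] / (norms * norms[item_idx])
    sims[item_idx] = -np.inf          # exclude the query item itself
    return np.argsort(-sims)[:k]

print(items[most_similar(0, E)[0]])   # nearest neighbour of "shoe"
```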
Learning Item Embeddings
Prod2Vec (Grbovic et al., 2015)
● Based on word2vec and paragraph2vec
● Skip-gram model on products
○ Input: i-th product purchased by the user
○ Context: the other purchases of the user
● Learning user representation
○ Follows paragraph2vec
○ User embedding added as global context
○ Input: user + products purchased except for
the i-th
○ Target: i-th product purchased by the user
User embeddings are used to produce predictions
prod2vec skip-gram model
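The skip-gram setup above can be illustrated by how training pairs are generated from a purchase sequence. A minimal sketch (the `window` parameter is an assumption for illustration; Prod2Vec itself may treat all of a user's other purchases as context):

```python
def skipgram_pairs(purchases, window=2):
    """Generate (input, context) training pairs from one user's
    purchase sequence, as in a skip-gram model over products."""
    pairs = []
    for i, target in enumerate(purchases):
        for j in range(max(0, i - window), min(len(purchases), i + window + 1)):
            if j != i:
                pairs.append((target, purchases[j]))
    return pairs

print(skipgram_pairs(["p1", "p2", "p3"], window=1))
# -> [('p1', 'p2'), ('p2', 'p1'), ('p2', 'p3'), ('p3', 'p2')]
```

Each pair trains the model to predict a co-purchased product from the target product, which is what makes nearby products end up with similar embeddings.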
Feature extraction from content
for Recommender Systems
Feature extraction from unstructured data
● Images: CNN
● Text: 1D CNN, RNNs, weighted word embeddings
● Audio/Music: CNN, RNN
Content Features in Hybrid Recommenders
● Initializing
○ Obtain item representation based on metadata
○ Use this representation as initial item features
● Regularizing
○ Obtain metadata-based representations
○ The interaction-based representation should be close to the metadata-based one (the difference is added as a regularizing term to the loss)
● Joining
○ Have the item feature vector be a concatenation of:
■ a fixed metadata-based part
■ a learned interaction-based part
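The "regularizing" and "joining" strategies above can be sketched in NumPy (the function names and the `lam` weight are illustrative, not from any specific system):

```python
import numpy as np

def regularized_item_loss(v_interaction, v_metadata, base_loss, lam=0.1):
    """'Regularizing' hybrid: penalize the distance between the
    interaction-based item vector and its metadata-based vector."""
    return base_loss + lam * np.sum((v_interaction - v_metadata) ** 2)

def joined_item_vector(v_metadata_fixed, v_learned):
    """'Joining' hybrid: concatenate a fixed metadata-based part
    with a learned interaction-based part."""
    return np.concatenate([v_metadata_fixed, v_learned])

v_meta = np.array([1.0, 0.0])
v_cf = np.array([0.8, 0.1])
print(regularized_item_loss(v_cf, v_meta, base_loss=0.5, lam=0.1))
print(joined_item_vector(v_meta, v_cf).shape)   # (4,)
```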
Wide & Deep Learning
Wide & Deep Learning (Cheng et al., 2016)
● Joint training of two models
○ Deep Neural Network - focused on generalization
○ Linear Model - focused on memorization
● Improved online performance
○ +2.9% deep over wide
○ +3.9% wide & deep over wide
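The joint scoring can be sketched as the sum of a linear (wide) logit and a DNN (deep) logit passed through a sigmoid. A minimal NumPy sketch with random toy weights (shapes and names are illustrative):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def wide_and_deep_predict(x_wide, x_deep, w_wide, W1, w_out, b=0.0):
    """Sketch of joint Wide & Deep scoring: the final logit is the sum
    of the linear (wide) part and a one-hidden-layer DNN (deep) part."""
    wide_logit = x_wide @ w_wide
    hidden = np.maximum(0.0, x_deep @ W1)   # ReLU hidden layer
    deep_logit = hidden @ w_out
    return sigmoid(wide_logit + deep_logit + b)

rng = np.random.default_rng(0)
x_wide = rng.normal(size=5)    # e.g. crossed sparse features
x_deep = rng.normal(size=8)    # e.g. dense embeddings
p = wide_and_deep_predict(x_wide, x_deep,
                          rng.normal(size=5),
                          rng.normal(size=(8, 4)),
                          rng.normal(size=4))
print(0.0 < p < 1.0)  # True: a valid click probability
```

In TensorFlow this joint training is what the `DNNLinearCombinedClassifier` estimator mentioned below packages up.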
Deep Collaborative Filtering
Outbrain Click Prediction - Kaggle competition
Dataset
● Sample of users' page views and clicks during 14 days in June 2016
● 2 Billion page views
● 17 million click records
● 700 Million unique users
● 560 sites
Can you predict which recommended content each user will click?
Wide & Deep Model example
Outbrain Click Prediction - Data Model
Numerical
Spatial
Temporal
Categorical
Target
Wide & Deep Model example
Source: https://github.com/gabrielspmoreira/kaggle_outbrain_click_prediction_google_cloud_ml_engine
DNNLinearCombinedClassifier estimator
Wide & Deep Model example
Source: https://github.com/gabrielspmoreira/kaggle_outbrain_click_prediction_google_cloud_ml_engine
Wide and Deep features
Wide & Deep Model example
Source: https://github.com/gabrielspmoreira/kaggle_outbrain_click_prediction_google_cloud_ml_engine
POC Results
Framework / Platform          | Model             | Mean Average Precision (MAP)
Vowpal Wabbit                 | Linear            | 0.6751
TensorFlow / Google ML Engine | Linear (Wide)     | 0.6779
TensorFlow / Google ML Engine | Deep model        | 0.6674
TensorFlow / Google ML Engine | Wide & Deep model | 0.6765
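The MAP metric in the table can be illustrated for the single-click-per-display case (as in the Outbrain competition), where average precision reduces to the reciprocal rank of the clicked item:

```python
def average_precision(ranked_items, clicked):
    """AP for one recommendation list with a single clicked item:
    1 / (rank of the clicked item), 0 if it is absent."""
    for rank, item in enumerate(ranked_items, start=1):
        if item == clicked:
            return 1.0 / rank
    return 0.0

def mean_average_precision(rankings):
    """MAP over (ranked_list, clicked_item) pairs."""
    return sum(average_precision(r, c) for r, c in rankings) / len(rankings)

data = [(["a", "b", "c"], "a"),   # clicked item at rank 1 -> AP = 1.0
        (["a", "b", "c"], "c")]   # clicked item at rank 3 -> AP = 1/3
print(mean_average_precision(data))  # -> 0.666...
```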
Going even deeper...
News Recommender Systems
using Deep Learning
News Recommender Systems
News RS Challenges
● Sparse user profiling (Li et al., 2011; Lin et al., 2014; Peláez et al., 2016)
● Fast-growing number of items (Peláez et al., 2016; Mohallick & Özgöbek, 2017)
● Accelerated decay of item value (Das et al., 2007)
● Users' preference shifts (Peláez et al., 2016; Epure et al., 2017)
Research Objective
To investigate, design, implement, and evaluate a deep learning meta-architecture for news recommendation, in order to improve the accuracy of recommendations provided by news portals, satisfying readers' dynamic information needs in such a challenging recommendation scenario.
Conceptual model of factors affecting news relevance
● News article
○ Static properties: topics, entities, publisher
○ Dynamic properties: recency, popularity
● User
○ Current context: time, location, device, referrer
○ User interests: long-term and short-term
● Global factors: seasonality, breaking events, popular topics
[Chart] Recency factor influence: clicks by article age (days)
Meta-Architecture Requirements
● RQ1 - to provide personalized news recommendations in extreme cold-start scenarios, as most news items are fresh and most users cannot be identified
● RQ2 - to use Deep Learning to automatically learn news representations from textual content and news metadata, minimizing the need for manual feature engineering
● RQ3 - to leverage the user session information, as the sequence of
interacted news may indicate the user’s short-term preferences for
session-based recommendations
● RQ4 - to leverage user's past sessions information, when available, to
model long-term interests for session-aware recommendations
Meta-Architecture Requirements
● RQ5 - to leverage users' contextual information as a rich data source, given the scarcity of information about the user
● RQ6 - to explicitly model contextual news properties – popularity and recency – as these are important factors in the news interest life cycle
● RQ7 - to support an increasing number of new items and users by
incremental model retraining (online learning), without the need to
retrain on the whole historical dataset
● RQ8 - to provide a modular structure for news recommendation,
allowing its modules to be instantiated by different and increasingly
advanced neural network architectures and methods
CHAMELEON Meta-Architecture for News RS
[Diagram] The meta-architecture has two modules:
● Article Content Representation (ACR) - runs when a news article is published
○ Inputs: content word embeddings (e.g. "New York is a multicultural city, ...") and article metadata attributes (publisher, category, tags, entities)
○ Sub-modules: Textual Features Representation (TFR) and Metadata Prediction (MP)
○ Output: the Article Content Embedding
● Next-Article Recommendation (NAR) - runs when a user reads a news article
○ Inputs: article content embeddings, article context (popularity, recency), user context (time, location, device), user interactions (past read articles), active sessions and users' past sessions
○ Sub-modules: Contextual Article Representation (CAR), Session Representation (SR) and Recommendations Ranking (RR)
○ Intermediate outputs: User-Personalized Contextual Article Embedding and Predicted Next-Article Embedding
○ Output: recommended articles, ranked among candidate next articles (positive and negative) for the active article
CHAMELEON - ACR module
[Diagram] ACR module detail: from the content word embeddings (e.g. "New York is a multicultural city, ...") and the article metadata attributes (category, tags, entities), the Textual Features Representation (TFR) and Metadata Prediction (MP) sub-modules produce the Article Content Embedding when a news article is published.
t-SNE visualization of article embeddings colored by category, with similar articles highlighted (e.g. "Sports", "Berlin")
Article Embeddings Similarity Evaluation (e.g. "sports")
Article Content (Title | Kicker | Description) - Similarity
● Reference: "Commentary: Generation Courage | Commentary by Alexander Wölffing on the German handball team at the European Championship | The offensive approach of the German Handball Federation under Bob Hanning is paying off. Alexander Wölffing, deputy editor-in-chief and head of sports at SPORT1, comments." - (reference)
● "'Seahawks will work Brady over' | NFL legend Hines Ward analyzes the Super Bowl | Hines Ward favors Seattle. But the former MVP of the final sees Vollmer and co., as well as an X-factor, as the Patriots' chance. He analyzes the Super Bowl for SPORT1." - 0.970
● "No more 'Hack-a-Drummond' soon? | Fouls as a tactical tool: NBA wants to change the 'hack-a-player' rule | Andre Drummond and all poor free-throw shooters in the NBA will be pleased: the league wants to penalize tactical fouls differently so as not to ruin the game." - 0.964
● "Number 2: Leno comes to terms with it | Bernd Leno talks about Manuel Neuer in a SPORT1 interview | Bayer Leverkusen's Bernd Leno comments for SPORT1 on the goalkeeper situation in the German national team and on role model Manuel Neuer, and also talks about mistakes and frustration." - 0.962
CHAMELEON - NAR module
[Diagram] When a user reads a news article: the active article's content embedding, the article context (popularity, recency) and the user context (time, location, device) feed the Contextual Article Representation (CAR) sub-module, producing a User-Personalized Contextual Article Embedding. The Session Representation (SR) sub-module models the active session (past read articles) to output a Predicted Next-Article Embedding, and the Recommendations Ranking (RR) sub-module ranks candidate next articles (positive and negative) to produce the recommended articles. Active sessions and users' past sessions are kept in data repositories.
CHAMELEON - NAR module
Sessions in a batch: S1 = (I1,1 ... I1,5), S2 = (I2,1, I2,2), S3 = (I3,1, I3,2, I3,3)
For each session the input is the sequence of read articles and the expected output (labels) is the same sequence shifted by one position, e.g. for S1:
Input: I1,1 I1,2 I1,3 I1,4
Expected output (labels): I1,2 I1,3 I1,4 I1,5
CHAMELEON - NAR module
Negative Sampling strategy - negative samples are drawn from:
● a buffer of articles read by all users overall in the last hour
● articles read in other user sessions in the batch
The positive sample is the next article read by the user in their session.
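The sampling strategy above can be sketched as follows (the function name and the fixed seed are illustrative):

```python
import random

def sample_negatives(recent_clicks_buffer, batch_session_items, positive, n=3, seed=42):
    """Sketch of negative sampling: candidates come from the global
    buffer of recently clicked articles plus articles read in other
    sessions of the batch, excluding the true next click."""
    candidates = (set(recent_clicks_buffer) | set(batch_session_items)) - {positive}
    rng = random.Random(seed)
    return rng.sample(sorted(candidates), min(n, len(candidates)))

negs = sample_negatives(["a1", "a2", "a3"], ["a4", "a5"], positive="a2", n=2)
print(negs)
```

Sampling negatives from recent clicks biases the model toward discriminating among currently relevant articles, which matters given the fast decay of news value.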
CHAMELEON - NAR module
Recommendations Ranking (RR) sub-module:
● Eq. 4 - Relevance score of an item for a user session
● Eq. 5 - Cosine similarity
● Eq. 6 - Softmax over relevance scores (HUANG et al., 2013)
● Eq. 7 - Loss function (HUANG et al., 2013)
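The equation bodies are not reproduced on the slide; based on the cited DSSM formulation (Huang et al., 2013), they presumably take the following form, where $\mathbf{s}$ is the predicted session embedding, $i$ ranges over the candidate articles $D$, and $\gamma$ is a smoothing factor:

```latex
% Eq. 4/5 - relevance score as cosine similarity
R(s, i) = \cos(\mathbf{s}, \mathbf{i}) =
  \frac{\mathbf{s}^{\top}\mathbf{i}}{\lVert\mathbf{s}\rVert \,\lVert\mathbf{i}\rVert}

% Eq. 6 - softmax over relevance scores
P(i \mid s) = \frac{\exp\bigl(\gamma \, R(s, i)\bigr)}
                   {\sum_{i' \in D} \exp\bigl(\gamma \, R(s, i')\bigr)}

% Eq. 7 - negative log-likelihood of the clicked (positive) items
\mathcal{L} = -\log \prod_{(s,\, i^{+})} P(i^{+} \mid s)
```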
CHAMELEON - NAR module loss function
Recommendation loss function implemented in TensorFlow
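The actual implementation is in TensorFlow in the linked repository; as a minimal NumPy sketch of the same idea (cosine relevance scores over one positive and sampled negatives, softmax, negative log-likelihood):

```python
import numpy as np

def ranking_loss(session_emb, pos_item_emb, neg_item_embs, gamma=10.0):
    """Cross-entropy over the positive item vs. sampled negatives,
    with cosine similarity as the relevance score."""
    def cos(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
    scores = np.array([cos(session_emb, pos_item_emb)] +
                      [cos(session_emb, e) for e in neg_item_embs])
    logits = gamma * scores
    log_softmax = logits - np.log(np.sum(np.exp(logits)))
    return -log_softmax[0]   # positive item is at index 0

s = np.array([1.0, 0.0])
loss = ranking_loss(s, pos_item_emb=np.array([0.9, 0.1]),
                    neg_item_embs=[np.array([0.0, 1.0]), np.array([-1.0, 0.2])])
print(loss >= 0.0)  # True: the negative log-likelihood is non-negative
```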
CHAMELEON - Offline evaluation protocol
● Predict the next article the user will read in a session
● Train on interactions up to a given hour, then evaluate the predictions of next-item clicks on the sessions of the following hour
● Ranking metrics
○ Recall@3 - scores when the actually clicked article is among the top-3 items in the recommended ranking list
○ NDCG@3 - takes into account the predicted position of the actually clicked item (e.g. position 1 scores higher than position 2)
● Evaluation benchmarks
○ Popular Recent - keeps a buffer of all clicks in the last hour and returns the most popular articles in that time window
○ Co-occurrent - recommends the articles most commonly read together in user sessions
○ Content-Based - returns the articles whose content embeddings are most similar
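The two ranking metrics can be computed as follows for a single session click (since there is exactly one relevant item, the ideal DCG is 1 and needs no extra normalization):

```python
import math

def recall_at_k(ranked_items, clicked, k=3):
    """1 if the clicked article is among the top-k recommendations, else 0."""
    return 1.0 if clicked in ranked_items[:k] else 0.0

def ndcg_at_k(ranked_items, clicked, k=3):
    """DCG with a single relevant item: 1/log2(rank+1) if it is in the
    top-k; the ideal DCG (clicked item at rank 1) equals 1."""
    for rank, item in enumerate(ranked_items[:k], start=1):
        if item == clicked:
            return 1.0 / math.log2(rank + 1)
    return 0.0

ranking = ["a7", "a2", "a9", "a1"]
print(recall_at_k(ranking, "a2"))  # -> 1.0 (clicked item in top-3)
print(ndcg_at_k(ranking, "a2"))    # -> 0.63... (rank 2: 1/log2(3))
```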
CHAMELEON - Preliminary Results
Recall@3 distribution over time
CHAMELEON - Preliminary Results
NDCG@3 distribution over time
CHAMELEON - Preliminary Results
Recall@3 distribution: 60.5%, 66.5%, 57%
CHAMELEON - Preliminary Results
Recall@3 by hour
CHAMELEON - Preliminary Results
Recall@3 by day
Next steps
● ACR module
○ Try different word embeddings (Word2Vec, GloVe)
○ Try different textual feature extractors (CNNs, RNNs) to generate article embeddings
● NAR module
○ Test different RNN architectures (LSTM, GRU)
○ Try different negative sampling strategies
○ Leverage users' past sessions to initialize RNNs for new sessions
○ Evaluate on different (and larger) news datasets
○ Hyperparameter tuning
References
https://www.slideshare.net/balazshidasi/deep-learning-in-recommender-systems-recsys-summer-school-2017
Resources
https://github.com/datadynamo/strata_ny_2017_recommender_tutorial
Resources
Deskdrop dataset shared on Kaggle
bit.ly/dataset4recsys
● 12 months of logs (Mar. 2016 - Feb. 2017)
● ~73k user interactions
● ~3k articles shared on the platform
Gabriel Moreira
Lead Data Scientist
gabrielpm@ciandt.com
@gspmoreira
We are hiring!
http://bit.ly/ds4ciandt

Deep Recommender Systems - PAPIs.io LATAM 2018
