This document provides an overview of deep recommender systems and some of their shortcomings. It discusses neural network architectures like NeuMF, Wide&Deep, Neural FM, DeepFM, and DSCF that have been applied to recommendation. It also covers sequential recommendation methods, optimization techniques, and challenges like short-term rewards, manually designed architectures, isolated data, and security issues like poisoning attacks.
1. Fundamentals of Deep Recommender Systems
Data Science and Engineering Lab
Tutorial website: https://advanced-recommender-systems.github.io/ijcai2021-tutorial/
Wenqi Fan
The Hong Kong Polytechnic University
https://wenqifan03.github.io, wenqifan@polyu.edu.hk
2. A General Architecture of Deep Recommender System
[Figure: sparse one-hot/multi-hot inputs across fields (Field 1 … Field m … Field M), covering User, Item, Context, and Interaction features.]
Embedding layer
Hidden layers (e.g., MLP, CNN, RNN, etc.)
Prediction layer
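The embedding, hidden, and prediction layers above can be sketched as a minimal NumPy forward pass; all sizes, weights, and the four-field input are hypothetical stand-ins, not taken from any particular model:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: feature-id vocabulary, embedding dim, hidden dim, fields.
vocab, d, h, n_fields = 100, 8, 16, 4

emb = rng.normal(0, 0.1, size=(vocab, d))               # embedding layer
W1, b1 = rng.normal(0, 0.1, size=(n_fields * d, h)), np.zeros(h)
w2, b2 = rng.normal(0, 0.1, size=h), 0.0                # prediction layer

def predict(feature_ids):
    """One active id per field -> embedding lookup -> ReLU hidden -> sigmoid."""
    x = emb[feature_ids].reshape(-1)                    # concatenate field embeddings
    hidden = np.maximum(x @ W1 + b1, 0.0)               # hidden layer (MLP variant)
    return 1.0 / (1.0 + np.exp(-(hidden @ w2 + b2)))

p = predict([3, 17, 42, 99])                            # one id per field
```

The same skeleton holds when the hidden block is swapped for a CNN or RNN; only the middle function changes.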
3. NeuMF
Neural Collaborative Filtering, WWW, 2017
NeuMF unifies the strengths of MF and MLP in modeling user-item interactions.
p MF uses an inner product as the interaction function.
p MLP is better suited to capture the complex structure of user interaction data.
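A minimal sketch of the NeuMF fusion, assuming random toy embeddings and a single ReLU layer for the MLP branch (the paper uses a deeper tower); the two branches keep separate embedding tables, as in the original model:

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, n_items, d = 100, 50, 8    # illustrative sizes

# Separate embedding tables for the GMF and MLP branches.
P_gmf, Q_gmf = rng.normal(size=(n_users, d)), rng.normal(size=(n_items, d))
P_mlp, Q_mlp = rng.normal(size=(n_users, d)), rng.normal(size=(n_items, d))
W1, b1 = rng.normal(size=(2 * d, d)), np.zeros(d)
h_fuse = rng.normal(size=2 * d)     # fusion weights over [GMF out; MLP out]

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def neumf_score(u, i):
    gmf = P_gmf[u] * Q_gmf[i]       # element-wise product (generalized MF part)
    mlp_in = np.concatenate([P_mlp[u], Q_mlp[i]])
    mlp = np.maximum(mlp_in @ W1 + b1, 0.0)            # MLP part
    return sigmoid(np.concatenate([gmf, mlp]) @ h_fuse)  # fused prediction

score = neumf_score(0, 3)
```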
4. Wide&Deep
Wide & Deep Learning for Recommender Systems, 1st DLRS, 2016
[Figure: embedding → concatenation → standard MLP (deep component), combined with the wide linear component.]
p The wide linear model can memorize seen feature interactions using cross-product feature transformations.
p The deep model can generalize to previously unseen feature interactions through low-dimensional embeddings.
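A toy forward pass illustrating the two paths; the dimensions, the single hidden layer, and the binary wide features are illustrative assumptions, not values from the paper:

```python
import numpy as np

rng = np.random.default_rng(1)
n_wide, n_deep_fields, d = 20, 3, 4   # illustrative sizes

w_wide = rng.normal(size=n_wide)               # weights over cross-product features
emb = rng.normal(size=(n_deep_fields, 10, d))  # one embedding table per field
W, b = rng.normal(size=(n_deep_fields * d, 8)), np.zeros(8)
w_out = rng.normal(size=8)

def wide_deep(x_wide, field_ids):
    """Joint prediction: wide path memorizes, deep path generalizes."""
    wide = w_wide @ x_wide                               # linear wide component
    deep_in = np.concatenate([emb[f, i] for f, i in enumerate(field_ids)])
    deep = np.maximum(deep_in @ W + b, 0.0) @ w_out      # MLP deep component
    return 1.0 / (1.0 + np.exp(-(wide + deep)))          # joint logistic output

p_click = wide_deep(np.ones(n_wide), [1, 2, 3])
```

Both components feed one logistic output, so they are trained jointly rather than as a two-model ensemble.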
5. Neural FM
Neural Factorization Machines for Sparse Predictive Analytics, SIGIR, 2017
Neural Factorization Machines (NFM) "deepen" FM by placing hidden layers above second-order feature interaction modeling of sparse categorical variables.
6. Neural FM
Neural Factorization Machines for Sparse Predictive Analytics, SIGIR, 2017
Neural Factorization Machines (NFM) "deepen" FM by placing hidden layers above second-order feature interaction modeling.
The deep layers then only need to learn higher-order feature interactions, which makes the network much easier to train.
Bilinear Interaction Pooling over the embeddings of the active categorical variables:
$$f_{BI}(\mathcal{V}_x) = \sum_{i=1}^{n}\sum_{j=i+1}^{n} x_i\mathbf{v}_i \odot x_j\mathbf{v}_j$$
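The pooling above can be computed in linear time via the FM sum-square identity, 0.5 * ((Σ x_i v_i)² − Σ (x_i v_i)²) element-wise; this sketch checks the direct double sum against that form on random toy data:

```python
import numpy as np

rng = np.random.default_rng(2)
n, d = 6, 4                  # illustrative: 6 features, embedding dim 4
x = rng.random(n)            # feature values (nonzero for active features)
V = rng.normal(size=(n, d))  # embedding vectors v_i

# Direct double sum over pairs i < j (the definition above).
direct = sum(x[i] * V[i] * x[j] * V[j] for i in range(n) for j in range(i + 1, n))

# Equivalent O(n*d) form used in the NFM paper, element-wise over dimensions.
xv = x[:, None] * V
fast = 0.5 * (xv.sum(axis=0) ** 2 - (xv ** 2).sum(axis=0))

assert np.allclose(direct, fast)
```

The pooled d-dimensional vector (not a scalar, unlike plain FM) is what the hidden layers above it consume.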
7. DeepFM
DeepFM: A Factorization-Machine based Neural Network for CTR Prediction, IJCAI, 2017
DeepFM ensembles FM and a DNN to model low- and high-order feature interactions simultaneously from the raw input features.
Prediction Model: $\hat{y} = \mathrm{sigmoid}(y_{FM} + y_{DNN})$
FM component (low-order); Deep component (high-order)
8. DeepFM
DeepFM: A Factorization-Machine based Neural Network for CTR Prediction, IJCAI, 2017
DeepFM ensembles FM and a DNN to model low- and high-order feature interactions simultaneously from the raw input features.
The FM component (low-order) and the deep component (high-order) share the same embedding layer.
Prediction Model: $\hat{y} = \mathrm{sigmoid}(y_{FM} + y_{DNN})$
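A toy sketch of the shared-embedding design: both components read the same embedding table, and their outputs are summed before the sigmoid. Sizes and the single hidden layer are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(3)
n_fields, vocab, d = 4, 10, 4     # illustrative sizes

emb = rng.normal(size=(vocab, d))  # embedding table SHARED by both components
w = rng.normal(size=vocab)         # first-order FM weights
W, b = rng.normal(size=(n_fields * d, 8)), np.zeros(8)
w_out = rng.normal(size=8)

def deepfm(feat_ids):
    e = emb[feat_ids]              # shared embeddings, one per field
    # FM: first-order term + pairwise interactions via the sum-square identity.
    fm = w[feat_ids].sum() + 0.5 * (e.sum(0) ** 2 - (e ** 2).sum(0)).sum()
    # Deep: the same embeddings, concatenated and passed through an MLP.
    deep = np.maximum(e.reshape(-1) @ W + b, 0.0) @ w_out
    return 1.0 / (1.0 + np.exp(-(fm + deep)))  # y = sigmoid(y_FM + y_DNN)

y = deepfm([0, 3, 5, 9])
```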
9. DSCF
Deep Social Collaborative Filtering, RecSys, 2019
Collaborative Filtering with users' social relations (Social Recommendation)
[Figure: example news item shared by Dr. Anthony Fauci: "US could see millions of coronavirus cases and 100,000 or more deaths."]
10. DSCF
Deep Social Collaborative Filtering, RecSys, 2019
Collaborative Filtering with users' social relations (Social Recommendation)
Users might be affected by direct/distant neighbors.
Ø Information diffusion
Ø Users with high reputations
[Figure: DSCF architecture. Item, user, and rating embeddings are concatenated in an Embedding Layer; a Random Walk Layer samples social sequences u[1], u[2], …, u[l]; a Sequence Learning Layer aggregates hidden states h1 … hL through attention networks (weights α1 … α4 and β1 … β4); an Output Layer produces the predicted rating r'.]
[Figure: example news item shared by Dr. Anthony Fauci: "US could see millions of coronavirus cases and 100,000 or more deaths."]
11. DSCF
Deep Social Collaborative Filtering, RecSys, 2019
Collaborative Filtering with users' social relations (Social Recommendation)
Users might be affected by direct/distant neighbors.
Ø Information diffusion
Ø Users with high reputations
Social sequences are sampled via random walk techniques.
Sequences are modeled with a Bi-LSTM with attention mechanisms.
[Figure: the same DSCF architecture, annotated with the random walk layer and the Bi-LSTM sequence learning layer.]
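The random-walk layer can be sketched as plain graph sampling; the social graph here is a hypothetical toy adjacency list, and the sampled sequence is what a Bi-LSTM would then consume:

```python
import random

# Toy social graph: user id -> list of friends (hypothetical data).
social = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2, 4], 4: [3]}

def random_walk(start, length, seed=None):
    """Sample a social sequence of `length` users starting from `start`,
    moving to a uniformly random friend at each step."""
    walker = random.Random(seed)
    walk = [start]
    while len(walk) < length:
        walk.append(walker.choice(social[walk[-1]]))
    return walk

seq = random_walk(0, 5, seed=42)   # e.g., reaches distant neighbors via friends
```

Because walks pass through friends-of-friends, distant neighbors naturally enter the sequence, matching the information-diffusion motivation above.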
12. DASO
Deep Adversarial Social Recommendation, IJCAI, 2019
Collaborative Filtering with users' social relations (Social Recommendation)
[Figure: a user's item-domain ratings (e.g., 3, 5, 1) alongside their social-domain connections.]
p Users behave and interact differently in the item and social domains.
13. DASO
Deep Adversarial Social Recommendation, IJCAI, 2019
Collaborative Filtering with users' social relations (Social Recommendation)
[Figure: a user's item-domain ratings (e.g., 3, 5, 1) alongside their social-domain connections.]
p Users behave and interact differently in the item and social domains.
Solution: learn separate user representations in the two domains.
14. DASO
Deep Adversarial Social Recommendation, IJCAI, 2019
Collaborative Filtering with users' social relations (Social Recommendation)
[Figure: a user's item-domain ratings (e.g., 3, 5, 1) alongside their social-domain connections.]
p Users behave and interact differently in the item and social domains.
Solution: learn separate user representations in the two domains.
Bidirectional knowledge transfer with cycle reconstruction.
15. Optimization for Ranking Tasks
p Negative Sampling's Main Issue:
• It often generates low-quality negative samples that do not help learn good representations.
Deep Adversarial Social Recommendation, IJCAI, 2019
16. Optimization for Ranking Tasks
p Negative Sampling's Main Issue:
• It often generates low-quality negative samples that do not help learn good representations [Cai and Wang, 2018; Wang et al., 2018b].
[Figure: "easy"/unrelated vs. informative, "difficult" negative samples, e.g., a dress for a male user or men's wallets for a female user are "easy"/unrelated negatives.]
Deep Adversarial Social Recommendation, IJCAI, 2019
17. Optimization for Ranking Tasks
p Negative Sampling's Main Issue:
• It often generates low-quality negative samples that do not help learn good representations [Cai and Wang, 2018; Wang et al., 2018b].
[Figure: "easy"/unrelated vs. informative, "difficult" negative samples, e.g., a dress for a male user or men's wallets for a female user are "easy"/unrelated negatives.]
Solution: dynamically generate "difficult" negative samples via optimization with adversarial learning (GAN).
Deep Adversarial Social Recommendation, IJCAI, 2019
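One way to realize "difficult" negative sampling, sketched here as softmax sampling over current model scores rather than a full GAN generator; embeddings and sizes are random toy assumptions:

```python
import numpy as np

rng = np.random.default_rng(7)
n_items, d = 10, 4                     # illustrative sizes
item_emb = rng.normal(size=(n_items, d))

def sample_hard_negative(u_emb, pos_item):
    """Draw a negative item in proportion to softmax(score), so highly
    scored ("difficult") items are picked far more often than random,
    unrelated ("easy") ones."""
    scores = item_emb @ u_emb
    scores[pos_item] = -np.inf         # never sample the observed positive
    p = np.exp(scores - scores.max())  # stable softmax
    p /= p.sum()
    return rng.choice(n_items, p=p)

neg = sample_hard_negative(rng.normal(size=d), pos_item=3)
```

In a full adversarial setup, the sampler itself is a trained generator that adapts as the discriminator (the recommender) improves, rather than this fixed softmax.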
18. DASO
Deep Adversarial Social Recommendation, IJCAI, 2019
[Figure: the DASO framework. Item-domain adversarial learning: a generator samples items for a user from p(v|u), and a discriminator, trained on a loss/reward signal, distinguishes these generated samples from real user-item interactions. Social-domain adversarial learning does the same for user-user connections via p(uk|u). Cyclic user modeling maps user representations between the two domains (hI->S and hS->I), yielding transferred representations UIS and USI alongside the in-domain representations UI and US used by the generators and discriminators.]
20. Item Domain Discriminator Model
Deep Adversarial Social Recommendation, IJCAI, 2019
p Discriminator
Goal: distinguish real user-item pairs (i.e., real samples) from the generated "fake" samples (relevant items picked by the generator).
Score function: a sigmoid over a user-item relevance score.
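A minimal sketch of such a discriminator, assuming an inner-product score function (the actual DASO score function may differ) and a standard binary cross-entropy objective:

```python
import numpy as np

def d_score(u_emb, v_emb, bias=0.0):
    """User-item relevance score; an inner product plus bias is assumed here."""
    return u_emb @ v_emb + bias

def d_prob(u_emb, v_emb, bias=0.0):
    """Probability that (u, v) is a real observed interaction: sigmoid of score."""
    return 1.0 / (1.0 + np.exp(-d_score(u_emb, v_emb, bias)))

def d_loss(real_pairs, fake_pairs):
    """Binary cross-entropy: push real pairs toward 1, generated pairs toward 0."""
    eps = 1e-12
    loss = -sum(np.log(d_prob(u, v) + eps) for u, v in real_pairs)
    loss -= sum(np.log(1.0 - d_prob(u, v) + eps) for u, v in fake_pairs)
    return loss

u, v = np.array([1.0, 0.0]), np.array([1.0, 1.0])
loss = d_loss([(u, v)], [(u, -v)])
```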
21. Item Domain Generator Model
p Generator Model
Goal:
1. Approximate the underlying real conditional distribution pI_real(v|ui).
2. Generate (select/sample) the most relevant items for any given user ui.
The generator conditions on the transferred user representation from the social domain.
Optimization with policy gradient, since sampling discrete items is non-differentiable.
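A sketch of the generator's item distribution and its REINFORCE-style policy gradient, assuming a softmax over inner products; `u_rep` stands in for the transferred user representation, and all embeddings are random toy values:

```python
import numpy as np

rng = np.random.default_rng(4)
n_items, d = 5, 3                  # illustrative sizes
item_emb = rng.normal(size=(n_items, d))

def generator_dist(u_rep):
    """Softmax over all items, approximating p_real(v|u)."""
    logits = item_emb @ u_rep
    e = np.exp(logits - logits.max())  # stable softmax
    return e / e.sum()

def reinforce_grad(u_rep, sampled_item, reward):
    """REINFORCE estimate: reward * grad_{u_rep} log p(v|u_rep).
    For a softmax over inner products, grad log p(v|u) = e_v - E_p[e]."""
    p = generator_dist(u_rep)
    return reward * (item_emb[sampled_item] - p @ item_emb)

u_rep = np.array([1.0, 0.0, -1.0])
probs = generator_dist(u_rep)
grad = reinforce_grad(u_rep, sampled_item=2, reward=1.0)
```

The reward would come from the discriminator's score of the sampled pair, closing the adversarial loop.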
22. Sequential (Session-based) Recommendation
Session-based Recommendations with Recurrent Neural Networks, ICLR, 2016.
BERT4Rec: Sequential Recommendation with Bidirectional Encoder Representations from Transformer, CIKM, 2019.
[Figure: a user's sequential behavior is fed to the model, which scores candidate next items (e.g., 0.8, 0.6, 0.1).]
23. Sequential (Session-based) Recommendation
Session-based Recommendations with Recurrent Neural Networks, ICLR, 2016.
BERT4Rec: Sequential Recommendation with Bidirectional Encoder Representations from Transformer, CIKM, 2019.
GRU-based sequential recommendation method (GRU4Rec)
[Figure: a user's sequential behavior is fed to the model, which scores candidate next items (e.g., 0.8, 0.6, 0.1).]
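A single-layer GRU over a toy session, scoring every item as the next click; all weights are random and untrained, so this only illustrates the shapes and data flow, not GRU4Rec's training objective:

```python
import numpy as np

rng = np.random.default_rng(5)
n_items, d, h = 8, 4, 6            # illustrative sizes
emb = rng.normal(size=(n_items, d))
Wz, Uz = rng.normal(size=(d, h)), rng.normal(size=(h, h))
Wr, Ur = rng.normal(size=(d, h)), rng.normal(size=(h, h))
Wh, Uh = rng.normal(size=(d, h)), rng.normal(size=(h, h))
W_out = rng.normal(size=(h, n_items))

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru4rec_scores(session):
    """Run a GRU over the clicked-item sequence; score all items as next click."""
    s = np.zeros(h)
    for item in session:
        x = emb[item]
        z = sigmoid(x @ Wz + s @ Uz)           # update gate
        r = sigmoid(x @ Wr + s @ Ur)           # reset gate
        s_tilde = np.tanh(x @ Wh + (r * s) @ Uh)
        s = (1 - z) * s + z * s_tilde          # new session state
    return s @ W_out                           # one score per candidate item

scores = gru4rec_scores([0, 3, 5])
```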
24. Sequential (Session-based) Recommendation
Session-based Recommendations with Recurrent Neural Networks, ICLR, 2016.
BERT4Rec: Sequential Recommendation with Bidirectional Encoder Representations from Transformer, CIKM, 2019.
GRU-based sequential recommendation method (GRU4Rec)
BERT4Rec: stacked Transformer layers applied bidirectionally over the user's sequential behavior.
[Figure: next-item scores (e.g., 0.8, 0.6, 0.1).]
26. Shortcomings of Existing Deep Recommender Systems
Recommendation Policies
§ Offline optimization
§ Short-term reward
Graph-structured Data
§ Information isolated island issue: ignores implicit/explicit relationships among instances
[Figure: a table of user-item interaction pairs.]
27. Shortcomings of Existing Deep Recommender Systems
Recommendation Policies
§ Offline optimization
§ Short-term reward
Manually Designed Architectures
§ Expert knowledge
§ Time and engineering efforts
Graph-structured Data
§ Information isolated island issue: ignores implicit/explicit relationships among instances
[Figures: one-hot input fields (Field 1 … Field m … Field M); a table of user-item interaction pairs.]
28. Shortcomings of Existing Deep Recommender Systems
Recommendation Policies
§ Offline optimization
§ Short-term reward
Manually Designed Architectures
§ Expert knowledge
§ Time and engineering efforts
Graph-structured Data
§ Information isolated island issue: ignores implicit/explicit relationships among instances
Poisoning attacks:
§ Promote/demote items
§ White/grey/black-box attacks
[Figures: one-hot input fields; a table of user-item interaction pairs; an attacker injecting fake user profiles into the normal data to promote/demote items.]
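A naive promotion attack can be sketched as injecting fake profiles that give the target item the maximum rating plus a few random "filler" ratings to look normal; the rating matrix, target item, and attack parameters here are entirely hypothetical:

```python
import numpy as np

rng = np.random.default_rng(6)

# Toy rating matrix: rows = genuine users, columns = items (hypothetical data).
normal_data = rng.integers(1, 6, size=(20, 10)).astype(float)
target_item = 7

def inject_fake_profiles(ratings, target, n_fake, max_rating=5.0):
    """Promotion attack sketch: each fake profile max-rates the target item
    and rates three random filler items to mimic a normal user."""
    fakes = np.zeros((n_fake, ratings.shape[1]))
    other_items = [i for i in range(ratings.shape[1]) if i != target]
    for profile in fakes:
        profile[target] = max_rating
        fillers = rng.choice(other_items, size=3, replace=False)
        profile[fillers] = rng.integers(1, 6, size=3)
    return np.vstack([ratings, fakes])

poisoned = inject_fake_profiles(normal_data, target_item, n_fake=5)
```

This is the white-box attacker's simplest strategy; grey/black-box variants must craft profiles with less (or no) knowledge of the recommender and its data.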