Deep learning to the rescue - solving long standing problems of recommender systems

Balázs Hidasi
Balázs HidasiHead of Data Mining and Research at Gravity R&D
Deep learning to the rescue
solving long standing problems of recommender systems
Balázs Hidasi
@balazshidasi
Budapest RecSys & Personalization meetup
12 May, 2016
What is deep learning?
• A class of machine learning algorithms
 that use a cascade of multiple non-linear
processing layers
 and complex model structures
 to learn different representations of the
data in each layer
 where higher level features are derived
from lower level features
 to form a hierarchical representation.
• Key component of recent technologies
 Speech recognition
 Personal assistants (e.g. Siri, Cortana)
 Computer vision, object recognition
 Machine translation
 Chatbot technology
 Face recognition
 Self driving cars
• An efficient tool for certain complex
problems
 Pattern recognition
 Computer vision
 Natural language processing
 Speech recognition
• Deep learning is NOT
 the true AI
o it may be a component of it when and
if AI is created
 how the human brain works
 the best solution to every machine
learning tasks
Deep learning in the news
Why is deep learning happening now?
• Actually it is not   first papers published in 1970s
• Third resurgence of neural networks
 Research breakthroughs
 Increase in computational power
 GP GPUs
Problem Solution
Vanishing
gradients
Sigmoid type activation functions easily saturate.
Gradients are small, in deeper layers updates become
almost zero.
Earlier: layer-by-layer pretraining
Recently: non-saturating
activation functions
Gradient
descent
First order methods (e.g. SGD) are easily stuck.
Second order methods are infeasible on larger data.
Adaptive training: adagrad, adam,
adadelta, RMSProp
Nesterov momentum
Regularization Networks easily overfit (even with L2 regularization). Dropout
ETC...
Challenges in RecSys
• Recommender systems ≠ Netflix challenge
 Rating prediction  Top-N recommendation (ranking)
 Explicit feedback  Implicit feedback
 Long user histories  Sessions
 Slowly changing taste  Goal oriented browsing
 Item-to-user only  Other scenarios
• Success of CF
 Human brain is a powerful feature extractor
 Cold-start
o CF can’t be used
o Decisions are rarely made on metadata
o But rather on what the user sees: e.g. product image, content itself
Domain
dependent
Session-based recommendations
• Permanent cold start
 User identification
o Possible but often not reliable
 Intent/theme
o What the user needs?
o Theme of the session
 Never/rarely returning users
• Workaround in practice
 Item-to-item recommendations
o Similar items
o Co-occurring items
 Non-personalized
 Not adaptive
Recurrent Neural Networks
• Hidden state
 Next hidden state depends on the input and the actual hidden state (recurrence)
 ℎ 𝑡 = tanh 𝑊𝑥 𝑡 + 𝑈ℎ 𝑡−1
• „Infinite depth”
• Backpropagation Through Time
• Exploding gradients
 Due to recurrence
 If the spectral radius of U > 1 (necessary)
• Lack of long term memory (vanishing gradients)
 Gradients of earlier states vanish
 If the spectral radius of U < 1 (sufficient)
ℎ
𝑥𝑡 ℎ 𝑡
ℎ
ℎ 𝑡 𝑥𝑡
ℎℎ 𝑡−1
𝑥𝑡−1
ℎℎ 𝑡−2
𝑥𝑡−2
ℎℎ 𝑡−3
𝑥𝑡−3
ℎ 𝑡−4
Advanced RNN units
• Long Short-Term Memory (LSTM)
 Memory cell (𝑐𝑡) is the mix of
o its previous value (governed by the
forget gate (𝑓𝑡))
o the cell value candidate (governed
by the input gate (𝑖 𝑡))
 Cell value candidate ( 𝑐𝑡) depends on
the input and the previous hidden state
 Hidden state is the memory cell
regulated by the output gate (𝑜𝑡)
 No vanishing/exploding gradients
• 𝑓𝑡 𝑖 𝑡 = 𝜎 𝑊 𝑓
𝑊 𝑖
𝑥𝑡 + 𝑈 𝑓
𝑈 𝑖
ℎ 𝑡−1 + 𝑉 𝑓
𝑉 𝑖
𝑐𝑡−1
• 𝑜𝑡 = 𝜎 𝑊 𝑜
𝑥𝑡 + 𝑈 𝑜
ℎ 𝑡−1 + 𝑉 𝑜
𝑐𝑡−1
• 𝑐𝑡 = tanh 𝑊𝑥𝑡 + 𝑈ℎ 𝑡−1
• 𝑐𝑡 = tanh 𝑓𝑡 𝑐𝑡−1 + 𝑖 𝑡 𝑐𝑡
• ℎ 𝑡 = 𝑜𝑡 𝑐𝑡
• Gated Recurrent Unit (GRU)
 Hidden state is the mix of
o the previous hidden state
o the hidden state candidate (ℎ 𝑡)
o governed by the update gate (𝑧𝑡)
– merged input+forget gate
 Hidden state candidate depends on the
input and the previous hidden state
through a reset gate (𝑟𝑡)
 Similar performance
 Less calculations
• 𝑧𝑡 = 𝜎 𝑊 𝑧
𝑥𝑡 + 𝑈 𝑧
ℎ 𝑡−1
• 𝑟𝑡 = 𝜎 𝑊 𝑟
𝑥𝑡 + 𝑈 𝑟
ℎ 𝑡−1
• ℎ 𝑡 = 𝜎 𝑊𝑥𝑡 + 𝑈(𝑟𝑡∘ ℎ 𝑡−1)
• ℎ 𝑡 = 1 − 𝑧𝑡 ℎ 𝑡−1 + 𝑧𝑡ℎ 𝑡
ℎℎ
𝑥𝑡
ℎ 𝑡
𝑧
ℎ𝑐
𝑐
𝑥𝑡
ℎ 𝑡
Powered by RNN
• Sequence labeling
 Document classification
 Speech recognition
• Sequence-to-sequence learning
 Machine translation
 Question answering
 Conversations
• Sequence generation
 Music
 Text
Session modeling with RNNs
• Input: actual item of session
• Output: score on items for being the
next in the event stream
• GRU based RNN
 RNN is worse
 LSTM is slower (same accuracy)
• Optional embedding and feedforward
layers
 Better results without
• Number of layers
 1 gave the best performance
 Sessions span over short timeframes
 No need for modeling on multiple scales
• Requires some adaptation
Feedforward layers
Embedding layers
…
Output: scores on all items
GRU layer
GRU layer
GRU layer
Input: actual item, 1-of-N coding
(optional)
(optional)
Adaptation: session parallel mini-batches
• Motivation
 High variance in the length of the sessions (from 2 to 100s of
events)
 The goal is to capture how sessions evolve
• Minibatch
 Input: current evets
 Output: next events
𝑖1,1 𝑖1,2 𝑖1,3 𝑖1,4
𝑖2,1 𝑖2,2 𝑖2,3
𝑖3,1 𝑖3,2 𝑖3,3 𝑖3,4 𝑖3,5 𝑖3,6
𝑖4,1 𝑖4,2
𝑖5,1 𝑖5,2 𝑖5,3
Session1
Session2
Session3
Session4
Session5
𝑖1,1 𝑖1,2 𝑖1,3
𝑖2,1 𝑖2,2
𝑖3,1 𝑖3,2 𝑖3,3 𝑖3,4 𝑖3,5
𝑖4,1
𝑖5,1 𝑖5,2
Input
(item of the
actual event)
Desired output
(next item in the
event stream)
…
…
…
Mini-batch1
Mini-batch3
Mini-batch2
𝑖1,2 𝑖1,3 𝑖1,4
𝑖2,2 𝑖2,3
𝑖3,2 𝑖3,3 𝑖3,4 𝑖3,5 𝑖3,6
𝑖4,2
𝑖5,2 𝑖5,3 …
…
…
• Active sessions
 First X
 Finished sessions
replaced by the next
available
Adaptation: pairwise loss function
• Motivation
 Goal of recommender: ranking
 Pairwise and pointwise ranking (listwise costly)
 Pairwise often better
• Pairwise loss functions
 Positive items compared to negatives
 BPR
o Bayesian personalized ranking
o 𝐿 = −
1
𝑁 𝑆
𝑗=1
𝑁 𝑆
log 𝜎 𝑟𝑠,𝑖 − 𝑟𝑠,𝑗
 TOP1
o Regularized approximation of the relative rank of the positive item
o 𝐿 =
1
𝑁 𝑆
𝑗=1
𝑁 𝑆
𝜎 𝑟𝑠,𝑖 − 𝑟𝑠,𝑗 + 𝜎 𝑟𝑠,𝑗
2
Adaptation: sampling the output
• Motivation
 Number of items is high  bottleneck
 Model needs to be trained frequently (should be quick)
• Sampling negative items
 Popularity based sampling
o Missing event on popular item  more likely sign of negative feedback
o Pop items often get large scores  faster learning
 Negative items for an example: examples of other sessions in the minibatch
o Technical benefits
o Follows data distribution (pop sampling)
𝑖1 𝑖5 𝑖8
Mini-batch
(desired items)
𝑦1
1
𝑦2
1
𝑦3
1 𝑦4
1
𝑦5
1
𝑦6
1 𝑦7
1
𝑦8
1
𝑦1
3
𝑦2
3
𝑦3
3
𝑦4
3
𝑦5
3
𝑦6
3
𝑦7
3
𝑦8
3
𝑦1
2
𝑦2
2
𝑦3
2
𝑦4
2 𝑦5
2
𝑦6
2
𝑦7
2 𝑦8
2
1 0 0 0 0 0 0 0
0 0 0 0 0 0 0 1
0 0 0 0 1 0 0 0
Network output
(scores)
Desired output scores
Positive
item
Sampled
negative items
Inactive outputs
(not computed)
Offline experiments
Data Description Items Train Test
Sessions Events Sessions Events
RSC15 RecSys Challenge 2015.
Clickstream data of a webshop.
37,483 7,966,257 31,637,239 15,324 71,222
VIDEO Video watch sequences. 327,929 2,954,816 13,180,128 48,746 178,637
+19.91% +19.82%
+15.55%+14.06%
+24.82%
+22.54%
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7 RSC15 - Recall@20
+18.65% +17.54%
+12.58% +5.16%
+20.47%
+31.49%
0
0.1
0.2
0.3 RSC15 - MRR@20
+15.69%
+8.92% +11.50%
N/A
+14.58%
+20.27%
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
VIDEO - Recall@20
+10.04%
-3.56% +3.84%
N/A
-7.23%
+15.08%
0
0.1
0.2
0.3
0.4
VIDEO - MRR@20
Pop
Sessionpop
Item-kNN
BPR-MF
GRU4Rec(100,unitsc.-entropy)
GRU4Rec(100units,BPR)
GRU4Rec(100units,TOP1)
GRU4Rec(1000units,c.-entropy)
GRU4Rec(1000units,BPR)
GRU4Rec(1000units,TOP1)
Pop
Sessionpop
Item-kNN
BPR-MF
GRU4Rec(100,unitsc.-entropy)
GRU4Rec(100units,BPR)
GRU4Rec(100units,TOP1)
GRU4Rec(1000units,c.-entropy)
GRU4Rec(1000units,BPR)
GRU4Rec(1000units,TOP1)
Pop
Sessionpop
Item-kNN
BPR-MF
GRU4Rec(100,unitsc.-entropy)
GRU4Rec(100units,BPR)
GRU4Rec(100units,TOP1)
GRU4Rec(1000units,TOP1)
GRU4Rec(1000units,BPR)
Pop
Sessionpop
Item-kNN
BPR-MF
GRU4Rec(100,unitsc.-entropy)
GRU4Rec(100units,BPR)
GRU4Rec(100units,TOP1)
GRU4Rec(1000units,TOP1)
GRU4Rec(1000units,BPR)
Online experiments
+17.09% +16.10%
+24.16% +23.69%
+5.52%
-3.21%
+7.05% +6.29
0
0.2
0.4
0.6
0.8
1
1.2
1.4
RelativeCTR
RNN Item-kNN Item-kNN-B
CTR
Default setup RNN
• Default trains:
 on ~10x events
 more frequently
• Absolute CTR increase: +0.9%±0.5%
 (p=0.01)
The next step in recsys technology
• is deep learning
• Besides session modelling
 Incorporating content into the model directly
 Modeling complex context-states based on sensory data
(IoT)
 Optimizing recommendations through deep reinforcement
learning
• Would you like to try something in this area?
 Submit to DLRS 2016
 dlrs-workshop.org
Thank you!
Detailed description of the RNN approach:
• B. Hidasi, A. Karatzoglou, L. Baltrunas, D. Tikk: Session-based recommendations with recurrent neural networks. ICLR 2016.
• http://arxiv.org/abs/1511.06939
• Public code: https://github.com/hidasib/GRU4Rec
1 of 17

Recommended

Parallel Recurrent Neural Network Architectures for Feature-rich Session-base... by
Parallel Recurrent Neural Network Architectures for Feature-rich Session-base...Parallel Recurrent Neural Network Architectures for Feature-rich Session-base...
Parallel Recurrent Neural Network Architectures for Feature-rich Session-base...Balázs Hidasi
2.3K views15 slides
GRU4Rec v2 - Recurrent Neural Networks with Top-k Gains for Session-based Rec... by
GRU4Rec v2 - Recurrent Neural Networks with Top-k Gains for Session-based Rec...GRU4Rec v2 - Recurrent Neural Networks with Top-k Gains for Session-based Rec...
GRU4Rec v2 - Recurrent Neural Networks with Top-k Gains for Session-based Rec...Balázs Hidasi
3.2K views30 slides
Utilizing additional information in factorization methods (research overview,... by
Utilizing additional information in factorization methods (research overview,...Utilizing additional information in factorization methods (research overview,...
Utilizing additional information in factorization methods (research overview,...Balázs Hidasi
628 views24 slides
Context-aware preference modeling with factorization by
Context-aware preference modeling with factorizationContext-aware preference modeling with factorization
Context-aware preference modeling with factorizationBalázs Hidasi
1.7K views22 slides
Deep Learning in Recommender Systems - RecSys Summer School 2017 by
Deep Learning in Recommender Systems - RecSys Summer School 2017Deep Learning in Recommender Systems - RecSys Summer School 2017
Deep Learning in Recommender Systems - RecSys Summer School 2017Balázs Hidasi
11.3K views77 slides
Deep learning: the future of recommendations by
Deep learning: the future of recommendationsDeep learning: the future of recommendations
Deep learning: the future of recommendationsBalázs Hidasi
15.1K views23 slides

More Related Content

What's hot

Deep Learning for Personalized Search and Recommender Systems by
Deep Learning for Personalized Search and Recommender SystemsDeep Learning for Personalized Search and Recommender Systems
Deep Learning for Personalized Search and Recommender SystemsBenjamin Le
37.4K views113 slides
Artificial Intelligence Course: Linear models by
Artificial Intelligence Course: Linear models Artificial Intelligence Course: Linear models
Artificial Intelligence Course: Linear models ananth
1.4K views47 slides
Deep Learning for Recommender Systems RecSys2017 Tutorial by
Deep Learning for Recommender Systems RecSys2017 Tutorial Deep Learning for Recommender Systems RecSys2017 Tutorial
Deep Learning for Recommender Systems RecSys2017 Tutorial Alexandros Karatzoglou
32.2K views80 slides
Foundations: Artificial Neural Networks by
Foundations: Artificial Neural NetworksFoundations: Artificial Neural Networks
Foundations: Artificial Neural Networksananth
2.2K views50 slides
Artificial Intelligence, Machine Learning and Deep Learning by
Artificial Intelligence, Machine Learning and Deep LearningArtificial Intelligence, Machine Learning and Deep Learning
Artificial Intelligence, Machine Learning and Deep LearningSujit Pal
22.9K views37 slides
Introduction To Applied Machine Learning by
Introduction To Applied Machine LearningIntroduction To Applied Machine Learning
Introduction To Applied Machine Learningananth
2.2K views18 slides

What's hot(20)

Deep Learning for Personalized Search and Recommender Systems by Benjamin Le
Deep Learning for Personalized Search and Recommender SystemsDeep Learning for Personalized Search and Recommender Systems
Deep Learning for Personalized Search and Recommender Systems
Benjamin Le37.4K views
Artificial Intelligence Course: Linear models by ananth
Artificial Intelligence Course: Linear models Artificial Intelligence Course: Linear models
Artificial Intelligence Course: Linear models
ananth1.4K views
Deep Learning for Recommender Systems RecSys2017 Tutorial by Alexandros Karatzoglou
Deep Learning for Recommender Systems RecSys2017 Tutorial Deep Learning for Recommender Systems RecSys2017 Tutorial
Deep Learning for Recommender Systems RecSys2017 Tutorial
Foundations: Artificial Neural Networks by ananth
Foundations: Artificial Neural NetworksFoundations: Artificial Neural Networks
Foundations: Artificial Neural Networks
ananth2.2K views
Artificial Intelligence, Machine Learning and Deep Learning by Sujit Pal
Artificial Intelligence, Machine Learning and Deep LearningArtificial Intelligence, Machine Learning and Deep Learning
Artificial Intelligence, Machine Learning and Deep Learning
Sujit Pal22.9K views
Introduction To Applied Machine Learning by ananth
Introduction To Applied Machine LearningIntroduction To Applied Machine Learning
Introduction To Applied Machine Learning
ananth2.2K views
Deep Learning Models for Question Answering by Sujit Pal
Deep Learning Models for Question AnsweringDeep Learning Models for Question Answering
Deep Learning Models for Question Answering
Sujit Pal14.4K views
An introduction to Machine Learning (and a little bit of Deep Learning) by Thomas da Silva Paula
An introduction to Machine Learning (and a little bit of Deep Learning)An introduction to Machine Learning (and a little bit of Deep Learning)
An introduction to Machine Learning (and a little bit of Deep Learning)
Deep Learning For Practitioners, lecture 2: Selecting the right applications... by ananth
Deep Learning For Practitioners,  lecture 2: Selecting the right applications...Deep Learning For Practitioners,  lecture 2: Selecting the right applications...
Deep Learning For Practitioners, lecture 2: Selecting the right applications...
ananth1K views
Deep Learning for Natural Language Processing by Sangwoo Mo
Deep Learning for Natural Language ProcessingDeep Learning for Natural Language Processing
Deep Learning for Natural Language Processing
Sangwoo Mo1.5K views
Deep learning and image analytics using Python by Dr Sanparit by BAINIDA
Deep learning and image analytics using Python by Dr SanparitDeep learning and image analytics using Python by Dr Sanparit
Deep learning and image analytics using Python by Dr Sanparit
BAINIDA3.3K views
Ranking and Diversity in Recommendations - RecSys Stammtisch at SoundCloud, B... by Alexandros Karatzoglou
Ranking and Diversity in Recommendations - RecSys Stammtisch at SoundCloud, B...Ranking and Diversity in Recommendations - RecSys Stammtisch at SoundCloud, B...
Ranking and Diversity in Recommendations - RecSys Stammtisch at SoundCloud, B...
Embed, Encode, Attend, Predict – applying the 4 step NLP recipe for text clas... by Sujit Pal
Embed, Encode, Attend, Predict – applying the 4 step NLP recipe for text clas...Embed, Encode, Attend, Predict – applying the 4 step NLP recipe for text clas...
Embed, Encode, Attend, Predict – applying the 4 step NLP recipe for text clas...
Sujit Pal3.9K views
Challenging Common Assumptions in the Unsupervised Learning of Disentangled R... by Sangwoo Mo
Challenging Common Assumptions in the Unsupervised Learning of Disentangled R...Challenging Common Assumptions in the Unsupervised Learning of Disentangled R...
Challenging Common Assumptions in the Unsupervised Learning of Disentangled R...
Sangwoo Mo785 views
MaxEnt (Loglinear) Models - Overview by ananth
MaxEnt (Loglinear) Models - OverviewMaxEnt (Loglinear) Models - Overview
MaxEnt (Loglinear) Models - Overview
ananth2K views
Generative Models for General Audiences by Sangwoo Mo
Generative Models for General AudiencesGenerative Models for General Audiences
Generative Models for General Audiences
Sangwoo Mo582 views
Deep Learning: Chapter 11 Practical Methodology by Jason Tsai
Deep Learning: Chapter 11 Practical MethodologyDeep Learning: Chapter 11 Practical Methodology
Deep Learning: Chapter 11 Practical Methodology
Jason Tsai2K views
Deep Learning with Python (PyData Seattle 2015) by Alexander Korbonits
Deep Learning with Python (PyData Seattle 2015)Deep Learning with Python (PyData Seattle 2015)
Deep Learning with Python (PyData Seattle 2015)
Alexander Korbonits22.8K views
Deep Learning Enabled Question Answering System to Automate Corporate Helpdesk by Saurabh Saxena
Deep Learning Enabled Question Answering System to Automate Corporate HelpdeskDeep Learning Enabled Question Answering System to Automate Corporate Helpdesk
Deep Learning Enabled Question Answering System to Automate Corporate Helpdesk
Saurabh Saxena214 views

Viewers also liked

2017 How Deep Learning Changes the Design Process (1) by
2017 How Deep Learning Changes the Design Process (1)2017 How Deep Learning Changes the Design Process (1)
2017 How Deep Learning Changes the Design Process (1)Alexander Meinhardt
683 views149 slides
Creativity through deep learning by
Creativity through deep learningCreativity through deep learning
Creativity through deep learningAkin Osman Kazakci
748 views13 slides
쉬운 예제 중심의 HTML5 / CSS / MediaQuery / JQuery 강의 by
쉬운 예제 중심의 HTML5 / CSS / MediaQuery / JQuery 강의쉬운 예제 중심의 HTML5 / CSS / MediaQuery / JQuery 강의
쉬운 예제 중심의 HTML5 / CSS / MediaQuery / JQuery 강의Minha Yang
10.2K views28 slides
2017 How Deep Learning Changes the Design Process (2) by
2017 How Deep Learning Changes the Design Process (2)2017 How Deep Learning Changes the Design Process (2)
2017 How Deep Learning Changes the Design Process (2)Alexander Meinhardt
924 views34 slides
Artificial Intelligence - Past, Present and Future by
Artificial Intelligence - Past, Present and FutureArtificial Intelligence - Past, Present and Future
Artificial Intelligence - Past, Present and FutureGrigory Sapunov
47.9K views164 slides
How Deep Learning Changes the Design Process #NEXT17 by
How Deep Learning Changes the Design Process #NEXT17How Deep Learning Changes the Design Process #NEXT17
How Deep Learning Changes the Design Process #NEXT17Alexander Meinhardt
1.7K views167 slides

Viewers also liked(6)

2017 How Deep Learning Changes the Design Process (1) by Alexander Meinhardt
2017 How Deep Learning Changes the Design Process (1)2017 How Deep Learning Changes the Design Process (1)
2017 How Deep Learning Changes the Design Process (1)
쉬운 예제 중심의 HTML5 / CSS / MediaQuery / JQuery 강의 by Minha Yang
쉬운 예제 중심의 HTML5 / CSS / MediaQuery / JQuery 강의쉬운 예제 중심의 HTML5 / CSS / MediaQuery / JQuery 강의
쉬운 예제 중심의 HTML5 / CSS / MediaQuery / JQuery 강의
Minha Yang10.2K views
2017 How Deep Learning Changes the Design Process (2) by Alexander Meinhardt
2017 How Deep Learning Changes the Design Process (2)2017 How Deep Learning Changes the Design Process (2)
2017 How Deep Learning Changes the Design Process (2)
Artificial Intelligence - Past, Present and Future by Grigory Sapunov
Artificial Intelligence - Past, Present and FutureArtificial Intelligence - Past, Present and Future
Artificial Intelligence - Past, Present and Future
Grigory Sapunov47.9K views
How Deep Learning Changes the Design Process #NEXT17 by Alexander Meinhardt
How Deep Learning Changes the Design Process #NEXT17How Deep Learning Changes the Design Process #NEXT17
How Deep Learning Changes the Design Process #NEXT17
Alexander Meinhardt1.7K views

Similar to Deep learning to the rescue - solving long standing problems of recommender systems

Deep learning: what? how? why? How to win a Kaggle competition by
Deep learning: what? how? why? How to win a Kaggle competitionDeep learning: what? how? why? How to win a Kaggle competition
Deep learning: what? how? why? How to win a Kaggle competition317070
892 views75 slides
Overcome the Reign of Chaos by
Overcome the Reign of ChaosOvercome the Reign of Chaos
Overcome the Reign of ChaosMichael Stockerl
155 views62 slides
DEF CON 24 - Clarence Chio - machine duping 101 by
DEF CON 24 - Clarence Chio - machine duping 101DEF CON 24 - Clarence Chio - machine duping 101
DEF CON 24 - Clarence Chio - machine duping 101Felipe Prado
70 views55 slides
Deep learning - a primer by
Deep learning - a primerDeep learning - a primer
Deep learning - a primerUwe Friedrichsen
2.8K views137 slides
Deep learning - a primer by
Deep learning - a primerDeep learning - a primer
Deep learning - a primerShirin Elsinghorst
4.7K views137 slides
Yuandong Tian at AI Frontiers : Planning in Reinforcement Learning by
Yuandong Tian at AI Frontiers : Planning in Reinforcement LearningYuandong Tian at AI Frontiers : Planning in Reinforcement Learning
Yuandong Tian at AI Frontiers : Planning in Reinforcement LearningAI Frontiers
966 views44 slides

Similar to Deep learning to the rescue - solving long standing problems of recommender systems(20)

Deep learning: what? how? why? How to win a Kaggle competition by 317070
Deep learning: what? how? why? How to win a Kaggle competitionDeep learning: what? how? why? How to win a Kaggle competition
Deep learning: what? how? why? How to win a Kaggle competition
317070892 views
DEF CON 24 - Clarence Chio - machine duping 101 by Felipe Prado
DEF CON 24 - Clarence Chio - machine duping 101DEF CON 24 - Clarence Chio - machine duping 101
DEF CON 24 - Clarence Chio - machine duping 101
Felipe Prado70 views
Yuandong Tian at AI Frontiers : Planning in Reinforcement Learning by AI Frontiers
Yuandong Tian at AI Frontiers : Planning in Reinforcement LearningYuandong Tian at AI Frontiers : Planning in Reinforcement Learning
Yuandong Tian at AI Frontiers : Planning in Reinforcement Learning
AI Frontiers966 views
ODSC 2019: Sessionisation via stochastic periods for root event identification by Kuldeep Jiwani
ODSC 2019: Sessionisation via stochastic periods for root event identificationODSC 2019: Sessionisation via stochastic periods for root event identification
ODSC 2019: Sessionisation via stochastic periods for root event identification
Kuldeep Jiwani300 views
Using Deep Learning to do Real-Time Scoring in Practical Applications by Greg Makowski
Using Deep Learning to do Real-Time Scoring in Practical ApplicationsUsing Deep Learning to do Real-Time Scoring in Practical Applications
Using Deep Learning to do Real-Time Scoring in Practical Applications
Greg Makowski4.5K views
Machine Learning with TensorFlow 2 by Sarah Stemmler
Machine Learning with TensorFlow 2Machine Learning with TensorFlow 2
Machine Learning with TensorFlow 2
Sarah Stemmler310 views
Paris ML meetup by Yves Raimond
Paris ML meetupParis ML meetup
Paris ML meetup
Yves Raimond96.5K views
Parismlmeetupfinalslides 151209190037-lva1-app6892 by mercedes calderon
Parismlmeetupfinalslides 151209190037-lva1-app6892Parismlmeetupfinalslides 151209190037-lva1-app6892
Parismlmeetupfinalslides 151209190037-lva1-app6892
Impatience is a Virtue: Revisiting Disorder in High-Performance Log Analytics by Badrish Chandramouli
Impatience is a Virtue: Revisiting Disorder in High-Performance Log AnalyticsImpatience is a Virtue: Revisiting Disorder in High-Performance Log Analytics
Impatience is a Virtue: Revisiting Disorder in High-Performance Log Analytics
rsec2a-2016-jheaton-morning by Jeff Heaton
rsec2a-2016-jheaton-morningrsec2a-2016-jheaton-morning
rsec2a-2016-jheaton-morning
Jeff Heaton496 views
Visibility Optimization for Games by Umbra
Visibility Optimization for GamesVisibility Optimization for Games
Visibility Optimization for Games
Umbra5.1K views
Approximate "Now" is Better Than Accurate "Later" by NUS-ISS
Approximate "Now" is Better Than Accurate "Later"Approximate "Now" is Better Than Accurate "Later"
Approximate "Now" is Better Than Accurate "Later"
NUS-ISS134 views
Machine Duping 101: Pwning Deep Learning Systems by Clarence Chio
Machine Duping 101: Pwning Deep Learning SystemsMachine Duping 101: Pwning Deep Learning Systems
Machine Duping 101: Pwning Deep Learning Systems
Clarence Chio1.9K views

More from Balázs Hidasi

Egyedi termék kreatívok tömeges gyártása generatív AI segítségével by
Egyedi termék kreatívok tömeges gyártása generatív AI segítségévelEgyedi termék kreatívok tömeges gyártása generatív AI segítségével
Egyedi termék kreatívok tömeges gyártása generatív AI segítségévelBalázs Hidasi
45 views57 slides
The Effect of Third Party Implementations on Reproducibility by
The Effect of Third Party Implementations on ReproducibilityThe Effect of Third Party Implementations on Reproducibility
The Effect of Third Party Implementations on ReproducibilityBalázs Hidasi
114 views41 slides
Context aware factorization methods for implicit feedback based recommendatio... by
Context aware factorization methods for implicit feedback based recommendatio...Context aware factorization methods for implicit feedback based recommendatio...
Context aware factorization methods for implicit feedback based recommendatio...Balázs Hidasi
524 views34 slides
Approximate modeling of continuous context in factorization algorithms (CaRR1... by
Approximate modeling of continuous context in factorization algorithms (CaRR1...Approximate modeling of continuous context in factorization algorithms (CaRR1...
Approximate modeling of continuous context in factorization algorithms (CaRR1...Balázs Hidasi
1.4K views17 slides
Az implicit ajánlási probléma és néhány megoldása (BME TMIT szeminárium előad... by
Az implicit ajánlási probléma és néhány megoldása (BME TMIT szeminárium előad...Az implicit ajánlási probléma és néhány megoldása (BME TMIT szeminárium előad...
Az implicit ajánlási probléma és néhány megoldása (BME TMIT szeminárium előad...Balázs Hidasi
383 views16 slides
Context-aware similarities within the factorization framework (CaRR 2013 pres... by
Context-aware similarities within the factorization framework (CaRR 2013 pres...Context-aware similarities within the factorization framework (CaRR 2013 pres...
Context-aware similarities within the factorization framework (CaRR 2013 pres...Balázs Hidasi
1.2K views17 slides

More from Balázs Hidasi(11)

Egyedi termék kreatívok tömeges gyártása generatív AI segítségével by Balázs Hidasi
Egyedi termék kreatívok tömeges gyártása generatív AI segítségévelEgyedi termék kreatívok tömeges gyártása generatív AI segítségével
Egyedi termék kreatívok tömeges gyártása generatív AI segítségével
Balázs Hidasi45 views
The Effect of Third Party Implementations on Reproducibility by Balázs Hidasi
The Effect of Third Party Implementations on ReproducibilityThe Effect of Third Party Implementations on Reproducibility
The Effect of Third Party Implementations on Reproducibility
Balázs Hidasi114 views
Context aware factorization methods for implicit feedback based recommendatio... by Balázs Hidasi
Context aware factorization methods for implicit feedback based recommendatio...Context aware factorization methods for implicit feedback based recommendatio...
Context aware factorization methods for implicit feedback based recommendatio...
Balázs Hidasi524 views
Approximate modeling of continuous context in factorization algorithms (CaRR1... by Balázs Hidasi
Approximate modeling of continuous context in factorization algorithms (CaRR1...Approximate modeling of continuous context in factorization algorithms (CaRR1...
Approximate modeling of continuous context in factorization algorithms (CaRR1...
Balázs Hidasi1.4K views
Az implicit ajánlási probléma és néhány megoldása (BME TMIT szeminárium előad... by Balázs Hidasi
Az implicit ajánlási probléma és néhány megoldása (BME TMIT szeminárium előad...Az implicit ajánlási probléma és néhány megoldása (BME TMIT szeminárium előad...
Az implicit ajánlási probléma és néhány megoldása (BME TMIT szeminárium előad...
Balázs Hidasi383 views
Context-aware similarities within the factorization framework (CaRR 2013 pres... by Balázs Hidasi
Context-aware similarities within the factorization framework (CaRR 2013 pres...Context-aware similarities within the factorization framework (CaRR 2013 pres...
Context-aware similarities within the factorization framework (CaRR 2013 pres...
Balázs Hidasi1.2K views
iTALS: implicit tensor factorization for context-aware recommendations (ECML/... by Balázs Hidasi
iTALS: implicit tensor factorization for context-aware recommendations (ECML/...iTALS: implicit tensor factorization for context-aware recommendations (ECML/...
iTALS: implicit tensor factorization for context-aware recommendations (ECML/...
Balázs Hidasi637 views
Initialization of matrix factorization (CaRR 2012 presentation) by Balázs Hidasi
Initialization of matrix factorization (CaRR 2012 presentation)Initialization of matrix factorization (CaRR 2012 presentation)
Initialization of matrix factorization (CaRR 2012 presentation)
Balázs Hidasi737 views
ShiftTree: model alapú idősor-osztályozó (VK 2009 előadás) by Balázs Hidasi
ShiftTree: model alapú idősor-osztályozó (VK 2009 előadás)ShiftTree: model alapú idősor-osztályozó (VK 2009 előadás)
ShiftTree: model alapú idősor-osztályozó (VK 2009 előadás)
Balázs Hidasi327 views
ShiftTree: model alapú idősor-osztályozó (ML@BP előadás, 2012) by Balázs Hidasi
ShiftTree: model alapú idősor-osztályozó (ML@BP előadás, 2012)ShiftTree: model alapú idősor-osztályozó (ML@BP előadás, 2012)
ShiftTree: model alapú idősor-osztályozó (ML@BP előadás, 2012)
Balázs Hidasi357 views
ShiftTree: model based time series classifier (ECML/PKDD 2011 presentation) by Balázs Hidasi
ShiftTree: model based time series classifier (ECML/PKDD 2011 presentation)ShiftTree: model based time series classifier (ECML/PKDD 2011 presentation)
ShiftTree: model based time series classifier (ECML/PKDD 2011 presentation)
Balázs Hidasi726 views

Recently uploaded

Disinfectants & Antiseptic by
Disinfectants & AntisepticDisinfectants & Antiseptic
Disinfectants & AntisepticSanket P Shinde
62 views36 slides
MILK LIPIDS 2.pptx by
MILK LIPIDS 2.pptxMILK LIPIDS 2.pptx
MILK LIPIDS 2.pptxabhinambroze18
8 views15 slides
plasmids by
plasmidsplasmids
plasmidsscribddarkened352
13 views2 slides
Evaluation and Standardization of the Marketed Polyherbal drug Patanjali Divy... by
Evaluation and Standardization of the Marketed Polyherbal drug Patanjali Divy...Evaluation and Standardization of the Marketed Polyherbal drug Patanjali Divy...
Evaluation and Standardization of the Marketed Polyherbal drug Patanjali Divy...Anmol Vishnu Gupta
7 views10 slides
Ellagic Acid and Its Metabolites as Potent and Selective Allosteric Inhibitor... by
Ellagic Acid and Its Metabolites as Potent and Selective Allosteric Inhibitor...Ellagic Acid and Its Metabolites as Potent and Selective Allosteric Inhibitor...
Ellagic Acid and Its Metabolites as Potent and Selective Allosteric Inhibitor...Trustlife
100 views17 slides
vitamine B1.pptx by
vitamine B1.pptxvitamine B1.pptx
vitamine B1.pptxajithkilpart
27 views22 slides

Recently uploaded(20)

Evaluation and Standardization of the Marketed Polyherbal drug Patanjali Divy... by Anmol Vishnu Gupta
Evaluation and Standardization of the Marketed Polyherbal drug Patanjali Divy...Evaluation and Standardization of the Marketed Polyherbal drug Patanjali Divy...
Evaluation and Standardization of the Marketed Polyherbal drug Patanjali Divy...
Ellagic Acid and Its Metabolites as Potent and Selective Allosteric Inhibitor... by Trustlife
Ellagic Acid and Its Metabolites as Potent and Selective Allosteric Inhibitor...Ellagic Acid and Its Metabolites as Potent and Selective Allosteric Inhibitor...
Ellagic Acid and Its Metabolites as Potent and Selective Allosteric Inhibitor...
Trustlife100 views
Effect of Integrated Nutrient Management on Growth and Yield of Solanaceous F... by SwagatBehera9
Effect of Integrated Nutrient Management on Growth and Yield of Solanaceous F...Effect of Integrated Nutrient Management on Growth and Yield of Solanaceous F...
Effect of Integrated Nutrient Management on Growth and Yield of Solanaceous F...
SwagatBehera95 views
ELECTRON TRANSPORT CHAIN by DEEKSHA RANI
ELECTRON TRANSPORT CHAINELECTRON TRANSPORT CHAIN
ELECTRON TRANSPORT CHAIN
DEEKSHA RANI10 views
Structure of purines and pyrimidines - Jahnvi arora (11228108), mmdu ,mullana... by jahnviarora989
Structure of purines and pyrimidines - Jahnvi arora (11228108), mmdu ,mullana...Structure of purines and pyrimidines - Jahnvi arora (11228108), mmdu ,mullana...
Structure of purines and pyrimidines - Jahnvi arora (11228108), mmdu ,mullana...
jahnviarora9896 views
별헤는 사람들 2023년 12월호 전명원 교수 자료 by sciencepeople
별헤는 사람들 2023년 12월호 전명원 교수 자료별헤는 사람들 2023년 12월호 전명원 교수 자료
별헤는 사람들 2023년 12월호 전명원 교수 자료
sciencepeople58 views
Open Access Publishing in Astrophysics by Peter Coles
Open Access Publishing in AstrophysicsOpen Access Publishing in Astrophysics
Open Access Publishing in Astrophysics
Peter Coles1.2K views
Small ruminant keepers’ knowledge, attitudes and practices towards peste des ... by ILRI
Small ruminant keepers’ knowledge, attitudes and practices towards peste des ...Small ruminant keepers’ knowledge, attitudes and practices towards peste des ...
Small ruminant keepers’ knowledge, attitudes and practices towards peste des ...
ILRI5 views
Experimental animal Guinea pigs.pptx by Mansee Arya
Experimental animal Guinea pigs.pptxExperimental animal Guinea pigs.pptx
Experimental animal Guinea pigs.pptx
Mansee Arya35 views
Study on Drug Drug Interaction Through Prescription Analysis of Type II Diabe... by Anmol Vishnu Gupta
Study on Drug Drug Interaction Through Prescription Analysis of Type II Diabe...Study on Drug Drug Interaction Through Prescription Analysis of Type II Diabe...
Study on Drug Drug Interaction Through Prescription Analysis of Type II Diabe...
Light Pollution for LVIS students by CWBarthlmew
Light Pollution for LVIS studentsLight Pollution for LVIS students
Light Pollution for LVIS students
CWBarthlmew9 views
Exploring the nature and synchronicity of early cluster formation in the Larg... by Sérgio Sacani
Exploring the nature and synchronicity of early cluster formation in the Larg...Exploring the nature and synchronicity of early cluster formation in the Larg...
Exploring the nature and synchronicity of early cluster formation in the Larg...
Sérgio Sacani910 views

Deep learning to the rescue - solving long standing problems of recommender systems

  • 1. Deep learning to the rescue solving long standing problems of recommender systems Balázs Hidasi @balazshidasi Budapest RecSys & Personalization meetup 12 May, 2016
  • 2. What is deep learning? • A class of machine learning algorithms  that use a cascade of multiple non-linear processing layers  and complex model structures  to learn different representations of the data in each layer  where higher level features are derived from lower level features  to form a hierarchical representation. • Key component of recent technologies  Speech recognition  Personal assistants (e.g. Siri, Cortana)  Computer vision, object recognition  Machine translation  Chatbot technology  Face recognition  Self driving cars • An efficient tool for certain complex problems  Pattern recognition  Computer vision  Natural language processing  Speech recognition • Deep learning is NOT  the true AI o it may be a component of it when and if AI is created  how the human brain works  the best solution to every machine learning tasks
  • 3. Deep learning in the news
  • 4. Why is deep learning happening now? • Actually it is not   first papers published in 1970s • Third resurgence of neural networks  Research breakthroughs  Increase in computational power  GP GPUs Problem Solution Vanishing gradients Sigmoid type activation functions easily saturate. Gradients are small, in deeper layers updates become almost zero. Earlier: layer-by-layer pretraining Recently: non-saturating activation functions Gradient descent First order methods (e.g. SGD) are easily stuck. Second order methods are infeasible on larger data. Adaptive training: adagrad, adam, adadelta, RMSProp Nesterov momentum Regularization Networks easily overfit (even with L2 regularization). Dropout ETC...
  • 5. Challenges in RecSys • Recommender systems ≠ Netflix challenge  Rating prediction  Top-N recommendation (ranking)  Explicit feedback  Implicit feedback  Long user histories  Sessions  Slowly changing taste  Goal oriented browsing  Item-to-user only  Other scenarios • Success of CF  Human brain is a powerful feature extractor  Cold-start o CF can’t be used o Decisions are rarely made on metadata o But rather on what the user sees: e.g. product image, content itself Domain dependent
  • 6. Session-based recommendations • Permanent cold start  User identification o Possible but often not reliable  Intent/theme o What the user needs? o Theme of the session  Never/rarely returning users • Workaround in practice  Item-to-item recommendations o Similar items o Co-occurring items  Non-personalized  Not adaptive
  • 7. Recurrent Neural Networks • Hidden state  Next hidden state depends on the input and the actual hidden state (recurrence)  ℎ 𝑡 = tanh 𝑊𝑥 𝑡 + 𝑈ℎ 𝑡−1 • „Infinite depth” • Backpropagation Through Time • Exploding gradients  Due to recurrence  If the spectral radius of U > 1 (necessary) • Lack of long term memory (vanishing gradients)  Gradients of earlier states vanish  If the spectral radius of U < 1 (sufficient) ℎ 𝑥𝑡 ℎ 𝑡 ℎ ℎ 𝑡 𝑥𝑡 ℎℎ 𝑡−1 𝑥𝑡−1 ℎℎ 𝑡−2 𝑥𝑡−2 ℎℎ 𝑡−3 𝑥𝑡−3 ℎ 𝑡−4
  • 8. Advanced RNN units • Long Short-Term Memory (LSTM)  Memory cell (𝑐𝑡) is the mix of o its previous value (governed by the forget gate (𝑓𝑡)) o the cell value candidate (governed by the input gate (𝑖 𝑡))  Cell value candidate ( 𝑐𝑡) depends on the input and the previous hidden state  Hidden state is the memory cell regulated by the output gate (𝑜𝑡)  No vanishing/exploding gradients • 𝑓𝑡 𝑖 𝑡 = 𝜎 𝑊 𝑓 𝑊 𝑖 𝑥𝑡 + 𝑈 𝑓 𝑈 𝑖 ℎ 𝑡−1 + 𝑉 𝑓 𝑉 𝑖 𝑐𝑡−1 • 𝑜𝑡 = 𝜎 𝑊 𝑜 𝑥𝑡 + 𝑈 𝑜 ℎ 𝑡−1 + 𝑉 𝑜 𝑐𝑡−1 • 𝑐𝑡 = tanh 𝑊𝑥𝑡 + 𝑈ℎ 𝑡−1 • 𝑐𝑡 = tanh 𝑓𝑡 𝑐𝑡−1 + 𝑖 𝑡 𝑐𝑡 • ℎ 𝑡 = 𝑜𝑡 𝑐𝑡 • Gated Recurrent Unit (GRU)  Hidden state is the mix of o the previous hidden state o the hidden state candidate (ℎ 𝑡) o governed by the update gate (𝑧𝑡) – merged input+forget gate  Hidden state candidate depends on the input and the previous hidden state through a reset gate (𝑟𝑡)  Similar performance  Less calculations • 𝑧𝑡 = 𝜎 𝑊 𝑧 𝑥𝑡 + 𝑈 𝑧 ℎ 𝑡−1 • 𝑟𝑡 = 𝜎 𝑊 𝑟 𝑥𝑡 + 𝑈 𝑟 ℎ 𝑡−1 • ℎ 𝑡 = 𝜎 𝑊𝑥𝑡 + 𝑈(𝑟𝑡∘ ℎ 𝑡−1) • ℎ 𝑡 = 1 − 𝑧𝑡 ℎ 𝑡−1 + 𝑧𝑡ℎ 𝑡 ℎℎ 𝑥𝑡 ℎ 𝑡 𝑧 ℎ𝑐 𝑐 𝑥𝑡 ℎ 𝑡
  • 9. Powered by RNN • Sequence labeling  Document classification  Speech recognition • Sequence-to-sequence learning  Machine translation  Question answering  Conversations • Sequence generation  Music  Text
  • 10. Session modeling with RNNs • Input: actual item of session • Output: score on items for being the next in the event stream • GRU based RNN  RNN is worse  LSTM is slower (same accuracy) • Optional embedding and feedforward layers  Better results without • Number of layers  1 gave the best performance  Sessions span over short timeframes  No need for modeling on multiple scales • Requires some adaptation Feedforward layers Embedding layers … Output: scores on all items GRU layer GRU layer GRU layer Input: actual item, 1-of-N coding (optional) (optional)
  • 11. Adaptation: session parallel mini-batches • Motivation  High variance in the length of the sessions (from 2 to 100s of events)  The goal is to capture how sessions evolve • Minibatch  Input: current evets  Output: next events 𝑖1,1 𝑖1,2 𝑖1,3 𝑖1,4 𝑖2,1 𝑖2,2 𝑖2,3 𝑖3,1 𝑖3,2 𝑖3,3 𝑖3,4 𝑖3,5 𝑖3,6 𝑖4,1 𝑖4,2 𝑖5,1 𝑖5,2 𝑖5,3 Session1 Session2 Session3 Session4 Session5 𝑖1,1 𝑖1,2 𝑖1,3 𝑖2,1 𝑖2,2 𝑖3,1 𝑖3,2 𝑖3,3 𝑖3,4 𝑖3,5 𝑖4,1 𝑖5,1 𝑖5,2 Input (item of the actual event) Desired output (next item in the event stream) … … … Mini-batch1 Mini-batch3 Mini-batch2 𝑖1,2 𝑖1,3 𝑖1,4 𝑖2,2 𝑖2,3 𝑖3,2 𝑖3,3 𝑖3,4 𝑖3,5 𝑖3,6 𝑖4,2 𝑖5,2 𝑖5,3 … … … • Active sessions  First X  Finished sessions replaced by the next available
  • 12. Adaptation: pairwise loss function • Motivation  Goal of recommender: ranking  Pairwise and pointwise ranking (listwise costly)  Pairwise often better • Pairwise loss functions  Positive items compared to negatives  BPR o Bayesian personalized ranking o 𝐿 = − 1 𝑁 𝑆 𝑗=1 𝑁 𝑆 log 𝜎 𝑟𝑠,𝑖 − 𝑟𝑠,𝑗  TOP1 o Regularized approximation of the relative rank of the positive item o 𝐿 = 1 𝑁 𝑆 𝑗=1 𝑁 𝑆 𝜎 𝑟𝑠,𝑖 − 𝑟𝑠,𝑗 + 𝜎 𝑟𝑠,𝑗 2
  • 13. Adaptation: sampling the output • Motivation  Number of items is high  bottleneck  Model needs to be trained frequently (should be quick) • Sampling negative items  Popularity based sampling o Missing event on popular item  more likely sign of negative feedback o Pop items often get large scores  faster learning  Negative items for an example: examples of other sessions in the minibatch o Technical benefits o Follows data distribution (pop sampling) 𝑖1 𝑖5 𝑖8 Mini-batch (desired items) 𝑦1 1 𝑦2 1 𝑦3 1 𝑦4 1 𝑦5 1 𝑦6 1 𝑦7 1 𝑦8 1 𝑦1 3 𝑦2 3 𝑦3 3 𝑦4 3 𝑦5 3 𝑦6 3 𝑦7 3 𝑦8 3 𝑦1 2 𝑦2 2 𝑦3 2 𝑦4 2 𝑦5 2 𝑦6 2 𝑦7 2 𝑦8 2 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1 0 0 0 Network output (scores) Desired output scores Positive item Sampled negative items Inactive outputs (not computed)
  • 14. Offline experiments Data Description Items Train Test Sessions Events Sessions Events RSC15 RecSys Challenge 2015. Clickstream data of a webshop. 37,483 7,966,257 31,637,239 15,324 71,222 VIDEO Video watch sequences. 327,929 2,954,816 13,180,128 48,746 178,637 +19.91% +19.82% +15.55%+14.06% +24.82% +22.54% 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 RSC15 - Recall@20 +18.65% +17.54% +12.58% +5.16% +20.47% +31.49% 0 0.1 0.2 0.3 RSC15 - MRR@20 +15.69% +8.92% +11.50% N/A +14.58% +20.27% 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 VIDEO - Recall@20 +10.04% -3.56% +3.84% N/A -7.23% +15.08% 0 0.1 0.2 0.3 0.4 VIDEO - MRR@20 Pop Sessionpop Item-kNN BPR-MF GRU4Rec(100,unitsc.-entropy) GRU4Rec(100units,BPR) GRU4Rec(100units,TOP1) GRU4Rec(1000units,c.-entropy) GRU4Rec(1000units,BPR) GRU4Rec(1000units,TOP1) Pop Sessionpop Item-kNN BPR-MF GRU4Rec(100,unitsc.-entropy) GRU4Rec(100units,BPR) GRU4Rec(100units,TOP1) GRU4Rec(1000units,c.-entropy) GRU4Rec(1000units,BPR) GRU4Rec(1000units,TOP1) Pop Sessionpop Item-kNN BPR-MF GRU4Rec(100,unitsc.-entropy) GRU4Rec(100units,BPR) GRU4Rec(100units,TOP1) GRU4Rec(1000units,TOP1) GRU4Rec(1000units,BPR) Pop Sessionpop Item-kNN BPR-MF GRU4Rec(100,unitsc.-entropy) GRU4Rec(100units,BPR) GRU4Rec(100units,TOP1) GRU4Rec(1000units,TOP1) GRU4Rec(1000units,BPR)
  • 15. Online experiments +17.09% +16.10% +24.16% +23.69% +5.52% -3.21% +7.05% +6.29 0 0.2 0.4 0.6 0.8 1 1.2 1.4 RelativeCTR RNN Item-kNN Item-kNN-B CTR Default setup RNN • Default trains:  on ~10x events  more frequently • Absolute CTR increase: +0.9%±0.5%  (p=0.01)
  • 16. The next step in recsys technology • is deep learning • Besides session modelling  Incorporating content into the model directly  Modeling complex context-states based on sensory data (IoT)  Optimizing recommendations through deep reinforcement learning • Would you like to try something in this area?  Submit to DLRS 2016  dlrs-workshop.org
  • 17. Thank you! Detailed description of the RNN approach: • B. Hidasi, A. Karatzoglou, L. Baltrunas, D. Tikk: Session-based recommendations with recurrent neural networks. ICLR 2016. • http://arxiv.org/abs/1511.06939 • Public code: https://github.com/hidasib/GRU4Rec