Session-aware Linear Item-Item Models for
Session-based Recommendation
Minjin Choi1, Jinhong Kim1, Joonseok Lee2,3, Hyunjung Shim4, Jongwuk Lee1
Sungkyunkwan University (SKKU)1, Google Research2
Seoul National University3, Yonsei University4
Motivation
Session-based Recommendation (SR)
Predicting the next item(s) based only on the current session
• Session histories are usually much shorter than user histories.
(Illustration: one user's dated histories — 6 January, 13 January, 17 February — are treated as three anonymous sessions.)
Limitation of Existing SR Models
DNN models achieve strong accuracy but suffer from scalability issues.
• RNNs and GNNs are the models most commonly used for session-based recommendation.
• As the dataset grows, training these models becomes prohibitively expensive.
Balázs Hidasi et al., “Session-based Recommendations with Recurrent Neural Networks”, ICLR 2016
(Illustration: a chain of GRU cells processing the session sequence.)
Limitation of Existing SR Models
Neighborhood-based models struggle to capture complex dependencies between items in a session.
Diksha Garg et al., “Sequence and Time Aware Neighborhood for Session-based Recommendations: STAN”, SIGIR 2019
(Illustration: a target session and its top-1 neighbor session, with similarities sim = 0.6 and sim = 0.4.)
Research Question
How can we build an accurate and scalable session-based recommender model?
Our Key Contributions
We adopt linear models for session-based recommendation.
• Linear models perform remarkably well in traditional recommendation.
• However, naively applying them to session-based recommendation is not effective.
Item-item model $B \in \mathbb{R}^{n \times n}$; user-item matrix $X \in \mathbb{R}^{m \times n}$
Dataset: YC-1/64
Models     HR@20    MRR@20
GRU4Rec+   0.6528   0.2752
SKNN       0.6423   0.2522
SEASER     0.5443   0.1963

Harald Steck, “Embarrassingly Shallow Autoencoders for Sparse Data”, WWW 2019
SEASER performs worse than existing models!
$\operatorname*{argmin}_{B}\ \|X - XB\|_F^2 + \lambda \|B\|_F^2 \quad \text{s.t. } \operatorname{diag}(B) = 0$
$m$: # of users, $n$: # of items
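The EASE^R objective above has a well-known closed-form solution; a minimal NumPy sketch (function and variable names are ours, not from the slides):

```python
import numpy as np

def ease(X, lam=1.0):
    """Closed-form solution of argmin_B ||X - XB||_F^2 + lam*||B||_F^2
    subject to diag(B) = 0 (the EASE^R model)."""
    n = X.shape[1]
    P = np.linalg.inv(X.T @ X + lam * np.eye(n))
    B = -P / np.diag(P)          # column j is scaled by 1 / P[j, j]
    np.fill_diagonal(B, 0.0)     # enforce the zero-diagonal constraint
    return B

# Toy user-item matrix (3 users x 4 items).
X = np.array([[1, 1, 0, 0],
              [0, 1, 1, 0],
              [1, 0, 1, 1]], dtype=float)
B = ease(X, lam=1.0)
scores = X @ B   # row u holds predicted item scores for user u
```

The zero-diagonal constraint rules out the trivial solution $B = I$, in which each item would simply predict itself.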
Our Key Contributions
We design linear models to utilize some unique characteristics of
sessions.
• Some items tend to be consumed in a specific order. (dependency)
• Items in a session are often highly coherent. (consistency)
(Illustration: items within a session are coherent, e.g., tops (session consistency); some items are consumed in a specific order, e.g., pants → shoes (sequential dependency).)
Our Key Contributions
We design linear models to utilize some unique characteristics of
sessions.
• Sessions often reflect a recent trend. (timeliness)
• Users may consume the same item repeatedly within a session. (repeated items)
(Illustration: sessions dated 6 January, 13 January, and 17 February reflect the timeliness of sessions; the same item appearing twice illustrates repeated item consumption.)
Proposed Model
Overview of the Proposed Model
We devise two linear models to learn item similarity and item transition, and unify them to fully capture the complex relationships between items.
(Diagram: the full session representation feeds the item similarity model; the partial session representation feeds the item transition model; the two are unified into a single item similarity/transition model.)
Two Session Representations
We deal with two session representations for our linear models.
• (1) A single vector capturing the correlations between items while ignoring their order.
• (2) Past and future vectors representing the sequential dependency between items.
(Diagram: the full session representation is one vector per session; the partial session representation splits each session into past and future vectors.)
Session-aware Item Similarity Model (SLIS)
We learn a linear model on the full session representation to capture the similarity between items.
• The entire session is represented as a single vector.
• $B_{ij}$ is the learned similarity between items $i$ and $j$.
• Note: the zero-diagonal constraint is relaxed.
Full session representation $X \in \mathbb{R}^{|S| \times N}$; item similarity model $B \in \mathbb{R}^{N \times N}$ (dark cells denote high similarity)
$\operatorname*{argmin}_{B}\ \|X - XB\|_F^2 + \lambda \|B\|_F^2 \quad \text{s.t. } \operatorname{diag}(B) \le \xi$
$|S|$: # of sessions, $N$: # of items
Session-aware Item Transition Model (SLIT)
We learn a linear model on the partial session representation to capture the sequential dependency across items.
• Each session is divided into past vectors and future vectors.
• $B_{ij}$ captures the item transition from $i$ to $j$.
Partial session representation: past matrix $S$ and future matrix $T$, with $S, T \in \mathbb{R}^{|S'| \times N}$; item transition model $B \in \mathbb{R}^{N \times N}$
$\operatorname*{argmin}_{B}\ \|T - SB\|_F^2 + \lambda \|B\|_F^2$
(Dark cells denote high transition probability.)
$|S'|$: # of partial sessions, $N$: # of items
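To make the partial session representation concrete, here is a small sketch of how one session could be expanded into (past, future) row pairs; binary, unweighted vectors are assumed here for illustration (the actual model also applies session and position weights):

```python
import numpy as np

def partial_representations(session, n_items):
    """Expand one session [i_1, ..., i_k] into k-1 (past, future) pairs:
    for each split point j, past = {i_1..i_j}, future = {i_{j+1}..i_k}.
    Binary, unweighted vectors are assumed for illustration."""
    S_rows, T_rows = [], []
    for j in range(1, len(session)):
        past, future = np.zeros(n_items), np.zeros(n_items)
        past[session[:j]] = 1.0
        future[session[j:]] = 1.0
        S_rows.append(past)
        T_rows.append(future)
    return np.array(S_rows), np.array(T_rows)

# A session consuming items 2 -> 0 -> 3, out of 4 items total:
# yields past {2} -> future {0, 3} and past {2, 0} -> future {3}.
S, T = partial_representations([2, 0, 3], n_items=4)
```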
Item Similarity/Transition Model: SLIST
We unify two linear models by jointly optimizing both models.
(Diagram: the item similarity model, learned on the full session representation, and the item transition model, learned on the partial session representation, are jointly optimized into a unified item similarity/transition model $B \in \mathbb{R}^{N \times N}$.)
Weight Decay of Items in a Session
We assign higher weights to more recent sessions ($w_{time}$) and to more recent items within a session ($w_{pos}$, $w_{inf}$).
(Diagram: $w_{time}$ decays with session age; $w_{pos}$ decays with item order over the past part of the full and partial session representations.)
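The slides name the weights but not their exact functional form; an exponential decay, as is common in session-based models, is one plausible choice (the decay rates `delta` below are illustrative assumptions, not values from the slides):

```python
import numpy as np

def position_weights(k, delta=2.0):
    """Illustrative w_pos: the most recent of k items gets weight 1;
    earlier items decay exponentially with their distance from the end."""
    gaps = np.arange(k - 1, -1, -1.0)   # distance of each position from the last item
    return np.exp(-gaps / delta)

def time_weight(session_time, latest_time, delta=1.0):
    """Illustrative w_time: sessions closer to the latest timestamp weigh more."""
    return np.exp(-(latest_time - session_time) / delta)

w_pos = position_weights(4)       # monotonically increasing toward recent items
w_new = time_weight(10.0, 10.0)   # the newest session gets full weight 1.0
```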
Detail: Training and Inference
The objective function of SLIST:

$\operatorname*{argmin}_{B}\ \alpha \|W_{full} \odot (X - XB)\|_F^2 + (1-\alpha) \|W_{par} \odot (T - SB)\|_F^2 + \lambda \|B\|_F^2$

The first term is the item similarity model (SLIS); the second is the item transition model (SLIT).

Closed-form solution:

$B = I - \lambda P - (1-\alpha)\, P S^\top \operatorname{diagMat}(w_{par}^2)(S - T)$, where $P = (\alpha X^\top D_{full} X + (1-\alpha) S^\top D_{par} S + \lambda I)^{-1}$

Model inference: $t = s \cdot B$, where $s$ is the partial session representation vector (weighted by $w_{inf}$, decaying with item order) and $B$ is the learned item-item model.
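The closed-form solution can be checked numerically; below is a minimal sketch with uniform weights ($D_{full} = D_{par} = I$), so the diagMat terms drop out — the data here is random and purely illustrative:

```python
import numpy as np

def slist_closed_form(X, S, T, alpha=0.5, lam=10.0):
    """B = I - lam*P - (1-alpha) * P @ S^T @ (S - T), with
    P = (alpha*X^T X + (1-alpha)*S^T S + lam*I)^{-1}.
    Uniform weights are assumed (D_full = D_par = I)."""
    n = X.shape[1]
    P = np.linalg.inv(alpha * (X.T @ X) + (1 - alpha) * (S.T @ S) + lam * np.eye(n))
    return np.eye(n) - lam * P - (1 - alpha) * P @ S.T @ (S - T)

rng = np.random.default_rng(0)
X = (rng.random((6, 5)) < 0.5).astype(float)   # full session rows
S = (rng.random((8, 5)) < 0.5).astype(float)   # past rows
T = (rng.random((8, 5)) < 0.5).astype(float)   # future rows
B = slist_closed_form(X, S, T)

# Inference: score all items for a session prefix s.
s = np.array([1.0, 0.0, 0.0, 1.0, 0.0])
t_scores = s @ B
```

The closed form is the direct ridge minimizer $P(\alpha X^\top X + (1-\alpha) S^\top T)$, just rearranged; training requires a single $n \times n$ inverse, which is why its cost scales with the number of items rather than the number of sessions.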
Experiments
Experimental Setup: Dataset
We evaluate the accuracy and scalability of SLIST on public datasets.
• We use both 5-split and 1-split datasets.
Split     Dataset                    # of actions   # of sessions   # of items
1-split   YooChoose 1/64 (YC-1/64)        494,330         119,287       17,319
1-split   YooChoose 1/4 (YC-1/4)        7,909,307       1,939,891       30,638
1-split   DIGINETICA (DIGI1)              916,370         188,807       43,105
5-split   YooChoose (YC5)               5,426,961       1,375,128       28,582
5-split   DIGINETICA (DIGI5)              203,488          41,755       32,137
Evaluation Protocol and Metrics
Evaluation protocol: iterative revealing scheme
• We iteratively reveal the items of a session to the model, one at a time.
Evaluation metrics
• HR@20 and MRR@20: predicting only the next item in a session
• R@20 and MAP@20: considering all subsequent items of a session
(Illustration: over time, the session history is revealed step by step; at each step the input is the revealed prefix and the target is the remaining items.)
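For reference, HR@k and MRR@k for a single prediction step can be sketched as follows (the ranking itself would come from the recommender's scores):

```python
def hr_mrr_at_k(ranked_items, target, k=20):
    """HR@k: 1 if the target item appears in the top-k ranking, else 0.
    MRR@k: reciprocal rank of the target within the top-k, else 0."""
    topk = list(ranked_items)[:k]
    if target in topk:
        return 1.0, 1.0 / (topk.index(target) + 1)
    return 0.0, 0.0

hit, rr = hr_mrr_at_k([5, 2, 9, 7], target=9)   # target ranked 3rd
```

Averaging these per-step values over all revealed steps of all test sessions gives the reported HR@20 and MRR@20.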
Competitive Models
Two neighborhood-based models
• SKNN: a session-based KNN algorithm, a variant of user-based KNN.
• STAN: an improved version of SKNN that considers the sequential dependency and timeliness of sessions.
Four DNN-based models
• GRU4Rec+: an improved version of GRU4Rec using the top-k gain.
• NARM: an improved version of GRU4Rec+ using an attention mechanism.
• STAMP: an attention-based model capturing users' interests.
• SR-GNN: a GNN-based model capturing complex dependencies.
Dietmar Jannach et al., “When Recurrent Neural Networks meet the Neighborhood for Session-Based Recommendation”, RecSys 2017
Diksha Garg et al., “Sequence and Time Aware Neighborhood for Session-based Recommendations: STAN”, SIGIR 2019
Balázs Hidasi et al., “Session-based Recommendations with Recurrent Neural Networks”, ICLR 2016
Jing Li et al., “Neural Attentive Session-based Recommendation”, CIKM 2017
Qiao Liu et al., “STAMP: Short-Term Attention/Memory Priority Model for Session-based Recommendation”, KDD 2018
Shu Wu et al., “Session-Based Recommendation with Graph Neural Networks”, AAAI 2019
Accuracy: Ours vs. Competing Models
Our models consistently show competitive performance.
• No existing model achieves state-of-the-art performance across all datasets.
Dataset   Metric  SKNN    STAN    GRU4Rec+  NARM    STAMP   SR-GNN  SLIS    SLIT    SLIST   Gain(%)
YC-1/64   HR@20   0.6423  0.6838  0.6528    0.6998  0.6841  0.7021  0.7015  0.6968  0.7088  0.95
YC-1/64   R@20    0.4780  0.4955  0.4009    0.5051  0.4904  0.5049  0.5051  0.4976  0.5080  0.57
YC-1/4    HR@20   0.6329  0.6846  0.6940    0.7079  0.7021  0.7118  0.7119  0.7149  0.7175  0.80
YC-1/4    R@20    0.4756  0.4952  0.4887    0.5097  0.5008  0.5095  0.5121  0.5110  0.5130  0.65
DIGI1     HR@20   0.4846  0.5121  0.5097    0.4979  0.4690  0.4904  0.5247  0.4729  0.5291  3.32
DIGI1     R@20    0.3788  0.3965  0.3957    0.3890  0.3663  0.3811  0.4062  0.3651  0.4091  3.18
YC5       HR@20   0.5996  0.6656  0.6488    0.6751  0.6654  0.6713  0.6786  0.6838  0.6867  1.72
YC5       R@20    0.4658  0.4986  0.4837    0.5109  0.4979  0.5060  0.5074  0.5097  0.5122  0.25
DIGI5     HR@20   0.4748  0.4800  0.4639    0.4188  0.3917  0.4158  0.4939  0.4193  0.5005  4.27
DIGI5     R@20    0.3715  0.3720  0.3617    0.3254  0.3040  0.3232  0.3867  0.3260  0.3898  4.78
(SKNN, STAN: neighborhood-based; GRU4Rec+, NARM, STAMP, SR-GNN: DNN-based; SLIS, SLIT, SLIST: ours)
Scalability: Ours vs. Competing Models
Training SLIST is much faster than training DNN-based models.
• Its computational complexity is mainly proportional to the number of items.
• This is highly desirable in practice, especially for popular online services where millions of users create billions of session logs every day.
Training time by the ratio of training sessions on YC-1/4:

Models     5%       10%      20%      50%      100%
GRU4Rec+   177.4    317.8    614.0    1600.2   3206.9
NARM       2137.6   3431.2   7454.2   27804.5  72076.8
STAMP      434.1    647.5    985.3    2081.2   4083.3
SR-GNN     6780.5   18014.7  31444.3  68581.9  185862.5
SLIST      202.7    199.0    199.9    227.0    241.9
Gains (SLIST vs. SR-GNN): 33.5x, 90.6x, 157.3x, 302.1x, 768.2x
Ablation Study: Weight components
SLIST with all weight components always outperforms the ablated variants.
• Across the three datasets, weighting recent items at the inference phase ($w_{inf}$) is the most influential factor.
             YC-1/64          YC-1/4           DIGI1
Weights      HR      MRR      HR      MRR      HR      MRR
One weight   0.6764  0.2697   0.6784  0.2723   0.5111  0.1817
One weight   0.6898  0.2980   0.6947  0.3005   0.5173  0.1852
One weight   0.6570  0.2651   0.6675  0.2693   0.5055  0.1789
Two weights  0.7078  0.3077   0.7096  0.3115   0.5253  0.1880
Two weights  0.6761  0.2697   0.6841  0.2753   0.5145  0.1820
Two weights  0.6880  0.2973   0.7004  0.3031   0.5202  0.1855
No weights   0.6592  0.2652   0.6626  0.2671   0.5025  0.1786
All weights  0.7088  0.3083   0.7175  0.3161   0.5291  0.1886
(Each row enables a subset of {w_inf, w_pos, w_time}.)
Conclusion
We propose session-aware linear item-item models that fully capture various characteristics of sessions.
• SLIST: a session-aware linear similarity/transition model
Despite its simplicity, SLIST achieves competitive or state-of-the-art performance on various datasets.
SLIST is highly scalable thanks to the closed-form solutions of its linear models.
• The computational complexity of SLIST is proportional to the number of items, which is desirable for commercial systems.
Q&A
Code: https://github.com/jin530/SLIST
Email: zxcvxd@skku.edu
Appendix
Proof of Closed-form Solutions
Session-aware Linear Item Similarity Model (SLIS)
$\mathcal{L}(B) = \|W_{full} \odot (X - XB)\|_F^2 + \lambda \|B\|_F^2 \quad \text{s.t. } \operatorname{diag}(B) \le \xi$

Introducing Lagrange multipliers $\mu$ for the diagonal constraint:

$\mathcal{L}(B, \mu) = \|W_{full} \odot (X - XB)\|_F^2 + \lambda \|B\|_F^2 + 2\mu^\top \operatorname{diag}(B - \xi I)$

$\dfrac{\partial \mathcal{L}(B, \mu)}{\partial B} = (X^\top D_{full} X + \lambda I) B - X^\top D_{full} X + \operatorname{diagMat}(\mu)$

Setting the gradient to zero yields

$B = I - \lambda P - P \cdot \operatorname{diagMat}(\mu)$

where $D_{full} = \operatorname{diagMat}(w_{full}^2)$ and $P = (X^\top D_{full} X + \lambda I)^{-1}$.
Proof of Closed-form Solutions
Session-aware Linear Item Transition Model (SLIT)
$\mathcal{L}(B) = \|W_{par} \odot (T - SB)\|_F^2 + \lambda \|B\|_F^2$

$\dfrac{\partial \mathcal{L}(B)}{\partial B} = (S^\top D_{par} S + \lambda I) B - S^\top D_{par} T$

Setting the gradient to zero yields

$B = P \cdot (S^\top D_{par} T)$

where $D_{par} = \operatorname{diagMat}(w_{par}^2)$ and $P = (S^\top D_{par} S + \lambda I)^{-1}$.
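The weighted SLIT closed form can be verified numerically: with row-wise weights $w_{par}$, it coincides with ordinary ridge regression on the row-rescaled matrices (the data below is random and purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 12, 4
S = rng.normal(size=(m, n))          # past rows
T = rng.normal(size=(m, n))          # future rows
w = rng.uniform(0.5, 2.0, size=m)    # per-row weights w_par
D = np.diag(w ** 2)                  # D_par = diagMat(w_par^2)
lam = 3.0

# Closed form from the proof: B = (S^T D S + lam I)^{-1} S^T D T
B = np.linalg.inv(S.T @ D @ S + lam * np.eye(n)) @ (S.T @ D @ T)

# Reference: rescale each row by its weight and solve an unweighted ridge problem.
Sw, Tw = w[:, None] * S, w[:, None] * T
B_ref = np.linalg.solve(Sw.T @ Sw + lam * np.eye(n), Sw.T @ Tw)
```

Both routes produce the same $B$, since $S^\top D S = (wS)^\top (wS)$ when $D = \operatorname{diagMat}(w^2)$.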
Proof of Closed-form Solutions
Session-aware Linear Item Similarity/Transition Model (SLIST)
$\mathcal{L}(B) = \alpha \|W_{full} \odot (X - XB)\|_F^2 + (1-\alpha) \|W_{par} \odot (T - SB)\|_F^2 + \lambda \|B\|_F^2$

$\dfrac{\partial \mathcal{L}(B)}{\partial B} = (\alpha X^\top D_{full} X + (1-\alpha) S^\top D_{par} S + \lambda I) B - \alpha X^\top D_{full} X - (1-\alpha) S^\top D_{par} T$

Setting the gradient to zero yields

$B = I - \lambda P - (1-\alpha) P S^\top D_{par} (S - T)$

where $D_{full} = \operatorname{diagMat}(w_{full}^2)$, $D_{par} = \operatorname{diagMat}(w_{par}^2)$, and $P = (\alpha X^\top D_{full} X + (1-\alpha) S^\top D_{par} S + \lambda I)^{-1}$.

More Related Content

What's hot

Overview of tree algorithms from decision tree to xgboost
Overview of tree algorithms from decision tree to xgboostOverview of tree algorithms from decision tree to xgboost
Overview of tree algorithms from decision tree to xgboost
Takami Sato
 
2014.01.23 prml勉強会4.2確率的生成モデル
2014.01.23 prml勉強会4.2確率的生成モデル2014.01.23 prml勉強会4.2確率的生成モデル
2014.01.23 prml勉強会4.2確率的生成モデルTakeshi Sakaki
 
Prml 最尤推定からベイズ曲線フィッティング
Prml 最尤推定からベイズ曲線フィッティングPrml 最尤推定からベイズ曲線フィッティング
Prml 最尤推定からベイズ曲線フィッティング
takutori
 
TabNetの論文紹介
TabNetの論文紹介TabNetの論文紹介
TabNetの論文紹介
西岡 賢一郎
 
2013.12.26 prml勉強会 線形回帰モデル3.2~3.4
2013.12.26 prml勉強会 線形回帰モデル3.2~3.42013.12.26 prml勉強会 線形回帰モデル3.2~3.4
2013.12.26 prml勉強会 線形回帰モデル3.2~3.4
Takeshi Sakaki
 
アンサンブル学習
アンサンブル学習アンサンブル学習
アンサンブル学習
Hidekazu Tanaka
 
PRML輪読#6
PRML輪読#6PRML輪読#6
PRML輪読#6
matsuolab
 
EMアルゴリズム
EMアルゴリズムEMアルゴリズム
[DL輪読会]Deep Learning 第5章 機械学習の基礎
[DL輪読会]Deep Learning 第5章 機械学習の基礎[DL輪読会]Deep Learning 第5章 機械学習の基礎
[DL輪読会]Deep Learning 第5章 機械学習の基礎
Deep Learning JP
 
PRML第6章「カーネル法」
PRML第6章「カーネル法」PRML第6章「カーネル法」
PRML第6章「カーネル法」
Keisuke Sugawara
 
4 データ間の距離と類似度
4 データ間の距離と類似度4 データ間の距離と類似度
4 データ間の距離と類似度
Seiichi Uchida
 
PRML読書会1スライド(公開用)
PRML読書会1スライド(公開用)PRML読書会1スライド(公開用)
PRML読書会1スライド(公開用)
tetsuro ito
 
Pythonで時系列のデータを分析してみよう
Pythonで時系列のデータを分析してみようPythonで時系列のデータを分析してみよう
Pythonで時系列のデータを分析してみよう
Tatuya Kobayashi
 
Prml3.5 エビデンス近似〜
Prml3.5 エビデンス近似〜Prml3.5 エビデンス近似〜
Prml3.5 エビデンス近似〜
Yuki Matsubara
 
PRML輪読#10
PRML輪読#10PRML輪読#10
PRML輪読#10
matsuolab
 
変分ベイズ法の説明
変分ベイズ法の説明変分ベイズ法の説明
変分ベイズ法の説明
Haruka Ozaki
 
混合モデルとEMアルゴリズム(PRML第9章)
混合モデルとEMアルゴリズム(PRML第9章)混合モデルとEMアルゴリズム(PRML第9章)
混合モデルとEMアルゴリズム(PRML第9章)
Takao Yamanaka
 
【技術解説20】 ミニバッチ確率的勾配降下法
【技術解説20】 ミニバッチ確率的勾配降下法【技術解説20】 ミニバッチ確率的勾配降下法
【技術解説20】 ミニバッチ確率的勾配降下法
Ruo Ando
 
ベイズ統計学の概論的紹介
ベイズ統計学の概論的紹介ベイズ統計学の概論的紹介
ベイズ統計学の概論的紹介
Naoki Hayashi
 
Prml 1.2,4 5,1.3|輪講資料1120
Prml 1.2,4 5,1.3|輪講資料1120Prml 1.2,4 5,1.3|輪講資料1120
Prml 1.2,4 5,1.3|輪講資料1120
Hayato K
 

What's hot (20)

Overview of tree algorithms from decision tree to xgboost
Overview of tree algorithms from decision tree to xgboostOverview of tree algorithms from decision tree to xgboost
Overview of tree algorithms from decision tree to xgboost
 
2014.01.23 prml勉強会4.2確率的生成モデル
2014.01.23 prml勉強会4.2確率的生成モデル2014.01.23 prml勉強会4.2確率的生成モデル
2014.01.23 prml勉強会4.2確率的生成モデル
 
Prml 最尤推定からベイズ曲線フィッティング
Prml 最尤推定からベイズ曲線フィッティングPrml 最尤推定からベイズ曲線フィッティング
Prml 最尤推定からベイズ曲線フィッティング
 
TabNetの論文紹介
TabNetの論文紹介TabNetの論文紹介
TabNetの論文紹介
 
2013.12.26 prml勉強会 線形回帰モデル3.2~3.4
2013.12.26 prml勉強会 線形回帰モデル3.2~3.42013.12.26 prml勉強会 線形回帰モデル3.2~3.4
2013.12.26 prml勉強会 線形回帰モデル3.2~3.4
 
アンサンブル学習
アンサンブル学習アンサンブル学習
アンサンブル学習
 
PRML輪読#6
PRML輪読#6PRML輪読#6
PRML輪読#6
 
EMアルゴリズム
EMアルゴリズムEMアルゴリズム
EMアルゴリズム
 
[DL輪読会]Deep Learning 第5章 機械学習の基礎
[DL輪読会]Deep Learning 第5章 機械学習の基礎[DL輪読会]Deep Learning 第5章 機械学習の基礎
[DL輪読会]Deep Learning 第5章 機械学習の基礎
 
PRML第6章「カーネル法」
PRML第6章「カーネル法」PRML第6章「カーネル法」
PRML第6章「カーネル法」
 
4 データ間の距離と類似度
4 データ間の距離と類似度4 データ間の距離と類似度
4 データ間の距離と類似度
 
PRML読書会1スライド(公開用)
PRML読書会1スライド(公開用)PRML読書会1スライド(公開用)
PRML読書会1スライド(公開用)
 
Pythonで時系列のデータを分析してみよう
Pythonで時系列のデータを分析してみようPythonで時系列のデータを分析してみよう
Pythonで時系列のデータを分析してみよう
 
Prml3.5 エビデンス近似〜
Prml3.5 エビデンス近似〜Prml3.5 エビデンス近似〜
Prml3.5 エビデンス近似〜
 
PRML輪読#10
PRML輪読#10PRML輪読#10
PRML輪読#10
 
変分ベイズ法の説明
変分ベイズ法の説明変分ベイズ法の説明
変分ベイズ法の説明
 
混合モデルとEMアルゴリズム(PRML第9章)
混合モデルとEMアルゴリズム(PRML第9章)混合モデルとEMアルゴリズム(PRML第9章)
混合モデルとEMアルゴリズム(PRML第9章)
 
【技術解説20】 ミニバッチ確率的勾配降下法
【技術解説20】 ミニバッチ確率的勾配降下法【技術解説20】 ミニバッチ確率的勾配降下法
【技術解説20】 ミニバッチ確率的勾配降下法
 
ベイズ統計学の概論的紹介
ベイズ統計学の概論的紹介ベイズ統計学の概論的紹介
ベイズ統計学の概論的紹介
 
Prml 1.2,4 5,1.3|輪講資料1120
Prml 1.2,4 5,1.3|輪講資料1120Prml 1.2,4 5,1.3|輪講資料1120
Prml 1.2,4 5,1.3|輪講資料1120
 

Similar to Session-aware Linear Item-Item Models for Session-based Recommendation (WWW 2021)

V Jornadas eMadrid sobre "Educación Digital". Cristina Conati, University of ...
V Jornadas eMadrid sobre "Educación Digital". Cristina Conati, University of ...V Jornadas eMadrid sobre "Educación Digital". Cristina Conati, University of ...
V Jornadas eMadrid sobre "Educación Digital". Cristina Conati, University of ...
eMadrid network
 
Talk@rmit 09112017
Talk@rmit 09112017Talk@rmit 09112017
Talk@rmit 09112017
Shuai Zhang
 
Gunjan insight student conference v2
Gunjan insight student conference v2Gunjan insight student conference v2
Gunjan insight student conference v2
Gunjan Kumar
 
Context-aware preference modeling with factorization
Context-aware preference modeling with factorizationContext-aware preference modeling with factorization
Context-aware preference modeling with factorization
Balázs Hidasi
 
GKumarAICS
GKumarAICSGKumarAICS
GKumarAICS
Gunjan Kumar
 
M033059064
M033059064M033059064
M033059064
ijceronline
 
Efficient Online Testing for DNN-Enabled Systems using Surrogate-Assisted and...
Efficient Online Testing for DNN-Enabled Systems using Surrogate-Assisted and...Efficient Online Testing for DNN-Enabled Systems using Surrogate-Assisted and...
Efficient Online Testing for DNN-Enabled Systems using Surrogate-Assisted and...
Lionel Briand
 
Object-Oriented Design Fundamentals.pptx
Object-Oriented Design Fundamentals.pptxObject-Oriented Design Fundamentals.pptx
Object-Oriented Design Fundamentals.pptx
RaflyRizky2
 
Crafting Recommenders: the Shallow and the Deep of it!
Crafting Recommenders: the Shallow and the Deep of it! Crafting Recommenders: the Shallow and the Deep of it!
Crafting Recommenders: the Shallow and the Deep of it!
Sudeep Das, Ph.D.
 
Enabling Real-Time Adaptivity in MOOCs with a Personalized Next-Step Recommen...
Enabling Real-Time Adaptivity in MOOCs with a Personalized Next-Step Recommen...Enabling Real-Time Adaptivity in MOOCs with a Personalized Next-Step Recommen...
Enabling Real-Time Adaptivity in MOOCs with a Personalized Next-Step Recommen...
Daniel Davis
 
Local collaborative autoencoders (WSDM2021)
Local collaborative autoencoders (WSDM2021)Local collaborative autoencoders (WSDM2021)
Local collaborative autoencoders (WSDM2021)
민진 최
 
B2 2006 sizing_benchmarking
B2 2006 sizing_benchmarkingB2 2006 sizing_benchmarking
B2 2006 sizing_benchmarking
Steve Feldman
 
B2 2006 sizing_benchmarking (1)
B2 2006 sizing_benchmarking (1)B2 2006 sizing_benchmarking (1)
B2 2006 sizing_benchmarking (1)
Steve Feldman
 
Continuous Evaluation of Collaborative Recommender Systems in Data Stream Man...
Continuous Evaluation of Collaborative Recommender Systems in Data Stream Man...Continuous Evaluation of Collaborative Recommender Systems in Data Stream Man...
Continuous Evaluation of Collaborative Recommender Systems in Data Stream Man...
Dr. Cornelius Ludmann
 
00 Automatic Mental Health Classification in Online Settings and Language Emb...
00 Automatic Mental Health Classification in Online Settings and Language Emb...00 Automatic Mental Health Classification in Online Settings and Language Emb...
00 Automatic Mental Health Classification in Online Settings and Language Emb...
Duke Network Analysis Center
 
Beyond data and model parallelism for deep neural networks
Beyond data and model parallelism for deep neural networksBeyond data and model parallelism for deep neural networks
Beyond data and model parallelism for deep neural networks
JunKudo2
 
Proto Spiral.ppt Proto Spiral.ppt Proto Spiral.ppt Proto Spiral.ppt
Proto Spiral.ppt Proto Spiral.ppt Proto Spiral.ppt Proto Spiral.pptProto Spiral.ppt Proto Spiral.ppt Proto Spiral.ppt Proto Spiral.ppt
Proto Spiral.ppt Proto Spiral.ppt Proto Spiral.ppt Proto Spiral.ppt
AnirbanBhar3
 
Big learning 1.2
Big learning   1.2Big learning   1.2
Big learning 1.2
Mohit Garg
 
Apereo Webinar: Learning What Works When Scaling Analytics Infrastructure (Ja...
Apereo Webinar: Learning What Works When Scaling Analytics Infrastructure (Ja...Apereo Webinar: Learning What Works When Scaling Analytics Infrastructure (Ja...
Apereo Webinar: Learning What Works When Scaling Analytics Infrastructure (Ja...
Unicon, Inc.
 
Neural networks for semantic gaze analysis in xr settings
Neural networks for semantic gaze analysis in xr settingsNeural networks for semantic gaze analysis in xr settings
Neural networks for semantic gaze analysis in xr settings
Jaey Jeong
 

Similar to Session-aware Linear Item-Item Models for Session-based Recommendation (WWW 2021) (20)

V Jornadas eMadrid sobre "Educación Digital". Cristina Conati, University of ...
V Jornadas eMadrid sobre "Educación Digital". Cristina Conati, University of ...V Jornadas eMadrid sobre "Educación Digital". Cristina Conati, University of ...
V Jornadas eMadrid sobre "Educación Digital". Cristina Conati, University of ...
 
Talk@rmit 09112017
Talk@rmit 09112017Talk@rmit 09112017
Talk@rmit 09112017
 
Gunjan insight student conference v2
Gunjan insight student conference v2Gunjan insight student conference v2
Gunjan insight student conference v2
 
Context-aware preference modeling with factorization
Context-aware preference modeling with factorizationContext-aware preference modeling with factorization
Context-aware preference modeling with factorization
 
GKumarAICS
GKumarAICSGKumarAICS
GKumarAICS
 
M033059064
M033059064M033059064
M033059064
 
Efficient Online Testing for DNN-Enabled Systems using Surrogate-Assisted and...
Efficient Online Testing for DNN-Enabled Systems using Surrogate-Assisted and...Efficient Online Testing for DNN-Enabled Systems using Surrogate-Assisted and...
Efficient Online Testing for DNN-Enabled Systems using Surrogate-Assisted and...
 
Object-Oriented Design Fundamentals.pptx
Object-Oriented Design Fundamentals.pptxObject-Oriented Design Fundamentals.pptx
Object-Oriented Design Fundamentals.pptx
 
Crafting Recommenders: the Shallow and the Deep of it!
Crafting Recommenders: the Shallow and the Deep of it! Crafting Recommenders: the Shallow and the Deep of it!
Crafting Recommenders: the Shallow and the Deep of it!
 
Enabling Real-Time Adaptivity in MOOCs with a Personalized Next-Step Recommen...
Enabling Real-Time Adaptivity in MOOCs with a Personalized Next-Step Recommen...Enabling Real-Time Adaptivity in MOOCs with a Personalized Next-Step Recommen...
Enabling Real-Time Adaptivity in MOOCs with a Personalized Next-Step Recommen...
 
Local collaborative autoencoders (WSDM2021)
Local collaborative autoencoders (WSDM2021)Local collaborative autoencoders (WSDM2021)
Local collaborative autoencoders (WSDM2021)
 
B2 2006 sizing_benchmarking
B2 2006 sizing_benchmarkingB2 2006 sizing_benchmarking
B2 2006 sizing_benchmarking
 
B2 2006 sizing_benchmarking (1)
B2 2006 sizing_benchmarking (1)B2 2006 sizing_benchmarking (1)
B2 2006 sizing_benchmarking (1)
 
Continuous Evaluation of Collaborative Recommender Systems in Data Stream Man...
Continuous Evaluation of Collaborative Recommender Systems in Data Stream Man...Continuous Evaluation of Collaborative Recommender Systems in Data Stream Man...
Continuous Evaluation of Collaborative Recommender Systems in Data Stream Man...
 
00 Automatic Mental Health Classification in Online Settings and Language Emb...
00 Automatic Mental Health Classification in Online Settings and Language Emb...00 Automatic Mental Health Classification in Online Settings and Language Emb...
00 Automatic Mental Health Classification in Online Settings and Language Emb...
 
Beyond data and model parallelism for deep neural networks
Beyond data and model parallelism for deep neural networksBeyond data and model parallelism for deep neural networks
Beyond data and model parallelism for deep neural networks
 
Proto Spiral.ppt Proto Spiral.ppt Proto Spiral.ppt Proto Spiral.ppt
Proto Spiral.ppt Proto Spiral.ppt Proto Spiral.ppt Proto Spiral.pptProto Spiral.ppt Proto Spiral.ppt Proto Spiral.ppt Proto Spiral.ppt
Proto Spiral.ppt Proto Spiral.ppt Proto Spiral.ppt Proto Spiral.ppt
 
Big learning 1.2
Big learning   1.2Big learning   1.2
Big learning 1.2
 
Apereo Webinar: Learning What Works When Scaling Analytics Infrastructure (Ja...
Apereo Webinar: Learning What Works When Scaling Analytics Infrastructure (Ja...Apereo Webinar: Learning What Works When Scaling Analytics Infrastructure (Ja...
Apereo Webinar: Learning What Works When Scaling Analytics Infrastructure (Ja...
 
Neural networks for semantic gaze analysis in xr settings
Neural networks for semantic gaze analysis in xr settingsNeural networks for semantic gaze analysis in xr settings
Neural networks for semantic gaze analysis in xr settings
 

Recently uploaded

_Extraction of Ethylene oxide and 2-Chloroethanol from alternate matrices Li...
_Extraction of Ethylene oxide and 2-Chloroethanol from alternate matrices  Li..._Extraction of Ethylene oxide and 2-Chloroethanol from alternate matrices  Li...
_Extraction of Ethylene oxide and 2-Chloroethanol from alternate matrices Li...
LucyHearn1
 
Methods of grain storage Structures in India.pdf
Methods of grain storage Structures in India.pdfMethods of grain storage Structures in India.pdf
Methods of grain storage Structures in India.pdf
PirithiRaju
 
The binding of cosmological structures by massless topological defects
The binding of cosmological structures by massless topological defectsThe binding of cosmological structures by massless topological defects
The binding of cosmological structures by massless topological defects
Sérgio Sacani
 
SDSS1335+0728: The awakening of a ∼ 106M⊙ black hole⋆
SDSS1335+0728: The awakening of a ∼ 106M⊙ black hole⋆SDSS1335+0728: The awakening of a ∼ 106M⊙ black hole⋆
SDSS1335+0728: The awakening of a ∼ 106M⊙ black hole⋆
Sérgio Sacani
 
Travis Hills of MN is Making Clean Water Accessible to All Through High Flux ...
Travis Hills of MN is Making Clean Water Accessible to All Through High Flux ...Travis Hills of MN is Making Clean Water Accessible to All Through High Flux ...
Travis Hills of MN is Making Clean Water Accessible to All Through High Flux ...
Travis Hills MN
 
Evidence of Jet Activity from the Secondary Black Hole in the OJ 287 Binary S...
Evidence of Jet Activity from the Secondary Black Hole in the OJ 287 Binary S...Evidence of Jet Activity from the Secondary Black Hole in the OJ 287 Binary S...
Evidence of Jet Activity from the Secondary Black Hole in the OJ 287 Binary S...
Sérgio Sacani
 
Authoring a personal GPT for your research and practice: How we created the Q...
Authoring a personal GPT for your research and practice: How we created the Q...Authoring a personal GPT for your research and practice: How we created the Q...
Authoring a personal GPT for your research and practice: How we created the Q...
Leonel Morgado
 
Randomised Optimisation Algorithms in DAPHNE
Randomised Optimisation Algorithms in DAPHNERandomised Optimisation Algorithms in DAPHNE
Randomised Optimisation Algorithms in DAPHNE
University of Maribor
 
Describing and Interpreting an Immersive Learning Case with the Immersion Cub...
Describing and Interpreting an Immersive Learning Case with the Immersion Cub...Describing and Interpreting an Immersive Learning Case with the Immersion Cub...
Describing and Interpreting an Immersive Learning Case with the Immersion Cub...
Leonel Morgado
 
LEARNING TO LIVE WITH LAWS OF MOTION .pptx
LEARNING TO LIVE WITH LAWS OF MOTION .pptxLEARNING TO LIVE WITH LAWS OF MOTION .pptx
LEARNING TO LIVE WITH LAWS OF MOTION .pptx
yourprojectpartner05
 
cathode ray oscilloscope and its applications
cathode ray oscilloscope and its applicationscathode ray oscilloscope and its applications
cathode ray oscilloscope and its applications
sandertein
 
JAMES WEBB STUDY THE MASSIVE BLACK HOLE SEEDS
JAMES WEBB STUDY THE MASSIVE BLACK HOLE SEEDSJAMES WEBB STUDY THE MASSIVE BLACK HOLE SEEDS
JAMES WEBB STUDY THE MASSIVE BLACK HOLE SEEDS
Sérgio Sacani
 
(June 12, 2024) Webinar: Development of PET theranostics targeting the molecu...
(June 12, 2024) Webinar: Development of PET theranostics targeting the molecu...(June 12, 2024) Webinar: Development of PET theranostics targeting the molecu...
(June 12, 2024) Webinar: Development of PET theranostics targeting the molecu...
Scintica Instrumentation
 
Sexuality - Issues, Attitude and Behaviour - Applied Social Psychology - Psyc...
Sexuality - Issues, Attitude and Behaviour - Applied Social Psychology - Psyc...Sexuality - Issues, Attitude and Behaviour - Applied Social Psychology - Psyc...
Sexuality - Issues, Attitude and Behaviour - Applied Social Psychology - Psyc...
PsychoTech Services
 
Juaristi, Jon. - El canon espanol. El legado de la cultura española a la civi...
Juaristi, Jon. - El canon espanol. El legado de la cultura española a la civi...Juaristi, Jon. - El canon espanol. El legado de la cultura española a la civi...
Juaristi, Jon. - El canon espanol. El legado de la cultura española a la civi...
frank0071
 
11.1 Role of physical biological in deterioration of grains.pdf
11.1 Role of physical biological in deterioration of grains.pdf11.1 Role of physical biological in deterioration of grains.pdf
11.1 Role of physical biological in deterioration of grains.pdf
PirithiRaju
 
Summary Of transcription and Translation.pdf
Summary Of transcription and Translation.pdfSummary Of transcription and Translation.pdf
Summary Of transcription and Translation.pdf
vadgavevedant86
 
23PH301 - Optics - Optical Lenses.pptx
23PH301 - Optics  -  Optical Lenses.pptx23PH301 - Optics  -  Optical Lenses.pptx
23PH301 - Optics - Optical Lenses.pptx
RDhivya6
 
Anti-Universe And Emergent Gravity and the Dark Universe
Anti-Universe And Emergent Gravity and the Dark UniverseAnti-Universe And Emergent Gravity and the Dark Universe
Anti-Universe And Emergent Gravity and the Dark Universe
Sérgio Sacani
 
MICROBIAL INTERACTION PPT/ MICROBIAL INTERACTION AND THEIR TYPES // PLANT MIC...
MICROBIAL INTERACTION PPT/ MICROBIAL INTERACTION AND THEIR TYPES // PLANT MIC...MICROBIAL INTERACTION PPT/ MICROBIAL INTERACTION AND THEIR TYPES // PLANT MIC...
MICROBIAL INTERACTION PPT/ MICROBIAL INTERACTION AND THEIR TYPES // PLANT MIC...
ABHISHEK SONI NIMT INSTITUTE OF MEDICAL AND PARAMEDCIAL SCIENCES , GOVT PG COLLEGE NOIDA
 

Recently uploaded (20)

_Extraction of Ethylene oxide and 2-Chloroethanol from alternate matrices Li...
_Extraction of Ethylene oxide and 2-Chloroethanol from alternate matrices  Li..._Extraction of Ethylene oxide and 2-Chloroethanol from alternate matrices  Li...
_Extraction of Ethylene oxide and 2-Chloroethanol from alternate matrices Li...
 
Methods of grain storage Structures in India.pdf
Methods of grain storage Structures in India.pdfMethods of grain storage Structures in India.pdf
Methods of grain storage Structures in India.pdf
 
The binding of cosmological structures by massless topological defects
The binding of cosmological structures by massless topological defectsThe binding of cosmological structures by massless topological defects
The binding of cosmological structures by massless topological defects
 
SDSS1335+0728: The awakening of a ∼ 106M⊙ black hole⋆
SDSS1335+0728: The awakening of a ∼ 106M⊙ black hole⋆SDSS1335+0728: The awakening of a ∼ 106M⊙ black hole⋆
SDSS1335+0728: The awakening of a ∼ 106M⊙ black hole⋆
 
Travis Hills of MN is Making Clean Water Accessible to All Through High Flux ...
Travis Hills of MN is Making Clean Water Accessible to All Through High Flux ...Travis Hills of MN is Making Clean Water Accessible to All Through High Flux ...
Travis Hills of MN is Making Clean Water Accessible to All Through High Flux ...
 
Session-aware Linear Item-Item Models for Session-based Recommendation (WWW 2021)

• Items in a session are often highly coherent. (consistency)
8
6, January
13, January
17, February
Session consistency (tops)
Sequential dependency (pants -> shoes)
Our Key Contributions
We design linear models to utilize some unique characteristics of sessions.
• Sessions often reflect a recent trend. (timeliness)
• A user might repeatedly consume the same item within a session. (repeated item)
9
6, January
13, January
17, February
Repeated item consumption
Timeliness of sessions
Overview of the Proposed Model
We devise two linear models to learn item similarity and item transition, and unify them to fully capture the complex relationships between items.
11
Full session representation
Partial session representation
Item similarity model
Item transition model
Item similarity/transition model
Two Session Representations
We deal with two session representations for our linear models.
• (1) A single vector to capture the correlations between items while ignoring their order.
• (2) Past and future vectors to represent the sequential dependency of items.
12
Full session representation
Partial session representation
past future
Session-aware Item Similarity Model (SLIS)
We learn a linear model on the full session representation to capture the similarity between items.
• The entire session is represented as a single vector.
• B_ij indicates the learned item similarity between items i and j.
• Note: the diagonal constraint is relaxed.

argmin_B ‖X − XB‖_F^2 + λ‖B‖_F^2   s.t. diag(B) ≤ ξ

13
Full session representation X ∈ ℝ^(|S|×n)
Item similarity model B ∈ ℝ^(n×n)
|S|: # of sessions, n: # of items
Dark cells mean high similarities.
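As a sketch of how such a closed-form item-item model is fitted: the strict zero-diagonal variant (the EASE closed form; SLIS relaxes this to diag(B) ≤ ξ) needs only one matrix inversion. The session-item matrix X below is a toy example, not data from the paper.

```python
import numpy as np

def fit_slis(X, lam=10.0):
    """Closed-form item-item similarity model on full-session vectors.

    For brevity this sketch enforces diag(B) = 0 exactly (the EASE
    closed form); SLIS instead relaxes the constraint to diag(B) <= xi.
    """
    n = X.shape[1]
    G = X.T @ X + lam * np.eye(n)       # Gram matrix plus ridge term
    P = np.linalg.inv(G)
    B = np.eye(n) - P / np.diag(P)      # B_ij = -P_ij / P_jj, so diag(B) = 0
    return B

# toy session-item matrix: 4 sessions over 3 items
X = np.array([[1., 1., 0.],
              [1., 1., 0.],
              [0., 1., 1.],
              [1., 0., 1.]])
B = fit_slis(X)
scores = X[0] @ B   # score all items for the first session
```

Note that `P / np.diag(P)` divides each column j by P_jj via broadcasting, which is exactly the column-wise rescaling the zero-diagonal Lagrangian solution requires.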
Session-aware Item Transition Model (SLIT)
We learn a linear model on the partial session representation to capture the sequential dependency across items.
• A session is divided into past vectors and future vectors.
• B_ij indicates the item transition from i to j.

argmin_B ‖T − SB‖_F^2 + λ‖B‖_F^2

14
Partial session representation S, T ∈ ℝ^(|S'|×n) (past, future)
Item transition model B ∈ ℝ^(n×n)
|S'|: # of partial sessions, n: # of items
Dark cells mean high transition probabilities.
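A minimal sketch of the past/future construction and the ridge closed form B = (SᵀS + λI)⁻¹ SᵀT, assuming uniform (unweighted) past and future vectors; `expand_session` is a hypothetical helper name.

```python
import numpy as np

def expand_session(session, n_items):
    """Split one session into (past, future) aggregate vectors.

    Each prefix of the session becomes a 'past' row of S, and the
    remaining items become the matching 'future' row of T (uniform
    weights here; SLIT additionally applies position-decay weights).
    """
    S, T = [], []
    for k in range(1, len(session)):
        s = np.zeros(n_items)
        t = np.zeros(n_items)
        s[session[:k]] = 1.0
        t[session[k:]] = 1.0
        S.append(s)
        T.append(t)
    return np.array(S), np.array(T)

def fit_slit(S, T, lam=10.0):
    # closed-form ridge regression: B = (S^T S + lam*I)^{-1} S^T T
    n = S.shape[1]
    return np.linalg.solve(S.T @ S + lam * np.eye(n), S.T @ T)

S, T = expand_session([0, 1, 2], n_items=4)
B = fit_slit(S, T, lam=0.1)
```

Because item 0 never appears in any future vector of this toy session, the learned column B[:, 0] is exactly zero, while the observed transition 0 → 1 receives positive weight.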
Item Similarity/Transition Model: SLIST
We unify the two linear models by jointly optimizing both objectives.
15
Item similarity/transition model B ∈ ℝ^(n×n)
Full session representation / Partial session representation
Item similarity model / Item transition model
Weight Decay of Items in a Session
We assign higher weights to more recent sessions (w_time) and to more recent items (w_pos, w_inf) within a session.
16
Full and partial session representation: w_time (time), w_pos (order), past/future
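The slide does not fix the exact decay schedules; as an illustrative assumption, exponential decay is a common choice for both the position and time weights (the `delta` values below are hypothetical hyperparameters).

```python
import numpy as np

def position_decay(length, delta=1.0):
    """Exponential position decay (a sketch): the most recent item in a
    past vector gets weight 1, earlier items decay with age."""
    ages = np.arange(length - 1, -1, -1)   # age 0 = most recent item
    return np.exp(-ages / delta)

def time_decay(session_time, latest_time, delta_days=30.0):
    # sessions closer to the most recent timestamp get higher weight
    return np.exp(-(latest_time - session_time) / delta_days)

w = position_decay(4, delta=2.0)   # weights for a 4-item past vector
```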
Detail: Training and Inference
The objective function of SLIST:

argmin_B α‖W_full ⊙ (X − XB)‖_F^2 + (1 − α)‖W_par ⊙ (T − SB)‖_F^2 + λ‖B‖_F^2

• The first term is the item similarity model (SLIS); the second is the item transition model (SLIT).

Closed-form solution:

B = I − λP − (1 − α) P S⊤ diagMat(w_par^2)(S − T)

Model inference: t = s · B, where s is the partial representation vector (weighted by w_inf (order)) and B is the learned item-item model.
17
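A compact sketch of the joint fit and inference, with the weighting matrices W_full and W_par set to all ones for brevity; the toy matrices and the `recommend` helper are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def fit_slist(X, S, T, alpha=0.5, lam=10.0):
    """Closed-form solution of the joint SLIST objective (unweighted
    sketch): B = A^{-1} C with
    A = alpha X^T X + (1-alpha) S^T S + lam I,
    C = alpha X^T X + (1-alpha) S^T T."""
    n = X.shape[1]
    A = alpha * X.T @ X + (1 - alpha) * S.T @ S + lam * np.eye(n)
    C = alpha * X.T @ X + (1 - alpha) * S.T @ T
    return np.linalg.solve(A, C)

def recommend(past_items, B, k=3):
    # inference t = s * B: score all items from the current session prefix
    s = np.zeros(B.shape[0])
    s[past_items] = 1.0
    return np.argsort(-(s @ B))[:k]

# toy data: 2 full sessions, 2 (past, future) pairs, 4 items
X = np.array([[1., 1., 1., 0.],
              [0., 1., 1., 1.]])
S = np.array([[1., 0., 0., 0.],
              [1., 1., 0., 0.]])
T = np.array([[0., 1., 1., 0.],
              [0., 0., 1., 1.]])
B = fit_slist(X, S, T, alpha=0.5, lam=1.0)
recs = recommend([0], B, k=2)
```

With α = 1 this reduces to the (unconstrained) similarity-only ridge solution, and with α = 0 to the transition-only solution, which is how the two sub-models are blended.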
Experimental Setup: Dataset
We evaluate the accuracy and scalability of SLIST over public datasets.
• We use both 5-split and 1-split datasets.
19

Split     Dataset                    # of actions   # of sessions   # of items
1-split   YooChoose 1/64 (YC-1/64)   494,330        119,287         17,319
1-split   YooChoose 1/4 (YC-1/4)     7,909,307      1,939,891       30,638
1-split   DIGINETICA (DIGI1)         916,370        188,807         43,105
5-split   YooChoose (YC5)            5,426,961      1,375,128       28,582
5-split   DIGINETICA (DIGI5)         203,488        41,755          32,137
Evaluation Protocol and Metrics
Evaluation protocol: iterative revealing scheme
• We iteratively expose the items of a session to the model.
Evaluation metrics
• HR@20 and MRR@20: to predict only the next item in a session.
• R@20 and MAP@20: to consider all subsequent items of a session.
20
Session history revealed over time as (input & target) pairs.
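The revealing loop can be sketched as follows; the popularity-style `rank_fn` is a stand-in for illustration, not one of the evaluated models.

```python
def evaluate_session(session, rank_fn, k=20):
    """Iterative revealing: reveal the session one item at a time; after
    each prefix, check the true next item against the model's top-k list.
    Returns (HR@k, MRR@k) averaged over all prediction steps."""
    hits, rrs = [], []
    for cut in range(1, len(session)):
        topk = list(rank_fn(session[:cut], k))  # ranked item ids, best first
        target = session[cut]
        if target in topk:
            hits.append(1.0)
            rrs.append(1.0 / (topk.index(target) + 1))
        else:
            hits.append(0.0)
            rrs.append(0.0)
    return sum(hits) / len(hits), sum(rrs) / len(rrs)

# trivial fixed ranker, for illustration only
rank_fn = lambda prefix, k: [1, 2, 3][:k]
hr, mrr = evaluate_session([0, 1, 2, 3], rank_fn, k=3)
```

R@k and MAP@k differ only in comparing the top-k list against *all* remaining items of the session rather than just the next one.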
Competitive Models
Two neighborhood-based models
• SKNN: a session-based KNN algorithm, a variant of user-based KNN.
• STAN: an improved version of SKNN that considers the sequential dependency and timeliness of sessions.
Four DNN-based models
• GRU4Rec+: an improved version of GRU4Rec using the top-k gain.
• NARM: an improved version of GRU4Rec+ using an attention mechanism.
• STAMP: an attention-based model for capturing users' interests.
• SR-GNN: a GNN-based model for capturing complex dependencies.
21
Dietmar Jannach et. al., “When Recurrent Neural Networks meet the Neighborhood for Session-Based Recommendation”, RecSys 2017
Diksha Garg et. al., “Sequence and Time Aware Neighborhood for Session-based Recommendations: STAN”, SIGIR 2019
Balázs Hidasi et. al., “Session-based Recommendations with Recurrent Neural Networks”, ICLR 2016
Jing Li et. al., “Neural Attentive Session-based Recommendation”, CIKM 2017
Qiao Liu et. al., “STAMP: Short-Term Attention/Memory Priority Model for Session-based Recommendation”, KDD 2018
Shu Wu et. al., “Session-Based Recommendation with Graph Neural Networks”, AAAI 2019
Accuracy: Ours vs. Competing Models
Our models consistently show competitive performance.
• No existing model shows state-of-the-art performance across all datasets.
22

Dataset  Metric   SKNN    STAN    GRU4Rec+  NARM    STAMP   SR-GNN  SLIS    SLIT    SLIST   Gain(%)
YC-1/64  HR@20    0.6423  0.6838  0.6528    0.6998  0.6841  0.7021  0.7015  0.6968  0.7088  0.95
         R@20     0.4780  0.4955  0.4009    0.5051  0.4904  0.5049  0.5051  0.4976  0.5080  0.57
YC-1/4   HR@20    0.6329  0.6846  0.6940    0.7079  0.7021  0.7118  0.7119  0.7149  0.7175  0.80
         R@20     0.4756  0.4952  0.4887    0.5097  0.5008  0.5095  0.5121  0.5110  0.5130  0.65
DIGI1    HR@20    0.4846  0.5121  0.5097    0.4979  0.4690  0.4904  0.5247  0.4729  0.5291  3.32
         R@20     0.3788  0.3965  0.3957    0.3890  0.3663  0.3811  0.4062  0.3651  0.4091  3.18
YC5      HR@20    0.5996  0.6656  0.6488    0.6751  0.6654  0.6713  0.6786  0.6838  0.6867  1.72
         R@20     0.4658  0.4986  0.4837    0.5109  0.4979  0.5060  0.5074  0.5097  0.5122  0.25
DIGI5    HR@20    0.4748  0.4800  0.4639    0.4188  0.3917  0.4158  0.4939  0.4193  0.5005  4.27
         R@20     0.3715  0.3720  0.3617    0.3254  0.3040  0.3232  0.3867  0.3260  0.3898  4.78
(SKNN, STAN: neighborhood-based; GRU4Rec+, NARM, STAMP, SR-GNN: DNN-based; SLIS, SLIT, SLIST: ours)
Scalability: Ours vs. Competing Models
Training SLIST is much faster than training DNN-based models.
• Computational complexity is mainly proportional to the number of items.
• This is highly desirable in practice, especially for popular online services, where millions of users create billions of session records every day.
23

Training time by the ratio of training sessions on YC-1/4:
Models                     5%       10%      20%      50%       100%
GRU4Rec+                   177.4    317.8    614.0    1600.2    3206.9
NARM                       2137.6   3431.2   7454.2   27804.5   72076.8
STAMP                      434.1    647.5    985.3    2081.2    4083.3
SR-GNN                     6780.5   18014.7  31444.3  68581.9   185862.5
SLIST                      202.7    199.0    199.9    227.0     241.9
Gains (SLIST vs. SR-GNN)   33.5x    90.6x    157.3x   302.1x    768.2x
Ablation Study: Weight Components
SLIST with all components is always better than the others.
• For the three datasets, considering recent items at the inference phase (w_inf) is the most influential factor.
24

Weights  w_inf  w_pos  w_time   YC-1/64          YC-1/4           DIGI1
                                HR      MRR      HR      MRR      HR      MRR
One      -      O      -        0.6764  0.2697   0.6784  0.2723   0.5111  0.1817
One      O      -      -        0.6898  0.2980   0.6947  0.3005   0.5173  0.1852
One      -      -      O        0.6570  0.2651   0.6675  0.2693   0.5055  0.1789
Two      O      O      -        0.7078  0.3077   0.7096  0.3115   0.5253  0.1880
Two      -      O      O        0.6761  0.2697   0.6841  0.2753   0.5145  0.1820
Two      O      -      O        0.6880  0.2973   0.7004  0.3031   0.5202  0.1855
None     -      -      -        0.6592  0.2652   0.6626  0.2671   0.5025  0.1786
All      O      O      O        0.7088  0.3083   0.7175  0.3161   0.5291  0.1886
Conclusion
We propose session-aware linear item-item models to fully capture various characteristics of sessions.
• SLIST: a session-aware linear item similarity/transition model.
Despite its simplicity, SLIST achieves competitive or state-of-the-art performance on various datasets.
SLIST is highly scalable thanks to the closed-form solutions of the linear models.
• The computational complexity of SLIST is proportional to the number of items, which is desirable for commercial systems.
26
Proof of Closed-form Solutions
Session-aware Linear Item Similarity Model (SLIS)
29

L(B) = ‖W_full ⊙ (X − XB)‖_F^2 + λ‖B‖_F^2   s.t. diag(B) ≤ ξ

Introducing Lagrange multipliers μ for the diagonal constraint:

L(B, μ) = ‖W_full ⊙ (X − XB)‖_F^2 + λ‖B‖_F^2 + 2μ⊤ diag(B − ξI)

∂L(B, μ)/∂B = (X⊤ D_full X + λI) B − X⊤ D_full X + diagMat(μ)

Setting the gradient to zero yields:

B = I − λP − P · diagMat(μ)

where D_full = diagMat(w_full^2) and P = (X⊤ D_full X + λI)^(-1).
Proof of Closed-form Solutions
Session-aware Linear Item Transition Model (SLIT)
30

L(B) = ‖W_par ⊙ (T − SB)‖_F^2 + λ‖B‖_F^2

∂L(B)/∂B = (S⊤ D_par S + λI) B − S⊤ D_par T

Setting the gradient to zero yields:

B = P · (S⊤ D_par T)

where D_par = diagMat(w_par^2) and P = (S⊤ D_par S + λI)^(-1).
Proof of Closed-form Solutions
Session-aware Linear Item Similarity/Transition Model (SLIST)
31

L(B) = α‖W_full ⊙ (X − XB)‖_F^2 + (1 − α)‖W_par ⊙ (T − SB)‖_F^2 + λ‖B‖_F^2

∂L(B)/∂B = (α X⊤ D_full X + (1 − α) S⊤ D_par S + λI) B − α X⊤ D_full X − (1 − α) S⊤ D_par T

Setting the gradient to zero yields:

B = I − λP − (1 − α) P S⊤ D_par (S − T)

where D_full = diagMat(w_full^2), D_par = diagMat(w_par^2), and P = (α X⊤ D_full X + (1 − α) S⊤ D_par S + λI)^(-1).
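The rearranged closed form can be checked numerically: substituting α X⊤ D_full X = P⁻¹ − λI − (1 − α) S⊤ D_par S into the direct normal-equation solution recovers I − λP − (1 − α) P S⊤ D_par (S − T). A small numpy check on random toy matrices (sizes arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, mp = 4, 6, 5                       # items, full sessions, partial rows
X = rng.random((m, n))
S = rng.random((mp, n))
T = rng.random((mp, n))
D_full = np.diag(rng.random(m) ** 2)     # diagMat(w_full^2)
D_par = np.diag(rng.random(mp) ** 2)     # diagMat(w_par^2)
alpha, lam = 0.6, 2.0

P = np.linalg.inv(alpha * X.T @ D_full @ X
                  + (1 - alpha) * S.T @ D_par @ S + lam * np.eye(n))

# direct solution of the normal equations: B = P (alpha X^T D_full X + (1-alpha) S^T D_par T)
B_direct = P @ (alpha * X.T @ D_full @ X + (1 - alpha) * S.T @ D_par @ T)
# rearranged form from the slide: B = I - lam P - (1-alpha) P S^T D_par (S - T)
B_slide = np.eye(n) - lam * P - (1 - alpha) * P @ S.T @ D_par @ (S - T)
```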

Editor's Notes

  1. Let me first introduce the motivation for this work.