Modeling Implicit Signals and Dynamic Preferences for Sequential Recommendation
1. Ho-Beom Kim
Network Science Lab
Dept. of Mathematics
The Catholic University of Korea
E-mail: hobeom2001@catholic.ac.kr
2023 / 07 / 05
WANG, Xiang, et al.
2019, ACM SIGIR
2. 2
Introduction
Problem Statements
•Existing works
• adopt human-designed rules or attention mechanisms to assign time-decaying weights to
historically interacted items.
• leverage recurrent neural networks to summarize the behavioral sequences, but they suffer from
the short-term bottleneck in capturing users' dynamic interests due to the difficulty of modeling
long-range dependencies.
•Recent Solutions
• jointly model long-term and short-term interests to avoid forgetting long-term interests
•Two major challenges in sequential recommendation that have not been well addressed
• User behaviors in long sequences reflect implicit and noisy preference signals
• Implicit feedback records can serve as noise in the user's behavior history, worsening the
modeling of their real interests.
• User preferences are always drifting over time due to their diversity
• It is still challenging to model how they change over the history and to estimate the activated
preferences at the current time, which is the core of recommendation models.
3. 3
Introduction
Contributions
•They approach sequential recommendation from a new perspective by taking into consideration the
implicit-signal behaviors and fast-changing preferences.
•They propose to aggregate implicit signals into explicit ones from user behaviors by designing graph
neural network-based models on constructed item-item interest graphs.
•They design dynamic pooling for filtering and reserving activated core preferences for recommendation.
•They conduct extensive experiments on two large-scale datasets collected from real-world applications.
•The experimental results show significant performance improvements compared with the state-of-the-art
methods of sequential recommendation.
•Studies also verify that their method can model long behavioral sequences effectively and efficiently.
5. 5
Method
Interest Graph Construction
•They convert loose item sequences into tight item-item interest graphs.
•Raw graph construction
• This model attempts to construct an undirected graph for each interaction sequence
• $\mathcal{G} = \{\mathcal{V}, \mathcal{E}, A\}$
• $\mathcal{V}$ : the set of nodes, each an interacted item
• $\mathcal{E}$ : the set of graph edges to learn
• $A$ : the corresponding adjacency matrix
•They aim to learn the adjacency matrix A
• The core interest node has a higher degree than the peripheral interest nodes because it connects more
similar interests, and a higher frequency of similar interests results in a denser and larger subgraph.
• A priori, neighbor nodes are similar, and dense subgraphs are the core interests of users.
6. 6
Method
Node similarity metric learning
•$M_{ij} = \cos(\vec{w} \odot \vec{h}_i, \vec{w} \odot \vec{h}_j)$
•$\vec{w}$ : trainable weight vector
•$\vec{h}_i$ : item embeddings
•$\phi$ weight vectors to compute $\phi$ independent similarity matrices
•$M_{ij}^{\delta} = \cos(\vec{w}_{\delta} \odot \vec{h}_i, \vec{w}_{\delta} \odot \vec{h}_j), \quad M_{ij} = \frac{1}{\phi} \sum_{\delta=1}^{\phi} M_{ij}^{\delta}$
•$M_{ij}^{\delta}$ : the similarity metric for the $\delta$-th head
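To make this concrete, below is a minimal PyTorch sketch of the multi-head weighted cosine similarity; the class name MetricLearning and the num_heads default are illustrative assumptions, not names from the paper.
```python
# Sketch of node similarity metric learning, assuming PyTorch.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MetricLearning(nn.Module):
    """Multi-head weighted cosine similarity between item embeddings."""
    def __init__(self, dim: int, num_heads: int = 4):
        super().__init__()
        # One trainable weight vector w_delta per head (phi heads in total).
        self.weights = nn.Parameter(torch.ones(num_heads, dim))

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        # h: (n, dim) embeddings of the items in one interaction sequence.
        hw = h.unsqueeze(0) * self.weights.unsqueeze(1)  # (heads, n, dim)
        hw = F.normalize(hw, dim=-1)            # cosine = dot of unit vectors
        m = torch.bmm(hw, hw.transpose(1, 2))   # (heads, n, n) similarities
        return m.mean(dim=0)                    # average heads -> M, (n, n)
```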
7. 7
Method
Graph sparsification via ε-sparseness
•They extract a symmetric sparse non-negative adjacency matrix A from M by considering only the
node pairs with the most vital connections.
•They adopt a relative ranking strategy over the entire graph:
•$A_{ij} = \begin{cases} 1, & M_{ij} \ge \mathrm{Rank}_{\varepsilon n^2}(M) \\ 0, & \text{otherwise} \end{cases}$
•$\mathrm{Rank}_{\varepsilon n^2}(M)$ : returns the value of the $\varepsilon n^2$-th largest entry in the metric matrix M
•$n$ : the number of nodes
•$\varepsilon$ : controls the overall sparsity of the generated graph
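A small sketch of the ε-sparsification step follows; the function name sparsify is illustrative, and ties at the threshold are assumed to be kept.
```python
# Sketch of graph sparsification via epsilon-sparseness.
import torch

def sparsify(m: torch.Tensor, eps: float = 0.5) -> torch.Tensor:
    # m: (n, n) metric matrix M; eps controls the overall sparsity.
    n = m.size(0)
    k = max(1, int(eps * n * n))      # only the top eps * n^2 entries survive
    threshold = torch.topk(m.flatten(), k).values[-1]   # Rank_{eps n^2}(M)
    return (m >= threshold).float()   # binary adjacency matrix A
```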
8. 8
Method
Interest-fusion Graph Convolutional Layer
They have learnable interest graphs that separate diverse interests.
The core interests and peripheral interests form large clusters and small clusters respectively, and different
types of interests form different clusters. To fuse weak signals into strong ones that accurately reflect
user preferences, they need to aggregate information in the constructed graph.
9. 9
Method
Interest fusion via graph attentive convolution
• $\vec{h}_i$ : the input node embedding ($I$ is the number of nodes)
• $\vec{h}_i'$ : the new node embedding after the layer
• $E_{ij}$ : an alignment score mapping the importance of target node $i$ onto its neighbor node $j$
• $\vec{h}_i' = \sigma(W_a \cdot \mathrm{Aggregate}(E_{ij} * \vec{h}_j \mid j \in \mathcal{N}_i) + \vec{h}_i)$
• Multi-head form: $\vec{h}_i' = \sigma(W_a^{\delta} \cdot \mathrm{Aggregate}(E_{ij}^{\delta} * \vec{h}_j \mid j \in \mathcal{N}_i) + \vec{h}_i)$
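Below is a minimal single-head sketch of the fusion equation above; using a neighbor sum as Aggregate and a sigmoid as σ are assumptions made for illustration.
```python
# Sketch of interest fusion via graph attentive convolution (one head).
import torch
import torch.nn as nn

class InterestFusion(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.w_a = nn.Linear(dim, dim, bias=False)  # W_a

    def forward(self, h: torch.Tensor, e: torch.Tensor) -> torch.Tensor:
        # h: (n, dim) node embeddings; e: (n, n) attention weights E_ij,
        # assumed already masked by A and softmax-normalized per row.
        agg = e @ h                               # sum_j E_ij * h_j over N_i
        return torch.sigmoid(self.w_a(agg) + h)   # residual connection + sigma
```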
10. 10
Method
Cluster- and query-aware attention
• $\alpha_i = \mathrm{Attention}_c(W_c \vec{h}_i \parallel \vec{h}_{i_c} \parallel W_c \vec{h}_i \odot \vec{h}_{i_c})$
• $\mathrm{Attention}_c$ : a two-layer feedforward neural network with LeakyReLU as the activation function
• $\beta_j = \mathrm{Attention}_q(W_q \vec{h}_j \parallel \vec{h}_t \parallel W_q \vec{h}_j \odot \vec{h}_t)$
• $\mathrm{Attention}_q$ : a two-layer feedforward neural network applying the LeakyReLU nonlinearity
• $E_{ij} = \mathrm{softmax}_j(\alpha_i + \beta_j) = \frac{\exp(\alpha_i + \beta_j)}{\sum_{k \in \mathcal{N}_i} \exp(\alpha_i + \beta_k)}$
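The two scores can be wired together as in the sketch below; how the cluster embedding $\vec{h}_{i_c}$ is built (e.g. averaging a node's neighborhood) is left outside the sketch, and all layer names are assumptions.
```python
# Sketch of cluster- and query-aware attention producing E_ij.
import torch
import torch.nn as nn

class ClusterQueryAttention(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.w_c = nn.Linear(dim, dim, bias=False)  # W_c
        self.w_q = nn.Linear(dim, dim, bias=False)  # W_q
        # Two-layer feedforward nets with LeakyReLU, yielding scalar scores.
        self.att_c = nn.Sequential(nn.Linear(3 * dim, dim), nn.LeakyReLU(),
                                   nn.Linear(dim, 1))
        self.att_q = nn.Sequential(nn.Linear(3 * dim, dim), nn.LeakyReLU(),
                                   nn.Linear(dim, 1))

    def forward(self, h, h_c, h_t, adj):
        # h: (n, d) nodes; h_c: (n, d) cluster embedding per node;
        # h_t: (d,) target/query item; adj: (n, n) binary adjacency
        # (assumed to include self-loops so every row has a neighbor).
        wh = self.w_c(h)
        alpha = self.att_c(torch.cat([wh, h_c, wh * h_c], dim=-1))  # (n, 1)
        wq = self.w_q(h)
        ht = h_t.expand_as(wq)
        beta = self.att_q(torch.cat([wq, ht, wq * ht], dim=-1))     # (n, 1)
        scores = alpha + beta.transpose(0, 1)     # [i, j] = alpha_i + beta_j
        scores = scores.masked_fill(adj == 0, float('-inf'))
        return torch.softmax(scores, dim=-1)      # E_ij over j in N_i
```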
11. 11
Method
Interest-extraction Graph Pooling Layer
The fusion of implicit interest signals into explicit interest signals is completed by performing information
aggregation on the interest graph.
They use graph pooling to further extract the fused information.
The aim is to downsize the graph: loose interests are transformed into tight interests while their
distribution is maintained.
12. 12
Method
Interest extraction via graph pooling
• By applying a soft cluster assignment matrix S, node information can be pooled into cluster information
• $\{\vec{h}_1^*, \vec{h}_2^*, \ldots, \vec{h}_m^*\} = S^{\top} \{\vec{h}_1', \vec{h}_2', \ldots, \vec{h}_n'\}$
• $\{\gamma_1^*, \gamma_2^*, \ldots, \gamma_m^*\} = S^{\top} \{\gamma_1, \gamma_2, \ldots, \gamma_n\}$
• $\gamma_i$ : obtained by applying softmax on $\beta_i$
• $S_{i:} = \mathrm{softmax}(W_p \cdot \mathrm{Aggregate}(A_{ij} * \vec{h}_j' \mid j \in \mathcal{N}_i))$
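A sketch of how the assignment matrix and the two pooling equations fit together; the module name and the reuse of the $\beta_i$ scores from the attention layer are assumptions for illustration.
```python
# Sketch of interest extraction via soft graph pooling.
import torch
import torch.nn as nn

class InterestPooling(nn.Module):
    def __init__(self, dim: int, num_clusters: int):
        super().__init__()
        self.w_p = nn.Linear(dim, num_clusters, bias=False)  # W_p

    def forward(self, h, a, beta):
        # h: (n, d) fused node embeddings h'; a: (n, n) adjacency matrix A;
        # beta: (n, 1) query-aware scores beta_i from the attention layer.
        agg = a @ h                               # Aggregate(A_ij * h_j')
        s = torch.softmax(self.w_p(agg), dim=-1)  # S: (n, m) soft assignment
        gamma = torch.softmax(beta, dim=0)        # gamma_i = softmax(beta_i)
        h_star = s.transpose(0, 1) @ h            # (m, d) cluster embeddings
        gamma_star = s.transpose(0, 1) @ gamma    # (m, 1) cluster scores
        return h_star, gamma_star
```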