A Topology-aware
Analysis of Graph
Collaborative Filtering
Daniele Malitesta, PhD
SisInfLab Research Group
Politecnico di Bari (Bari, Italy)
The University of Glasgow (Glasgow, UK)
50 minutes
February 05, 2024 (15.00 UK Time)
IR Talks
TABLE OF CONTENTS
01 INTRODUCTION & BACKGROUND
02 MOTIVATIONS
03 TOPOLOGICAL CHARACTERISTICS
04 TOPOLOGY IN GRAPH CF
05 PROPOSED ANALYSIS
06 RESULTS
07 TAKE-HOME MESSAGES
USEFUL RESOURCES
The content of the following slides is taken from:
β–  Daniele Malitesta, Claudio Pomo, Vito Walter Anelli, Alberto Carlo
Maria Mancino, Eugenio Di Sciascio, Tommaso Di Noia:
A Topology-aware Analysis of Graph Collaborative Filtering. CoRR
abs/2308.10778 (2023)
β–  Daniele Malitesta, Claudio Pomo, Tommaso Di Noia: Graph Neural
Networks for Recommendation: Reproducibility, Graph Topology,
and Node Representation. CoRR abs/2310.11270 (2023)
Scan me!
INTRODUCTION &
BACKGROUND
01
A RISING INTEREST
[Bar chart: number of papers per year, 2020-2023, on a 0-600 scale.]
As searched on DBLP through the keywords "graph" and "recommend".
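To reproduce such a count, a minimal sketch against the public DBLP search API is given below; the endpoint, query parameters, and JSON field names are assumptions about the API's current behaviour and may change, and the per-year breakdown in the chart would additionally filter hits by year.

```python
import requests

def dblp_total_hits(query: str) -> int:
    """Return the total number of DBLP publications matching `query`."""
    resp = requests.get(
        "https://dblp.org/search/publ/api",
        params={"q": query, "format": "json", "h": 0},  # h=0: fetch only the hit count
        timeout=30,
    )
    resp.raise_for_status()
    return int(resp.json()["result"]["hits"]["@total"])

print(dblp_total_hits("graph recommend"))
```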
PREVIOUS TUTORIALS ON GNNs-RECSYS
A NON-EXHAUSTIVE TIMELINE
■ 2017-2018: Pioneer approaches proposing GCN-based aggregation methods [van den Berg et al., Ying et al.]
■ 2019: Explore inter-dependencies between nodes and their neighbours [Wang et al. (2019a)]; use graph attention networks to recognize meaningful user-item interactions at a higher-grained level [Wang et al. (2019b)]
■ 2020: Lighten the graph convolutional layer [Chen et al., He et al.]; use graph attention networks to recognize meaningful user-item interactions at a higher-grained level [Wang et al. (2020b), Tao et al.]
■ 2021-2022: Self-supervised and contrastive learning [Wu et al., Yu et al.]
■ 2021-2022: Simplify the message-passing formulation [Mao et al., Peng et al., Shen et al.]
■ 2021-2022: Explore other latent spaces [Shen et al., Sun et al., Zhang et al.]
■ 2022: Exploit hypergraphs [Wei et al., Xia et al.]
■ 2022: Use graph attention networks to recognize meaningful user-item interactions at a finer-grained level [Zhang et al.]
USER-ITEM GRAPH (BIPARTITE, UNDIRECTED)

user-item interaction matrix R (users u1-u4 on the rows, items i1-i6 on the columns):

        i1 i2 i3 i4 i5 i6
   u1    1  0  1  1  1  0
   u2    0  0  0  0  1  0
   u3    1  1  1  0  0  0
   u4    0  0  1  1  0  1

adjacency matrix A of the bipartite, undirected user-item graph (rows and columns ordered as u1-u4, i1-i6); the user-user and item-item blocks are zero, and A is symmetric:

   A = [[0, R],
        [R^T, 0]]

        u1 u2 u3 u4 i1 i2 i3 i4 i5 i6
   u1    0  0  0  0  1  0  1  1  1  0
   u2    0  0  0  0  0  0  0  0  1  0
   u3    0  0  0  0  1  1  1  0  0  0
   u4    0  0  0  0  0  0  1  1  0  1
   i1    1  0  1  0  0  0  0  0  0  0
   i2    0  0  1  0  0  0  0  0  0  0
   i3    1  0  1  1  0  0  0  0  0  0
   i4    1  0  0  1  0  0  0  0  0  0
   i5    1  1  0  0  0  0  0  0  0  0
   i6    0  0  0  1  0  0  0  0  0  0
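As a quick illustration, the sketch below builds the bipartite, undirected adjacency matrix A = [[0, R], [R^T, 0]] from the interaction matrix shown above (Python/NumPy; the matrix values are those of the toy example).

```python
import numpy as np

# Toy user-item interaction matrix R (4 users x 6 items) from the slide.
R = np.array([
    [1, 0, 1, 1, 1, 0],   # u1
    [0, 0, 0, 0, 1, 0],   # u2
    [1, 1, 1, 0, 0, 0],   # u3
    [0, 0, 1, 1, 0, 1],   # u4
])
n_users, n_items = R.shape

# Bipartite adjacency: user-user and item-item blocks are zero.
A = np.block([
    [np.zeros((n_users, n_users)), R],
    [R.T, np.zeros((n_items, n_items))],
])
assert (A == A.T).all()   # undirected graph: the adjacency matrix is symmetric
print(A.astype(int))
```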
MESSAGE-PASSING & LATENT-FACTORS

MESSAGE-PASSING (LAYER 1)
[Equation and figure omitted: at the first layer, each user (item) embedding is updated by aggregating the embeddings of its neighbours in the user-item graph, as selected through the adjacency matrix A.]

FACTORIZATION & MESSAGE-PASSING
[Equation and figure omitted: the predicted user-item interaction is the inner product of the user and item embeddings, so graph CF combines a message-passing component (neighbourhood aggregation) with a latent-factors component (embedding inner product) that approximates the user-item interaction matrix R.]
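The following is a minimal sketch of this idea, assuming a LightGCN-style propagation layer with symmetric degree normalization (the slide's exact layer formulation is not recoverable here): one round of message passing over the bipartite adjacency matrix, followed by an inner-product (latent-factors) score.

```python
import numpy as np

# Interaction matrix R (4 users x 6 items) and bipartite adjacency A, as above.
R = np.array([[1, 0, 1, 1, 1, 0],
              [0, 0, 0, 0, 1, 0],
              [1, 1, 1, 0, 0, 0],
              [0, 0, 1, 1, 0, 1]], dtype=float)
n_users, n_items = R.shape
A = np.block([[np.zeros((n_users, n_users)), R],
              [R.T, np.zeros((n_items, n_items))]])

rng = np.random.default_rng(0)
E0 = rng.normal(size=(n_users + n_items, 8))        # layer-0 user/item embeddings

# Symmetrically normalized adjacency: D^{-1/2} A D^{-1/2}
deg = A.sum(axis=1)
d_inv_sqrt = np.where(deg > 0, deg ** -0.5, 0.0)
A_norm = d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]

E1 = A_norm @ E0                                    # one message-passing layer

u, i = 0, n_users + 1                               # node indices of u1 and i2
score = E1[u] @ E1[i]                               # latent-factors (inner product) score
print(score)
```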
MOTIVATIONS
02
WHY DOES GRAPH CF WORK WELL?

Topologically
β€’ The machine learning literature acknowledges that graph topology plays a crucial role in GNNs [Wei et al., Castellana and Errica].
β€’ For instance, some works revisit the network topology to enhance the learning ability of GCNs [Shi et al.].

Algorithmically
β€’ Established approaches such as LightGCN and DGCF adapt the GCN layer to suit the CF rationale.
β€’ Recent graph-based RSs, including UltraGCN and SVD-GCN, go beyond the concept of message-passing.
Classical dataset characteristics such as dataset sparsity may impact the performance of recommendation models in diverse scenarios and tasks [Adomavicius and Zhang, Deldjoo et al.].
The topological nature of the user-
item data and the new graph CF wave
make it imperative to unveil the
dependencies between topological
properties and recommendation
performance.
OUR CONTRIBUTIONS

1) First analysis in the literature that studies the impact of classical and topological dataset characteristics on the performance of SOTA graph CF.
2) Select 4 graph CF approaches, which are recent and across-the-board in the literature.
3) Build an explanatory model to calculate the linear dependences among characteristics and recommendation metrics.
4) Validate the explanations on their statistical significance, with varying settings of graph samplings.
TOPOLOGICAL
CHARACTERISTICS
03
NODE DEGREE EXTENSION (1/3)
DEFINITION
Let $N_u^{(l)}$ and $N_i^{(l)}$ be the sets of neighborhood nodes for user $u$ and item $i$ at $l$ distance hops. The extended definition of node degrees for $u$ and $i$ is:

$\sigma_u = |N_u^{(1)}|, \quad \sigma_i = |N_i^{(1)}|.$

The average user and item node degrees on the whole user and item sets ($U$ and $I$) are:

$\sigma_U = \frac{1}{|U|} \sum_{u \in U} |N_u^{(1)}|, \quad \sigma_I = \frac{1}{|I|} \sum_{i \in I} |N_i^{(1)}|.$
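A minimal sketch of these quantities on the toy interaction matrix from before (rows = users, columns = items):

```python
import numpy as np

# Interaction matrix R: 1 marks an observed user-item interaction.
R = np.array([[1, 0, 1, 1, 1, 0],
              [0, 0, 0, 0, 1, 0],
              [1, 1, 1, 0, 0, 0],
              [0, 0, 1, 1, 0, 1]])

sigma_u = R.sum(axis=1)      # per-user degree |N_u^(1)|
sigma_i = R.sum(axis=0)      # per-item degree |N_i^(1)|
sigma_U = sigma_u.mean()     # average user degree
sigma_I = sigma_i.mean()     # average item degree
print(sigma_u, sigma_i, sigma_U, sigma_I)
```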
NODE DEGREE EXTENSION (2/3, 3/3)
RECSYS RE-INTERPRETATION
The node degree in the user-item graph stands for the number of items (users) a user (item) has interacted with. This is related to the cold-start issue in recommendation, where cold users denote low activity on the platform, while cold items are niche products.

[Example graph with users u1-u7 and items i1-i8:]
$\sigma_{u_2} = 5$ (active user), $\sigma_{u_7} = 1$ (non-active user)
$\sigma_{i_7} = 4$ (popular item), $\sigma_{i_4} = 1$ (niche item)
CLUSTERING COEFFICIENT (1/5)
DEFINITION
Let $v$ and $w$ be two nodes from the same partition (e.g., user nodes). Their similarity is the intersection over union of their neighborhoods. Evaluating the metric node-wise:

$\gamma_v = \frac{\sum_{w \in N_v^{(2)}} \gamma_{v,w}}{|N_v^{(2)}|} \quad \text{with} \quad \gamma_{v,w} = \frac{|N_v^{(1)} \cap N_w^{(1)}|}{|N_v^{(1)} \cup N_w^{(1)}|},$

where $N_v^{(2)}$ is the second-order neighborhood set of $v$. The average clustering coefficient [Latapy et al.] on $U$ and $I$ is:

$\gamma_U = \frac{1}{|U|} \sum_{u \in U} \gamma_u, \quad \gamma_I = \frac{1}{|I|} \sum_{i \in I} \gamma_i.$
CLUSTERING COEFFICIENT (2/5-5/5)
RECSYS RE-INTERPRETATION
High values of the clustering coefficient indicate that there exists a substantial number of co-occurrences among nodes of the same partition. For instance, user-wise, the value increases if several users share most of their interacted items. This intuition aligns with the rationale of collaborative filtering: two users are likely to share similar preferences if they interact with the same items.

[Worked examples on a graph with users u1-u7 and items i1-i8:]
$\gamma_{u_3,u_5} = \frac{|\{i_5,i_7\} \cap \{i_5,i_6,i_7\}|}{|\{i_5,i_7\} \cup \{i_5,i_6,i_7\}|} = \frac{2}{3} = 0.667$
$\gamma_{u_6,u_7} = \frac{|\{i_7,i_8\} \cap \{i_8\}|}{|\{i_7,i_8\} \cup \{i_8\}|} = \frac{1}{2} = 0.500$
$\gamma_{u_2,u_4} = \frac{|\{i_1,i_2,i_3,i_4,i_5\} \cap \{i_3,i_7\}|}{|\{i_1,i_2,i_3,i_4,i_5\} \cup \{i_3,i_7\}|} = \frac{1}{6} = 0.167$
$\gamma_{u_1,u_5} = \frac{|\{i_1,i_2\} \cap \{i_5,i_6,i_7\}|}{|\{i_1,i_2\} \cup \{i_5,i_6,i_7\}|} = 0.000$
DEGREE ASSORTATIVITY (1/3)
DEFINITION
Let $D = \{d_1, d_2, \dots\}$ be the set of unique node degrees in the graph, and let $e_{d_h, d_k}$ be the fraction of edges connecting nodes with degrees $d_h$ and $d_k$. Then, let $q_{d_h}$ be the probability of choosing a node with degree $d_h$ after having selected a node with the same degree (i.e., the excess degree distribution). The degree assortativity coefficient [M. E. J. Newman] is:

$\rho = \frac{\sum_{d_h, d_k} d_h d_k \left( e_{d_h, d_k} - q_{d_h} q_{d_k} \right)}{\mathrm{std}_q^2},$

where $\mathrm{std}_q$ is the standard deviation of the distribution $q$.
DEGREE ASSORTATIVITY (2/3, 3/3)
RECSYS RE-INTERPRETATION
The degree assortativity calculated user- and item-wise is a proxy for the tendency of users with the same activity level on the platform, and of items with the same popularity, to gather together. To visualize the degree assortativity, we need the projected user-user and item-item graphs.

[Example: user-item graph with users u1-u7 and items i1-i8, and its user-user projected graph.]
$\rho = -0.191$ (high degree dis-assortativity)
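A possible way to compute this in practice is sketched below with networkx, projecting a toy bipartite user-item graph onto the user partition and measuring the degree assortativity of the projection; the edge list is illustrative and is not the slide's exact example graph.

```python
import networkx as nx

# Toy bipartite user-item graph (illustrative edge list).
edges = [("u1", "i1"), ("u1", "i2"), ("u2", "i1"), ("u2", "i3"),
         ("u3", "i3"), ("u3", "i4"), ("u4", "i4")]
G = nx.Graph(edges)
users = [n for n in G if n.startswith("u")]

# User-user projection: two users are linked if they share at least one item.
G_users = nx.algorithms.bipartite.projected_graph(G, users)
rho = nx.degree_assortativity_coefficient(G_users)
print(rho)
```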
CLASSICAL DATASET CHARACTERISTICS

Space size: estimates the number of all possible interactions:
$\zeta = |U| \, |I|.$

Shape: defines the ratio between the number of users and items:
$\pi = \frac{|U|}{|I|}.$

Density: measures the ratio of actual user-item interactions to all possible interactions:
$\delta = \frac{|E|}{|U| \, |I|}.$

Gini coefficient: calculates the interactions' concentration for users (items):
$\kappa_U = \frac{\sum_{u=1}^{|U|-1} \sum_{v=u+1}^{|U|} |\sigma_u - \sigma_v|}{|U| \sum_{u=1}^{|U|} \sigma_u}.$
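A compact sketch computing the four characteristics on the toy interaction matrix; the Gini coefficient follows the slide's formula (other conventions normalize the sum differently).

```python
import numpy as np

R = np.array([[1, 0, 1, 1, 1, 0],
              [0, 0, 0, 0, 1, 0],
              [1, 1, 1, 0, 0, 0],
              [0, 0, 1, 1, 0, 1]])
n_users, n_items = R.shape
n_inter = int(R.sum())
sigma_u = R.sum(axis=1)

space_size = n_users * n_items                 # zeta = |U||I|
shape = n_users / n_items                      # pi = |U| / |I|
density = n_inter / (n_users * n_items)        # delta = |E| / (|U||I|)
# Gini coefficient over the user degrees (sum over unordered user pairs).
gini_users = (np.abs(sigma_u[:, None] - sigma_u[None, :]).sum() / 2) / (n_users * sigma_u.sum())
print(space_size, shape, density, gini_users)
```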
TOPOLOGY IN
GRAPH CF
04
SELECTED GRAPH CF APPROACHES

LightGCN (SIGIR 2020): node degree used to normalize the adjacency matrix in the message-passing.

DGCF (SIGIR 2020): node degree used to normalize the adjacency matrix in the message-passing.

UltraGCN (CIKM 2021): node degree used for normalization in the infinite-layer message-passing. The model also learns from the item-projected graph.

SVD-GCN (CIKM 2022): node embeddings involve the largest singular values of the normalized user-item interaction matrix, whose maximum value is related to the maximum node degree of the user-item graph. The model learns from the user- and item-projected graphs.
The selected graph CF approaches
explicitly utilize the node degree in
the representation learning. However,
clustering coefficient and degree
assortativity do not have an evident
representation in the formulations.
β€’ Which topological aspects can graph-based models (un)intentionally capture?
β€’ Do (topological) dataset characteristics influence the recommendation performance of graph CF models?
05
PROPOSED
ANALYSIS
Our goal is to understand whether there exist dependencies between (topological) dataset characteristics and the performance of graph-based recommendation models. To this end, we build and fit an explanatory framework.
EXPLANATORY FRAMEWORK: STEPS

STEP 1: Select models
STEP 2: Select characteristics
STEP 3: Select datasets
STEP 4: Generate sub-datasets
STEP 5: Get datasets' statistics
STEP 6: Train models
STEP 7: Evaluate models
STEP 8: Generate explanations
DATASETS

              # Users   # Items   # Interactions
Yelp-2018      25,677    25,815        696,865
Amazon-Book    70,679    24,915        846,434
Gowalla        29,858    40,981      1,027,370
SUB-DATASETS GENERATION

NODE DROPOUT
[Illustration: a random subset of user and item nodes is removed from the user-item graph, together with all of their incident edges; the retained nodes and edges form the sub-dataset.]

EDGE DROPOUT
[Illustration: a random subset of edges is removed from the user-item graph, while the nodes are kept; the retained edges form the sub-dataset.]
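A minimal sketch of the two sampling strategies; the paper's actual procedure (e.g., fixed retention ratios rather than independent Bernoulli draws) may differ.

```python
import numpy as np

def edge_dropout(R: np.ndarray, keep: float, rng: np.random.Generator) -> np.ndarray:
    """Keep each observed interaction (edge) independently with probability `keep`."""
    mask = rng.random(R.shape) < keep
    return R * mask

def node_dropout(R: np.ndarray, keep: float, rng: np.random.Generator) -> np.ndarray:
    """Keep a random subset of users and items; all edges of removed nodes are dropped."""
    users = rng.random(R.shape[0]) < keep
    items = rng.random(R.shape[1]) < keep
    return R[users][:, items]

rng = np.random.default_rng(42)
R = np.array([[1, 0, 1, 1, 1, 0],
              [0, 0, 0, 0, 1, 0],
              [1, 1, 1, 0, 0, 0],
              [0, 0, 1, 1, 0, 1]])
print(node_dropout(R, keep=0.7, rng=rng))
print(edge_dropout(R, keep=0.7, rng=rng))
```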
STATISTICS CALCULATION
[Tables omitted: the selected classical and topological characteristics computed on each generated sub-dataset.]
GENERATE (LINEAR) EXPLANATIONS

We relate the dataset characteristics (inputs) to the recommendation performance (output). We aim to fit the following multivariate linear regression model:

$\boldsymbol{y} = \boldsymbol{\epsilon} + \theta_0 + \theta_c \boldsymbol{X}_c,$

where $\boldsymbol{y}$ is the predicted performance, $\boldsymbol{\epsilon}$ is the prediction error, $\theta_0$ is the global bias, $\theta_c$ is the weight of characteristic $c$, and $\boldsymbol{X}_c$ collects characteristic $c$ for all samples.

We use the ordinary least squares (OLS) optimization model:

$\theta_0^*, \boldsymbol{\theta}_c^* = \min_{\theta_0, \theta_c} \frac{1}{2} \left\lVert \boldsymbol{y} - \theta_0 - \boldsymbol{\theta}_c \boldsymbol{X}_c \right\rVert_2^2.$
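A minimal sketch of fitting this model with ordinary least squares on toy data (NumPy's lstsq, with the intercept obtained by prepending a column of ones), together with the R^2 used later to judge the quality of the fit:

```python
import numpy as np

# X: one row per sub-dataset, one column per characteristic; y: measured performance.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 5))                        # toy data: 60 sub-datasets, 5 characteristics
y = 0.1 + X @ np.array([0.3, -0.2, 0.05, 0.0, 0.15]) + rng.normal(scale=0.01, size=60)

X1 = np.column_stack([np.ones(len(X)), X])          # prepend a column of ones for theta_0
theta, *_ = np.linalg.lstsq(X1, y, rcond=None)      # OLS solution [theta_0, theta_1, ..., theta_C]

y_hat = X1 @ theta
r2 = 1 - ((y - y_hat) ** 2).sum() / ((y - y.mean()) ** 2).sum()
print(theta, r2)
```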
RESULTS
06
IMPACT OF CHARACTERISTICS
Characteristics' weights $[\theta_0, \theta_1, \dots, \theta_C]$
IMPACT OF CHARACTERISTICS
How good is the linear regression at making predictions?
$R^2$ is, for many settings, above 95%.
The regression model can provide good explanations.
IMPACT OF CHARACTERISTICS (classical characteristics)
[Table of regression weights; annotations: inverse, direct.]
Graph CF = neighborhood + latent factors. Graph CF behaves like factorization-based models.

IMPACT OF CHARACTERISTICS (topological characteristics)
[Table of regression weights; annotation: direct.]
This confirms what we observed in the models' formulations. A higher number of user interactions corresponds to better performance.

[Annotation: strong direct.] Values for the user and item sides are more flattened for UltraGCN and SVD-GCN.

[Annotations: direct, inverse.] LightGCN and DGCF have larger coefficients than SVD-GCN, because they explicitly propagate messages.

[Annotations: direct, inverse.] UltraGCN has bigger coefficients, probably because it uses the infinite-layer message passing.
Probability distribution of node degrees on Gowalla.
In the worst-case scenario, node-dropout drops many high-degree nodes; edge-dropout drops all the edges connected to several nodes and thus disconnects them from the graph.
Does this undermine the validity of the proposed explanatory framework?
INFLUENCE OF NODE- AND EDGE-DROPOUT

[Setting A] 100% node-dropout
[Setting B] 70% node-dropout, 30% edge-dropout
[Setting C] 30% node-dropout, 70% edge-dropout
[Setting D] 100% edge-dropout
INFLUENCE OF NODE- AND EDGE-DROPOUT

Datasets' statistics: node-dropout retains smaller portions of the dataset than edge-dropout.
Meaningful learned dependencies (this setting is like ours).
The dependencies are not statistically significant at the extreme settings.
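Checking the statistical significance of the learned weights can be done, for instance, with statsmodels; the sketch below runs on toy data and only illustrates how per-coefficient p-values would be obtained, not the paper's exact validation procedure.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
X = rng.normal(size=(60, 5))                       # toy characteristics
y = 0.1 + 0.3 * X[:, 0] + rng.normal(scale=0.01, size=60)

model = sm.OLS(y, sm.add_constant(X)).fit()
print(model.params)                                # [theta_0, theta_1, ..., theta_C]
print(model.pvalues)                               # per-coefficient significance
print(model.rsquared)
```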
TAKE-HOME
MESSAGES
07
WHAT WE HAVE LEARNED

01 LATENT-FACTORS vs. NEIGHBORHOOD
02 NODE DEGREE
03 CLUSTERING COEFFICIENT
04 DEGREE ASSORTATIVITY
05 NODE- vs. EDGE-DROPOUT
REFERENCES (1/3)
[Malitesta et al.] A Topology-aware Analysis of Graph Collaborative Filtering. CoRR abs/2308.10778
(2023)
[Malitesta et al.] Graph Neural Networks for Recommendation: Reproducibility, Graph Topology, and
Node Representation. CoRR abs/2310.11270 (2023)
[Wang et al. (2020a)] Learning and Reasoning on Graph for Recommendation. WSDM 2020: 890-893
[El-Kishky et al.] Graph-based Representation Learning for Web-scale Recommender Systems. KDD
2022: 4784-4785
[Gao et al.] Graph Neural Networks for Recommender System. WSDM 2022: 1623-1625
[Purificato et al. (2023a)] Tutorial on User Profiling with Graph Neural Networks and Related Beyond-
Accuracy Perspectives. UMAP 2023: 309-312
[Purificato et al. (2023b)] Leveraging Graph Neural Networks for User Profiling: Recent Advances and
Open Challenges. CIKM 2023: 5216-5219
[van den Berg et al.] Graph Convolutional Matrix Completion. CoRR abs/1706.02263 (2017)
[Ying et al.] Graph Convolutional Neural Networks for Web-Scale Recommender Systems. KDD 2018:
974-983
[Wang et al. (2019a)] Neural Graph Collaborative Filtering. SIGIR 2019: 165-174
[Wang et al. (2019b)] KGAT: Knowledge Graph Attention Network for Recommendation. KDD 2019:
950-958
REFERENCES (2/3)
[Chen et al.] Revisiting Graph Based Collaborative Filtering: A Linear Residual Graph Convolutional
Network Approach. AAAI 2020: 27-34
[He et al.] LightGCN: Simplifying and Powering Graph Convolution Network for Recommendation. SIGIR 2020: 639-648
[Wang et al. (2020b)] Disentangled Graph Collaborative Filtering. SIGIR 2020: 1001-1010
[Tao et al.] MGAT: Multimodal Graph Attention Network for Recommendation. Inf. Process. Manag. 57(5):
102277 (2020)
[Wu et al.] Self-supervised Graph Learning for Recommendation. SIGIR 2021: 726-735
[Yu et al.] Are Graph Augmentations Necessary?: Simple Graph Contrastive Learning for
Recommendation. SIGIR 2022: 1294-1303
[Mao et al.] UltraGCN: Ultra Simplification of Graph Convolutional Networks for Recommendation.
CIKM 2021: 1253-1262
[Peng et al.] SVD-GCN: A Simplified Graph Convolution Paradigm for Recommendation. CIKM 2022:
1625-1634
[Shen et al.] How Powerful is Graph Convolution for Recommendation? CIKM 2021: 1619-1629
[Sun et al.] HGCF: Hyperbolic Graph Convolution Networks for Collaborative Filtering. WWW 2021:
593-601
REFERENCES (3/3)
[Zhang et al.] Geometric Disentangled Collaborative Filtering. SIGIR 2022: 80-90
[Wei et al.] Dynamic Hypergraph Learning for Collaborative Filtering. CIKM 2022: 2108-2117
[Xia et al.] Hypergraph Contrastive Collaborative Filtering. SIGIR 2022: 70-79
[Wei et al.] Designing the Topology of Graph Neural Networks: A Novel Feature Fusion Perspective.
WWW 2022: 1381-1391
[Castellana and Errica] Investigating the Interplay between Features and Structures in Graph Learning.
CoRR abs/2308.09570 (2023)
[Shi et al.] Topology and Content Co-Alignment Graph Convolutional Learning. IEEE Trans. Neural
Networks Learn. Syst. 33(12): 7899-7907 (2022)
[Adomavicius and Zhang] Impact of data characteristics on recommender systems performance. ACM Trans. Manag. Inf. Syst. 3(1): 3:1-3:17 (2012)
[Deldjoo et al.] How Dataset Characteristics Affect the Robustness of Collaborative Recommendation
Models. SIGIR 2020: 951-960
[Latapy et al.] Basic notions for the analysis of large two-mode networks. Soc. Networks 30(1): 31-48
(2008)
[M. E. J. Newman] Mixing patterns in networks. Phys. Rev. E 67: 026126 (2003). doi: 10.1103/PhysRevE.67.026126
CREDITS: This presentation template was created by Slidesgo and includes icons by Flaticon and infographics & images by Freepik.
Q&A Time
Scan me!
More Related Content

Similar to [IRTalks@The University of Glasgow] A Topology-aware Analysis of Graph Collaborative Filtering

Spatial analysis and Analysis Tools
Spatial analysis and Analysis ToolsSpatial analysis and Analysis Tools
Spatial analysis and Analysis ToolsSwapnil Shrivastav
Β 
20141003.journal club
20141003.journal club20141003.journal club
20141003.journal clubHayaru SHOUNO
Β 
Geometric Deep Learning
Geometric Deep Learning Geometric Deep Learning
Geometric Deep Learning PetteriTeikariPhD
Β 
Vol 16 No 2 - July-December 2016
Vol 16 No 2 - July-December 2016Vol 16 No 2 - July-December 2016
Vol 16 No 2 - July-December 2016ijcsbi
Β 
An efficient fuzzy classifier with feature selection based
An efficient fuzzy classifier with feature selection basedAn efficient fuzzy classifier with feature selection based
An efficient fuzzy classifier with feature selection basedssairayousaf
Β 
Attentive semantic alignment with offset aware correlation kernels
Attentive semantic alignment with offset aware correlation kernelsAttentive semantic alignment with offset aware correlation kernels
Attentive semantic alignment with offset aware correlation kernelsNAVER Engineering
Β 
Model Evaluation in the land of Deep Learning
Model Evaluation in the land of Deep LearningModel Evaluation in the land of Deep Learning
Model Evaluation in the land of Deep LearningPramit Choudhary
Β 
A Model of the Scholarly Community
A Model of the Scholarly CommunityA Model of the Scholarly Community
A Model of the Scholarly CommunityMarko Rodriguez
Β 
Application of normalized cross correlation to image registration
Application of normalized cross correlation to image registrationApplication of normalized cross correlation to image registration
Application of normalized cross correlation to image registrationeSAT Publishing House
Β 
Jing Ma - 2017 - Detect Rumors in Microblog Posts Using Propagation Structur...
Jing Ma - 2017 -  Detect Rumors in Microblog Posts Using Propagation Structur...Jing Ma - 2017 -  Detect Rumors in Microblog Posts Using Propagation Structur...
Jing Ma - 2017 - Detect Rumors in Microblog Posts Using Propagation Structur...Association for Computational Linguistics
Β 
IRJET- Analysis of Music Recommendation System using Machine Learning Alg...
IRJET-  	  Analysis of Music Recommendation System using Machine Learning Alg...IRJET-  	  Analysis of Music Recommendation System using Machine Learning Alg...
IRJET- Analysis of Music Recommendation System using Machine Learning Alg...IRJET Journal
Β 
O N T HE D ISTRIBUTION OF T HE M AXIMAL C LIQUE S IZE F OR T HE V ERTICES IN ...
O N T HE D ISTRIBUTION OF T HE M AXIMAL C LIQUE S IZE F OR T HE V ERTICES IN ...O N T HE D ISTRIBUTION OF T HE M AXIMAL C LIQUE S IZE F OR T HE V ERTICES IN ...
O N T HE D ISTRIBUTION OF T HE M AXIMAL C LIQUE S IZE F OR T HE V ERTICES IN ...csandit
Β 
report2.doc
report2.docreport2.doc
report2.docbutest
Β 
VOLT - ESWC 2016
VOLT - ESWC 2016VOLT - ESWC 2016
VOLT - ESWC 2016Blake Regalia
Β 
Possible Worlds Explorer: Datalog & Answer Set Programming for the Rest of Us
Possible Worlds Explorer: Datalog & Answer Set Programming for the Rest of UsPossible Worlds Explorer: Datalog & Answer Set Programming for the Rest of Us
Possible Worlds Explorer: Datalog & Answer Set Programming for the Rest of UsBertram LudΓ€scher
Β 
Time Series Forecasting Using Novel Feature Extraction Algorithm and Multilay...
Time Series Forecasting Using Novel Feature Extraction Algorithm and Multilay...Time Series Forecasting Using Novel Feature Extraction Algorithm and Multilay...
Time Series Forecasting Using Novel Feature Extraction Algorithm and Multilay...Editor IJCATR
Β 
An Open Source Java Code For Visualizing Supply Chain Problems
An Open Source Java Code For Visualizing Supply Chain ProblemsAn Open Source Java Code For Visualizing Supply Chain Problems
An Open Source Java Code For Visualizing Supply Chain Problemsertekg
Β 
Kavitha soft computing
Kavitha soft computingKavitha soft computing
Kavitha soft computingIAEME Publication
Β 

Similar to [IRTalks@The University of Glasgow] A Topology-aware Analysis of Graph Collaborative Filtering (20)

H010223640
H010223640H010223640
H010223640
Β 
Ijetcas14 347
Ijetcas14 347Ijetcas14 347
Ijetcas14 347
Β 
Spatial analysis and Analysis Tools
Spatial analysis and Analysis ToolsSpatial analysis and Analysis Tools
Spatial analysis and Analysis Tools
Β 
20141003.journal club
20141003.journal club20141003.journal club
20141003.journal club
Β 
Geometric Deep Learning
Geometric Deep Learning Geometric Deep Learning
Geometric Deep Learning
Β 
Vol 16 No 2 - July-December 2016
Vol 16 No 2 - July-December 2016Vol 16 No 2 - July-December 2016
Vol 16 No 2 - July-December 2016
Β 
An efficient fuzzy classifier with feature selection based
An efficient fuzzy classifier with feature selection basedAn efficient fuzzy classifier with feature selection based
An efficient fuzzy classifier with feature selection based
Β 
Attentive semantic alignment with offset aware correlation kernels
Attentive semantic alignment with offset aware correlation kernelsAttentive semantic alignment with offset aware correlation kernels
Attentive semantic alignment with offset aware correlation kernels
Β 
Model Evaluation in the land of Deep Learning
Model Evaluation in the land of Deep LearningModel Evaluation in the land of Deep Learning
Model Evaluation in the land of Deep Learning
Β 
A Model of the Scholarly Community
A Model of the Scholarly CommunityA Model of the Scholarly Community
A Model of the Scholarly Community
Β 
Application of normalized cross correlation to image registration
Application of normalized cross correlation to image registrationApplication of normalized cross correlation to image registration
Application of normalized cross correlation to image registration
Β 
Jing Ma - 2017 - Detect Rumors in Microblog Posts Using Propagation Structur...
Jing Ma - 2017 -  Detect Rumors in Microblog Posts Using Propagation Structur...Jing Ma - 2017 -  Detect Rumors in Microblog Posts Using Propagation Structur...
Jing Ma - 2017 - Detect Rumors in Microblog Posts Using Propagation Structur...
Β 
IRJET- Analysis of Music Recommendation System using Machine Learning Alg...
IRJET-  	  Analysis of Music Recommendation System using Machine Learning Alg...IRJET-  	  Analysis of Music Recommendation System using Machine Learning Alg...
IRJET- Analysis of Music Recommendation System using Machine Learning Alg...
Β 
O N T HE D ISTRIBUTION OF T HE M AXIMAL C LIQUE S IZE F OR T HE V ERTICES IN ...
O N T HE D ISTRIBUTION OF T HE M AXIMAL C LIQUE S IZE F OR T HE V ERTICES IN ...O N T HE D ISTRIBUTION OF T HE M AXIMAL C LIQUE S IZE F OR T HE V ERTICES IN ...
O N T HE D ISTRIBUTION OF T HE M AXIMAL C LIQUE S IZE F OR T HE V ERTICES IN ...
Β 
report2.doc
report2.docreport2.doc
report2.doc
Β 
VOLT - ESWC 2016
VOLT - ESWC 2016VOLT - ESWC 2016
VOLT - ESWC 2016
Β 
Possible Worlds Explorer: Datalog & Answer Set Programming for the Rest of Us
Possible Worlds Explorer: Datalog & Answer Set Programming for the Rest of UsPossible Worlds Explorer: Datalog & Answer Set Programming for the Rest of Us
Possible Worlds Explorer: Datalog & Answer Set Programming for the Rest of Us
Β 
Time Series Forecasting Using Novel Feature Extraction Algorithm and Multilay...
Time Series Forecasting Using Novel Feature Extraction Algorithm and Multilay...Time Series Forecasting Using Novel Feature Extraction Algorithm and Multilay...
Time Series Forecasting Using Novel Feature Extraction Algorithm and Multilay...
Β 
An Open Source Java Code For Visualizing Supply Chain Problems
An Open Source Java Code For Visualizing Supply Chain ProblemsAn Open Source Java Code For Visualizing Supply Chain Problems
An Open Source Java Code For Visualizing Supply Chain Problems
Β 
Kavitha soft computing
Kavitha soft computingKavitha soft computing
Kavitha soft computing
Β 

Recently uploaded

Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...Sapana Sha
Β 
B2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxB2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxStephen266013
Β 
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024thyngster
Β 
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...dajasot375
Β 
High Class Call Girls Noida Sector 39 Aarushi πŸ”8264348440πŸ” Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi πŸ”8264348440πŸ” Independent Escort...High Class Call Girls Noida Sector 39 Aarushi πŸ”8264348440πŸ” Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi πŸ”8264348440πŸ” Independent Escort...soniya singh
Β 
Call Girls in Defence Colony Delhi πŸ’―Call Us πŸ”8264348440πŸ”
Call Girls in Defence Colony Delhi πŸ’―Call Us πŸ”8264348440πŸ”Call Girls in Defence Colony Delhi πŸ’―Call Us πŸ”8264348440πŸ”
Call Girls in Defence Colony Delhi πŸ’―Call Us πŸ”8264348440πŸ”soniya singh
Β 
Industrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdfIndustrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdfLars Albertsson
Β 
εŠžη†(Vancouverζ―•δΈšθ―δΉ¦)εŠ ζ‹Ώε€§ζΈ©ε“₯εŽε²›ε€§ε­¦ζ―•δΈšθ―ζˆη»©ε•εŽŸη‰ˆδΈ€ζ―”δΈ€
εŠžη†(Vancouverζ―•δΈšθ―δΉ¦)εŠ ζ‹Ώε€§ζΈ©ε“₯εŽε²›ε€§ε­¦ζ―•δΈšθ―ζˆη»©ε•εŽŸη‰ˆδΈ€ζ―”δΈ€εŠžη†(Vancouverζ―•δΈšθ―δΉ¦)εŠ ζ‹Ώε€§ζΈ©ε“₯εŽε²›ε€§ε­¦ζ―•δΈšθ―ζˆη»©ε•εŽŸη‰ˆδΈ€ζ―”δΈ€
εŠžη†(Vancouverζ―•δΈšθ―δΉ¦)εŠ ζ‹Ώε€§ζΈ©ε“₯εŽε²›ε€§ε­¦ζ―•δΈšθ―ζˆη»©ε•εŽŸη‰ˆδΈ€ζ―”δΈ€F La
Β 
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /WhatsappsBeautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsappssapnasaifi408
Β 
vip Sarai Rohilla Call Girls 9999965857 Call or WhatsApp Now Book
vip Sarai Rohilla Call Girls 9999965857 Call or WhatsApp Now Bookvip Sarai Rohilla Call Girls 9999965857 Call or WhatsApp Now Book
vip Sarai Rohilla Call Girls 9999965857 Call or WhatsApp Now Bookmanojkuma9823
Β 
9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home Service9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home ServiceSapana Sha
Β 
Predictive Analysis - Using Insight-informed Data to Determine Factors Drivin...
Predictive Analysis - Using Insight-informed Data to Determine Factors Drivin...Predictive Analysis - Using Insight-informed Data to Determine Factors Drivin...
Predictive Analysis - Using Insight-informed Data to Determine Factors Drivin...ThinkInnovation
Β 
Data Science Jobs and Salaries Analysis.pptx
Data Science Jobs and Salaries Analysis.pptxData Science Jobs and Salaries Analysis.pptx
Data Science Jobs and Salaries Analysis.pptxFurkanTasci3
Β 
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptxEMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptxthyngster
Β 
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.pptdokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.pptSonatrach
Β 
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptx
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptxAmazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptx
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptxAbdelrhman abooda
Β 
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130Suhani Kapoor
Β 
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfKantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfSocial Samosa
Β 
04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationshipsccctableauusergroup
Β 
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...Suhani Kapoor
Β 

Recently uploaded (20)

Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Β 
B2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxB2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docx
Β 
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Β 
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Β 
High Class Call Girls Noida Sector 39 Aarushi πŸ”8264348440πŸ” Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi πŸ”8264348440πŸ” Independent Escort...High Class Call Girls Noida Sector 39 Aarushi πŸ”8264348440πŸ” Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi πŸ”8264348440πŸ” Independent Escort...
Β 
Call Girls in Defence Colony Delhi πŸ’―Call Us πŸ”8264348440πŸ”
Call Girls in Defence Colony Delhi πŸ’―Call Us πŸ”8264348440πŸ”Call Girls in Defence Colony Delhi πŸ’―Call Us πŸ”8264348440πŸ”
Call Girls in Defence Colony Delhi πŸ’―Call Us πŸ”8264348440πŸ”
Β 
Industrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdfIndustrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdf
Β 
εŠžη†(Vancouverζ―•δΈšθ―δΉ¦)εŠ ζ‹Ώε€§ζΈ©ε“₯εŽε²›ε€§ε­¦ζ―•δΈšθ―ζˆη»©ε•εŽŸη‰ˆδΈ€ζ―”δΈ€
εŠžη†(Vancouverζ―•δΈšθ―δΉ¦)εŠ ζ‹Ώε€§ζΈ©ε“₯εŽε²›ε€§ε­¦ζ―•δΈšθ―ζˆη»©ε•εŽŸη‰ˆδΈ€ζ―”δΈ€εŠžη†(Vancouverζ―•δΈšθ―δΉ¦)εŠ ζ‹Ώε€§ζΈ©ε“₯εŽε²›ε€§ε­¦ζ―•δΈšθ―ζˆη»©ε•εŽŸη‰ˆδΈ€ζ―”δΈ€
εŠžη†(Vancouverζ―•δΈšθ―δΉ¦)εŠ ζ‹Ώε€§ζΈ©ε“₯εŽε²›ε€§ε­¦ζ―•δΈšθ―ζˆη»©ε•εŽŸη‰ˆδΈ€ζ―”δΈ€
Β 
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /WhatsappsBeautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsapps
Β 
vip Sarai Rohilla Call Girls 9999965857 Call or WhatsApp Now Book
vip Sarai Rohilla Call Girls 9999965857 Call or WhatsApp Now Bookvip Sarai Rohilla Call Girls 9999965857 Call or WhatsApp Now Book
vip Sarai Rohilla Call Girls 9999965857 Call or WhatsApp Now Book
Β 
9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home Service9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home Service
Β 
Predictive Analysis - Using Insight-informed Data to Determine Factors Drivin...
Predictive Analysis - Using Insight-informed Data to Determine Factors Drivin...Predictive Analysis - Using Insight-informed Data to Determine Factors Drivin...
Predictive Analysis - Using Insight-informed Data to Determine Factors Drivin...
Β 
Data Science Jobs and Salaries Analysis.pptx
Data Science Jobs and Salaries Analysis.pptxData Science Jobs and Salaries Analysis.pptx
Data Science Jobs and Salaries Analysis.pptx
Β 
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptxEMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
Β 
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.pptdokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
Β 
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptx
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptxAmazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptx
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptx
Β 
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
Β 
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfKantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Β 
04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships
Β 
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
Β 

[IRTalks@The University of Glasgow] A Topology-aware Analysis of Graph Collaborative Filtering

  • 1. A Topology-aware Analysis of Graph Collaborative Filtering Daniele Malitesta, PhD SisInfLab Research Group Politecnicodi Bari (Bari, Italy) The Universityof Glasgow (Glasgow, UK) 50 minutes February 05, 2024 (15.00 UK Time) IR Talks
  • 2. TABLE OF CONTENTS 01 05 03 06 04 07 INTRODUCTION & BACKGROUND TOPOLOGICAL CHARACTERISTICS TOPOLOGY IN GRAPH CF PROPOSED ANALYSIS RESULTS TAKE-HOME MESSAGES 02 MOTIVATIONS
  • 3. USEFUL RESOURCES The content of the following slides is taken from: β–  Daniele Malitesta, Claudio Pomo, Vito Walter Anelli, Alberto Carlo Maria Mancino, Eugenio Di Sciascio, Tommaso Di Noia: A Topology-aware Analysis of Graph Collaborative Filtering. CoRR abs/2308.10778 (2023) β–  Daniele Malitesta, Claudio Pomo, Tommaso Di Noia: Graph Neural Networks for Recommendation: Reproducibility, Graph Topology, and Node Representation. CoRR abs/2310.11270 (2023) Scan me!
  • 5. A RISING INTEREST 0 100 200 300 400 500 600 2020 2021 2022 2023 Papers As searched on DBLP through the keywords β€œgraph” and β€œrecommend”.
  • 6. PREVIOUS TUTORIALS ON GNNs-RECSYS
  • 7. A NON-EXHAUSTIVE TIMELINE Pioneer approaches proposing GCN-based aggregation methods [v an den Berg et al., Ying et al.] Explore inter-dependencies between nodes and their neighbours [Wang et al. (2019a)], use graph attention networks to recognize meaningful user-item interactions at higher-grained level [Wang et al. (2019b)] Lighten the graph convolutional layer [Chen e t al., He et al.], use graph attention networks to recognize meaningful user- item interactions at higher- grained level [Wang et al. (2020b), Tao et al.] Self-supervised and contrastive learning [Wu et al., Yu et al.] Simplify the message- passing formulation [Mao et al., Peng et al., Shen et al.] Explore other latent spaces [Shen et al., Sun e t al., Zhang et al.] Exploit hypergraphs [We i et al., Xia et al.] Use graph attention networks to recognize meaningful user- item interactions at finer- grained [Zhang et al.] level 2017-2018 2019 2020 2021-2022 2021-2022 2021-2022 2022 2022
  • 8. USER-ITEM GRAPH 𝑒1 𝑒2 𝑒3 𝑒4 𝑖1 𝑖2 𝑖3 𝑖4 𝑖5 𝑖6 1 0 1 1 1 0 0 0 0 0 1 0 1 1 1 0 0 0 0 0 1 1 0 1 𝑒1 𝑒2 𝑒3 𝑒4 𝑖1 𝑖2 𝑖3 𝑖4 𝑖5 𝑖6 user-item interaction matrix 0 0 0 0 1 0 1 1 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 1 1 1 0 0 0 0 0 0 0 0 0 1 1 0 1 1 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 1 1 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 adjacency matrix 𝑒1 𝑒2 𝑒3 𝑒4 𝑖1 𝑖2 𝑖3 𝑖4 𝑖5 𝑖6 𝑒1 𝑒2 𝑒3 𝑒4 𝑖1 𝑖2 𝑖3 𝑖4 𝑖5 𝑖6
  • 9. USER-ITEM GRAPH 𝑒1 𝑒2 𝑒3 𝑒4 𝑖1 𝑖2 𝑖3 𝑖4 𝑖5 𝑖6 1 0 1 1 1 0 0 0 0 0 1 0 1 1 1 0 0 0 0 0 1 1 0 1 𝑒1 𝑒2 𝑒3 𝑒4 𝑖1 𝑖2 𝑖3 𝑖4 𝑖5 𝑖6 user-item interaction matrix 0 0 0 0 1 0 1 1 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 1 1 1 0 0 0 0 0 0 0 0 0 1 1 0 1 1 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 1 1 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 adjacency matrix 𝑒1 𝑒2 𝑒3 𝑒4 𝑖1 𝑖2 𝑖3 𝑖4 𝑖5 𝑖6 𝑒1 𝑒2 𝑒3 𝑒4 𝑖1 𝑖2 𝑖3 𝑖4 𝑖5 𝑖6 BIPARTITE
  • 10. USER-ITEM GRAPH 𝑒1 𝑒2 𝑒3 𝑒4 𝑖1 𝑖2 𝑖3 𝑖4 𝑖5 𝑖6 1 0 1 1 1 0 0 0 0 0 1 0 1 1 1 0 0 0 0 0 1 1 0 1 𝑒1 𝑒2 𝑒3 𝑒4 𝑖1 𝑖2 𝑖3 𝑖4 𝑖5 𝑖6 user-item interaction matrix 0 0 0 0 1 0 1 1 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 1 1 1 0 0 0 0 0 0 0 0 0 1 1 0 1 1 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 1 1 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 adjacency matrix 𝑒1 𝑒2 𝑒3 𝑒4 𝑖1 𝑖2 𝑖3 𝑖4 𝑖5 𝑖6 𝑒1 𝑒2 𝑒3 𝑒4 𝑖1 𝑖2 𝑖3 𝑖4 𝑖5 𝑖6 UNDIRECTED
  • 11. MESSAGE-PASSING & LATENT-FACTORS MESSAGE-PASSING(LAYER1) οΏ½ ( ) οΏ½ ( ) οΏ½ ( ) οΏ½ (οΏ½) οΏ½ ( ) οΏ½ ( ) οΏ½ ( ) οΏ½ ( ) οΏ½ ( ) οΏ½ ( ) 0 0 0 0 1 0 1 1 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 1 1 1 0 0 0 0 0 0 0 0 0 1 1 0 1 1 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 1 1 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 οΏ½ ( ) ( ) , = οΏ½ ( ) οΏ½ ( ) οΏ½ ( ) οΏ½ ( ) οΏ½ ( ) οΏ½ ( ) οΏ½ ( ) οΏ½ ( ) οΏ½ ( ) οΏ½ ( ) FACTORIZATION & MESSAGE-PASSING 1 0 1 1 1 0 0 0 0 0 1 0 1 1 1 0 0 0 0 0 1 1 0 1 < > , β‰ˆ οΏ½ οΏ½ οΏ½ οΏ½ οΏ½ οΏ½ οΏ½ οΏ½ οΏ½ οΏ½ οΏ½ ( ) οΏ½ ( ) οΏ½ ( ) οΏ½ ( ) οΏ½ ( ) οΏ½ ( ) οΏ½ ( ) οΏ½ ( ) οΏ½ ( ) οΏ½ ( ) Message-passing component Latent-factors component
  • 13. Topologically WHY GRAPH CF WORKS WELL? β€’ The machine learning literature acknowledges that graph topology plays a crucial role in GNNs [Wei et al., Castellana and Errica]. β€’ For instance, revisit network topology to enhance the learning ability of GCNs [Shi et al.]. β€’ Established approaches such as LightGCN and DGCF adapt the GCN layer to suit the CF rationale. β€’ Recent graph-based RSs including UltraGCN and SVD-GCN go beyond the concept of message-passing. Algorithmically
  • 14. [Adomavicius andZhang, Deldjoo et al.] Classical dataset characteristics such as dataset sparsity may impact the performance of recommendation models in diverse scenarios and tasks.
  • 15. The topological nature of the user- item data and the new graph CF wave make it imperative to unveil the dependencies between topological properties and recommendation performance.
  • 16. 3) Build an explanatory model OUR CONTRIBUTIONS that studies the impact of classical and topological dataset characteristics on the performance of SOTA graph CF. which are recent and across-the-board in the literature. to calculate the linear dependences among characteristics and recommendation metrics. on their statistical significance, with varying settings of graph samplings. 2) Select 4 graph CF approaches 4) Validate the explanations 1) First analysis in the literature
  • 18. NODE DEGREE EXTENSION (1/3) DEFINITION Let 𝑁𝑒 (𝑙) and 𝑁𝑖 (𝑙) be the sets of neighborhood nodes for user 𝑒 and item 𝑖 at 𝑙 distance hops. The extended definition of node degrees for 𝑒 and 𝑖 is: πœŽπ‘’ = 𝑁𝑒 (1) , πœŽπ‘– = 𝑁𝑖 (1) . The average user and item node degrees on the whole user and item sets (π‘ˆ and 𝐼): πœŽπ‘ˆ = 1 π‘ˆ Οƒπ‘’βˆˆπ‘ˆ 𝑁𝑒 1 , 𝜎𝐼 = 1 𝐼 Οƒπ‘–βˆˆπΌ 𝑁𝑖 (1) .
  • 19. NODE DEGREE EXTENSION (2/3) RECSYS RE-INTERPRETATION The node degree in the user-item graph stands for the number of items (users) interacted by a user (item). This is related to the cold-start issue in recommendation, where cold users denote low activity on the platform, while cold items are niche products. 𝑒1 𝑒2 𝑒3 𝑒4 𝑒5 𝑒6 𝑒7 𝑖1 𝑖2 𝑖3 𝑖4 𝑖5 𝑖6 𝑖7 𝑖8 πœŽπ‘’2 = 5 (active user) πœŽπ‘’7 = 1 (non-active user)
  • 20. NODE DEGREE EXTENSION (3/3) RECSYS RE-INTERPRETATION The node degree in the user-item graph stands for the number of items (users) interacted by a user (item). This is related to the cold-start issue in recommendation, where cold users denote low activity on the platform, while cold items are niche products. 𝑒1 𝑒2 𝑒3 𝑒4 𝑒5 𝑒6 𝑒7 𝑖1 𝑖2 𝑖3 𝑖4 𝑖5 𝑖6 𝑖7 𝑖8 πœŽπ‘–7 = 4 (popular item) πœŽπ‘–4 = 1 (niche item)
  • 21. CLUSTERING COEFFICIENT (1/5) DEFINITION Let 𝑣 and 𝑀 be two nodes from the same partition (e.g., user nodes). Their similarity is the intersection over union of their neighborhoods. By evaluating the metric node-wise: 𝛾𝑣 = Οƒ π‘€βˆˆπ‘π‘£ 2 𝛾𝑣,𝑀 𝑁𝑣 2 with 𝛾𝑣,𝑀 = 𝑁𝑣 1 βˆ©π‘π‘€ 1 𝑁𝑣 1 βˆͺ𝑁𝑀 1 , where 𝑁𝑣 (2) is the second-order neighborhood set of 𝑣. The average clustering coefficient [Latapy et al.] on π‘ˆ and 𝐼 is: π›Ύπ‘ˆ = 1 π‘ˆ Οƒπ‘’βˆˆπ‘ˆ 𝛾𝑒, 𝛾𝐼 = 1 𝐼 Οƒπ‘–βˆˆπΌ 𝛾𝑖.
  • 22. CLUSTERING COEFFICIENT (2/5) RECSYS RE-INTERPRETATION High values of the clustering coefficient indicate that there exists a substantial number of co- occurrences among nodes of the same partition. For instance, user-wise, the value increases if several users share most of their interacted items. This intuition aligns with the rationale of collaborative filtering: two users are likely to share similar preferences if they interact with the same items. 𝑒1 𝑒2 𝑒3 𝑒4 𝑒5 𝑒6 𝑒7 𝑖1 𝑖2 𝑖3 𝑖4 𝑖5 𝑖6 𝑖7 𝑖8 𝛾𝑒3,𝑒5 = 𝑖5,𝑖7 ∩ 𝑖5,𝑖6,𝑖7 𝑖5,𝑖6,𝑖7 = 2 3 = 0.667
  • 23. CLUSTERING COEFFICIENT (3/5) RECSYS RE-INTERPRETATION High values of the clustering coefficient indicate that there exists a substantial number of co- occurrences among nodes of the same partition. For instance, user-wise, the value increases if several users share most of their interacted items. This intuition aligns with the rationale of collaborative filtering: two users are likely to share similar preferences if they interact with the same items. 𝑒1 𝑒2 𝑒3 𝑒4 𝑒5 𝑒6 𝑒7 𝑖1 𝑖2 𝑖3 𝑖4 𝑖5 𝑖6 𝑖7 𝑖8 𝛾𝑒6,𝑒7 = 𝑖7,𝑖8 ∩ 𝑖8 𝑖7,𝑖8 = 1 2 = 0.500
  • 24. CLUSTERING COEFFICIENT (4/5) RECSYS RE-INTERPRETATION High values of the clustering coefficient indicate that there exists a substantial number of co- occurrences among nodes of the same partition. For instance, user-wise, the value increases if several users share most of their interacted items. This intuition aligns with the rationale of collaborative filtering: two users are likely to share similar preferences if they interact with the same items. 𝑒1 𝑒2 𝑒3 𝑒4 𝑒5 𝑒6 𝑒7 𝑖1 𝑖2 𝑖3 𝑖4 𝑖5 𝑖6 𝑖7 𝑖8 𝛾𝑒2,𝑒4 = 𝑖1,𝑖2,𝑖3,𝑖4,𝑖5 ∩ 𝑖3,𝑖7 𝑖1,𝑖2,𝑖3,𝑖4,𝑖5,𝑖7 = 1 6 = 0.167
  • 25. CLUSTERING COEFFICIENT (5/5) RECSYS RE-INTERPRETATION High values of the clustering coefficient indicate that there exists a substantial number of co- occurrences among nodes of the same partition. For instance, user-wise, the value increases if several users share most of their interacted items. This intuition aligns with the rationale of collaborative filtering: two users are likely to share similar preferences if they interact with the same items. 𝑒1 𝑒2 𝑒3 𝑒4 𝑒5 𝑒6 𝑒7 𝑖1 𝑖2 𝑖3 𝑖4 𝑖5 𝑖6 𝑖7 𝑖8 𝛾𝑒1,𝑒5 = 𝑖1,𝑖2 ∩ 𝑖5,𝑖6,𝑖7 𝑖1,𝑖2,𝑖5,𝑖6,𝑖7 = 0 6 = 0.000
  • 26. DEGREE ASSORTATIVITY (1/3) DEFINITION Let 𝐷 = {𝑑1,𝑑2, …} be the set of unique node degrees in the graph, and let π‘’π‘‘β„Ž,π‘‘π‘˜ be the fraction of edges connecting nodes with degrees π‘‘β„Ž and π‘‘π‘˜. Then, let π‘žπ‘‘β„Ž be the probability distribution to choose a node with degree π‘‘β„Ž after having selected a node with the same degree (i.e., the excess degree distribution). The degree assortativity coefficient [M. E. J. Newman] is: 𝜌 = Οƒπ‘‘β„Žπ‘‘π‘˜ π‘‘β„Žπ‘‘π‘˜ π‘’π‘‘β„Ž,π‘‘π‘˜ βˆ’ π‘žπ‘‘β„Ž π‘žπ‘‘π‘˜ π‘ π‘‘π‘‘π‘ž 2 , where π‘ π‘‘π‘‘π‘ž is the standard deviation of the distribution π‘ž.
  • 27. DEGREE ASSORTATIVITY (2/3) RECSYS RE-INTERPRETATION The degree assortativity calculated user- and item-wise is a proxy to represent the tendency of users with the same activity level on the platform and items with the same popularity to gather, respectively. To visualize the degree assortativity, we need the projected user-user and item-item graphs. 𝑒1 𝑒2 𝑒3 𝑒4 𝑒5 𝑒6 𝑒7 𝑖1 𝑖2 𝑖3 𝑖4 𝑖5 𝑖6 𝑖7 𝑖8 𝑒1 𝑒2 𝑒3 𝑒4 𝑒5 𝑒6 𝑒7 user-item graph user-user projected graph
  • 28. DEGREE ASSORTATIVITY (3/3) RECSYS RE-INTERPRETATION The degree assortativity calculated user- and item-wise is a proxy to represent the tendency of users with the same activity level on the platform and items with the same popularity to gather, respectively. To visualize the degree assortativity, we need the projected user-user and item-item graphs. 𝑒1 𝑒2 𝑒3 𝑒4 𝑒5 𝑒6 𝑒7 𝑖1 𝑖2 𝑖3 𝑖4 𝑖5 𝑖6 𝑖7 𝑖8 𝑒1 𝑒2 𝑒3 𝑒4 𝑒5 𝑒6 𝑒7 𝜌 = βˆ’0.191 (high degree dis-assortativity)
  • 29. Density CLASSICAL DATASET CHARACTERISTICS Estimates the number of all possible interactions: 𝜁 = π‘ˆπΌ. Defines the ratio between the number of users and items: πœ‹ = π‘ˆ 𝐼 . Measures the ratio of actual user-item interactions to all possible interactions: 𝛿 = 𝐸 π‘ˆπΌ . Calculates the interactions’ concentration for users (items): πœ…π‘ˆ = σ𝑒=1 π‘ˆβˆ’1 σ𝑣=𝑒+1 π‘ˆ π‘Žπ‘π‘  πœŽπ‘’βˆ’πœŽπ‘£ π‘ˆ σ𝑒=1 π‘ˆ πœŽπ‘’ . Shape Gini coefficient Space size
  • 31. UltraGCN (CIKM 2021) SELECTED GRAPH CF APPROACHES Node degree used to normalize the adjacency matrix in the message-passing. Node degree used to normalize the adjacency matrix in the message- passing. Node degree used for normalization in the infinite layer message-passing. The model also learns from the item- projected graph. Node embeddings involve the largest singular values of the normalized user-item interaction matrix, whose maximum value is related to the maximum node degree of the user- item graph. The model learns from user- and item-projected graphs. DGCF (SIGIR 2020) SVD-GCN (CIKM 2022) LightGCN (SIGIR 2020)
  • 32. The selected graph CF approaches explicitly utilize the node degree in the representation learning. However, clustering coefficient and degree assortativity do not have an evident representation in the formulations.
  • 33. β€’ Which topological aspects graph-based models can (un)intentionally capture? β€’ Are (topological) dataset characteristics influencing the recommendation performance of graph CF models?
  • 35. Our goal is to understand whether there exist dependencies among (topological) dataset characteristics and the performance of graph-based recommendation models. To this aim, we decide to build and fit an explanatory framework.
  • 38. EXPLANATORY FRAMEWORK: STEPS STEP 1 STEP 2 Select characteristics Select models
  • 39. EXPLANATORY FRAMEWORK: STEPS STEP 1 STEP 2 STEP 3 Select datasets Select characteristics Select models
  • 40. DATASETS # Users # Items # Interactions Yelp-2018 25,677 25,815 696,865 Amazon -Book 70,679 24,915 846,434 Gowalla 29,858 40,981 1,027,370
  • 41. EXPLANATORY FRAMEWORK: STEPS STEP 1 STEP 2 STEP 3 Select characteristics Select models Select datasets
  • 42. EXPLANATORY FRAMEWORK: STEPS STEP 1 STEP 2 STEP 3 STEP 4 Generate sub-datasets Select characteristics Select models Select datasets
  • 49. EXPLANATORY FRAMEWORK: STEPS STEP 1 STEP 2 STEP 3 STEP 4 Generate sub-datasets Select characteristics Select models Select datasets
  • 50. EXPLANATORY FRAMEWORK: STEPS STEP 1 STEP 2 STEP 3 STEP 4 Generate sub-datasets STEP 5 Get datasets’ statistics Select characteristics Select models Select datasets
  • 53. EXPLANATORY FRAMEWORK: STEPS STEP 1 STEP 2 STEP 3 STEP 4 Generate sub-datasets STEP 5 Get datasets’ statistics Select characteristics Select models Select datasets
  • 54. EXPLANATORY FRAMEWORK: STEPS STEP 1 STEP 2 STEP 3 STEP 4 Generate sub-datasets STEP 5 STEP 6 Train models Select characteristics Select models Select datasets Get datasets’ statistics
  • 55. EXPLANATORY FRAMEWORK: STEPS STEP 1 STEP 2 STEP 3 STEP 4 Generate sub-datasets STEP 5 STEP 6 Train models STEP 7 Evaluate models Select characteristics Select models Select datasets Get datasets’ statistics
  • 56. EXPLANATORY FRAMEWORK: STEPS STEP 1 STEP 2 STEP 3 STEP 4 Generate sub-datasets STEP 5 STEP 6 Train models STEP 7 Evaluate models STEP 8 Generate explanations Select characteristics Select models Select datasets Get datasets’ statistics
  • 57. GENERATE (LINEAR) EXPLANATIONS Characteristics Performance We aim to fit the following multivariate linear regression model: π’š = 𝝐 + πœƒ0 + πœƒπ‘π‘Ώπ‘.
  • 58. GENERATE (LINEAR) EXPLANATIONS Characteristics Performance We aim to fit the following multivariate linear regression model: π’š = 𝝐 + πœƒ0 + πœƒπ‘π‘Ώπ‘. Predicted performance
  • 59. GENERATE (LINEAR) EXPLANATIONS Characteristics Performance We aim to fit the following multivariate linear regression model: π’š = 𝝐 + πœƒ0 + πœƒπ‘π‘Ώπ‘. Predicted performance Prediction error
  • 60. GENERATE (LINEAR) EXPLANATIONS Characteristics Performance We aim to fit the following multivariate linear regression model: π’š = 𝝐 + πœƒ0 + πœƒπ‘π‘Ώπ‘. Predicted performance Prediction error Global bias
  • 61. GENERATE (LINEAR) EXPLANATIONS Characteristics Performance We aim to fit the following multivariate linear regression model: π’š = 𝝐 + πœƒ0 + πœƒπ‘π‘Ώπ‘. Predicted performance Prediction error Global bias Weight of characteristic c
  • 62. GENERATE (LINEAR) EXPLANATIONS Characteristics Performance We aim to fit the following multivariate linear regression model: π’š = 𝝐 + πœƒ0 + πœƒπ‘π‘Ώπ‘. Predicted performance Prediction error Global bias Weight of characteristic c Characteristic c for all samples
  • 63. GENERATE (LINEAR) EXPLANATIONS Characteristics Performance We aim to fit the following multivariate linear regression model: π’š = 𝝐 + πœƒ0 + πœƒπ‘π‘Ώπ‘. We use the ordinary least squares (OLS) optimization model: πœƒ0 βˆ—,πœ½π‘ βˆ— = min πœƒ0,πœƒπ‘ 1 2 π’š βˆ’ πœƒ0 βˆ’ πœ½π‘π‘Ώπ‘ 2 2 .
  • 64. EXPLANATORY FRAMEWORK: STEPS STEP 1 STEP 2 STEP 3 STEP 4 Generate sub-datasets STEP 5 STEP 6 Train models STEP 7 Evaluate models STEP 8 Generate explanations Select characteristics Select models Select datasets Get datasets’ statistics
  • 67. IMPACT OF CHARACTERISTICS How good is the linear regression at making predictions? 𝑅2 is, for many settings, above 95%. The regression model can provide good explanations.
  • 69. IMPACT OF CHARACTERISTICS inverse Graph CF = neighborhood + latent factors. Graph CF behaves as factorization-based models. direct
  • 71. IMPACT OF CHARACTERISTICS direct This confirms what observed in the models’ formulations. High users’ interactions correspond to better performance.
  • 72. IMPACT OF CHARACTERISTICS strong direct Values for users- and items-side are more flattened for UltraGCN and SVD-GCN.
  • 73. IMPACT OF CHARACTERISTICS LightGCN and DGCF have larger coefficients that SVD-GCN, because they explicitly propagate messages. direct inverse
  • 74. IMPACT OF CHARACTERISTICS UltraGCN has bigger coefficients, probably because it uses the infinite-layer message passing. direct inverse
  • 75. Probability distribution of node degrees on Gowalla
  • 76. In the worst-case scenario, node- dropout drops many high-degree nodes; edge-dropout, drops all the edges connected to several nodes and thus disconnect them from the graph.
  • 77. Is this undermining the goodness of the proposed explanatory framework?
  • 78. INFLUENCE OF NODE- AND EDGE-DROPOUT NODE 100% [Setting A] 100% node-dropout
  • 79. INFLUENCE OF NODE- AND EDGE-DROPOUT NODE EDGE 70% 30% [Setting B] 70% node-dropout 30% edge-dropout
  • 80. INFLUENCE OF NODE- AND EDGE-DROPOUT NODE EDGE 30% 70% [Setting C] 30% node-dropout 70% edge-dropout
  • 81. INFLUENCE OF NODE- AND EDGE-DROPOUT EDGE 100% [Setting D] 100% edge-dropout
  • 82. INFLUENCE OF NODE- AND EDGE-DROPOUT Datasets’ statistics Node-dropout retains smaller portions of the dataset than edge-dropout.
  • 83. INFLUENCE OF NODE- AND EDGE-DROPOUT Meaningful learned dependencies (this setting is like ours).
  • 84. INFLUENCE OF NODE- AND EDGE-DROPOUT Not-statistically-significant at the extremes.
  • 86. WHAT WE HAVE LEARNED 01 LATENT-FACTORS vs. NEIGHBORHOOD
  • 87. WHAT WE HAVE LEARNED 01 02 LATENT-FACTORS vs. NEIGHBORHOOD NODE DEGREE
  • 88. WHAT WE HAVE LEARNED 01 02 03 LATENT-FACTORS vs. NEIGHBORHOOD NODE DEGREE CLUSTERING COEFFICIENT
  • 89. WHAT WE HAVE LEARNED 01 04 02 03 LATENT-FACTORS vs. NEIGHBORHOOD NODE DEGREE CLUSTERING COEFFICIENT DEGREE ASSORTATIVITY
  • 90. WHAT WE HAVE LEARNED 01 04 02 05 03 LATENT-FACTORS vs. NEIGHBORHOOD NODE DEGREE CLUSTERING COEFFICIENT DEGREE ASSORTATIVITY NODE- vs. EDGE- DROPOUT
  • 91. REFERENCES (1/3) [Malitesta et al.] A Topology-aware Analysis of Graph Collaborative Filtering. CoRR abs/2308.10778 (2023) [Malitesta et al.] Graph Neural Networks for Recommendation: Reproducibility, Graph Topology, and Node Representation. CoRR abs/2310.11270 (2023) [Wang et al. (2020a)] Learning and Reasoning on Graph for Recommendation. WSDM 2020: 890-893 [El-Kishky et al.] Graph-based Representation Learning for Web-scale Recommender Systems. KDD 2022: 4784-4785 [Gao et al.] Graph Neural Networks for Recommender System. WSDM 2022: 1623-1625 [Purificato et al. (2023a)] Tutorial on User Profiling with Graph Neural Networks and Related Beyond- Accuracy Perspectives. UMAP 2023: 309-312 [Purificato et al. (2023b)] Leveraging Graph Neural Networks for User Profiling: Recent Advances and Open Challenges. CIKM 2023: 5216-5219 [van den Berg et al.] Graph Convolutional Matrix Completion. CoRR abs/1706.02263 (2017) [Ying et al.] Graph Convolutional Neural Networks for Web-Scale Recommender Systems. KDD 2018: 974-983 [Wang et al. (2019a)] Neural Graph Collaborative Filtering. SIGIR 2019: 165-174 [Wang et al. (2019b)] KGAT: Knowledge Graph Attention Network for Recommendation. KDD 2019: 950-958
  • 92. REFERENCES (2/3) [Chen et al.] Revisiting Graph Based Collaborative Filtering: A Linear Residual Graph Convolutional Network Approach. AAAI 2020: 27-34 [He et al.] LightGCN: Simplifying and Powering Graph Convolution Network for Recommendation LightGCN: Simplifying and Powering Graph Convolution Network for Recommendation. SIGIR 2020: 639-648 [Wang et al. (2020b)] Disentangled Graph Collaborative Filtering. SIGIR 2020: 1001-1010 [Tao et al.] MGAT: Multimodal Graph Attention Network for Recommendation. Inf. Process. Manag. 57(5): 102277 (2020) [Wu et al.] Self-supervised Graph Learning for Recommendation. SIGIR 2021: 726-735 [Yu et al.] Are Graph Augmentations Necessary?: Simple Graph Contrastive Learning for Recommendation. SIGIR 2022: 1294-1303 [Mao et al.] UltraGCN: Ultra Simplification of Graph Convolutional Networks for Recommendation. CIKM 2021: 1253-1262 [Peng et al.] SVD-GCN: A Simplified Graph Convolution Paradigm for Recommendation. CIKM 2022: 1625-1634 [Shen et al.] How Powerful is Graph Convolution for Recommendation? CIKM 2021: 1619-1629 [Sun et al.] HGCF: Hyperbolic Graph Convolution Networks for Collaborative Filtering. WWW 2021: 593-601
  • 93. REFERENCES (3/3) [Zhang et al.] Geometric Disentangled Collaborative Filtering. SIGIR 2022: 80-90 [Wei et al.] Dynamic Hypergraph Learning for Collaborative Filtering. CIKM 2022: 2108-2117 [Xia et al.] Hypergraph Contrastive Collaborative Filtering. SIGIR 2022: 70-79 [Wei et al.] Designing the Topology of Graph Neural Networks: A Novel Feature Fusion Perspective. WWW 2022: 1381-1391 [Castellana and Errica] Investigating the Interplay between Features and Structures in Graph Learning. CoRR abs/2308.09570 (2023) [Shi et al.] Topology and Content Co-Alignment Graph Convolutional Learning. IEEE Trans. Neural Networks Learn. Syst. 33(12): 7899-7907 (2022) [Adomavicius and Zhang] Impactof data characteristics on recommender systems performance. ACM Trans. Manag. Inf. Syst. 3(1): 3:1-3:17 (2012) [Deldjoo et al.] How Dataset Characteristics Affect the Robustness of Collaborative Recommendation Models. SIGIR 2020: 951-960 [Latapy et al.] Basic notions for the analysis of large two-mode networks. Soc. Networks 30(1): 31-48 (2008) [M. E. J. Newman] Mixing patterns in networks. Phys. Rev. E, 67:026126, Feb 2003. doi: 10. 1103/PhysRevE.67.026126
  • 94. CREDITS: This presentation template was created by Slidesgo, and includes icons by Flaticon,and infographics & images by Freepik Q&A Time Scan me!