Presentation for the paper published @ CIKM'22: 31st ACM International Conference on Information & Knowledge Management (October 17-21, 2022 | Atlanta, GA, USA)
Do Graph Neural Networks Build Fair User Models? Assessing Disparate Impact and Mistreatment in Behavioural User Profiling
1. Do Graph Neural Networks
Build Fair User Models?
Assessing Disparate Impact and Mistreatment
in Behavioural User Profiling
Erasmo Purificato, Ludovico Boratto, and Ernesto William De Luca
DEFirst Reading Group
23rd February 2023
3. About me
2011-2017: Master's Degree in Computer Engineering, University of Naples Federico II
2017-2019: Technical Leader of the Cognitive Business Unit, Blue Reply, Turin
2018-2019: 2nd-level Master's Degree in Artificial Intelligence, University of Turin
2020-Today: Research Assistant and PhD Student on “ML Techniques for User Modeling and User-Adaptive Interaction”, Otto von Guericke University Magdeburg
2020-Today: Research Fellow, Leibniz Institute for Educational Media | Georg Eckert Institute, Brunswick
2021-Today: Adjunct Professor in Software Engineering, University of Rome Guglielmo Marconi
6. Background
User profiling
Inferring an individual’s interests, personality traits
or behaviours from the data they generate, in order to
create an efficient representation, i.e. a user model.
7. Background
Explicit user profiling
Early profiling approaches considered only the
analysis of static characteristics, with data
often coming from online forms and surveys.
Implicit user profiling
Modern systems focus on profiling users’ data
based on individuals’ actions and interactions
(also known as behavioural user profiling).
8. Background
Graphs are a natural way to model user behaviours.
Graph Neural Networks (GNNs) are therefore a natural fit for learning user models from such behavioural data.
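To make the idea concrete, here is a minimal sketch of behavioural user profiling framed as binary node classification with a generic two-layer GCN. It assumes PyTorch Geometric and hypothetical inputs (node features x, interaction edges edge_index); it is not the architecture of the models analysed in the paper.

```python
import torch
import torch.nn.functional as F
from torch_geometric.nn import GCNConv  # assumes PyTorch Geometric is installed

class UserProfilingGNN(torch.nn.Module):
    """Generic two-layer GCN classifying user nodes into two profile classes."""
    def __init__(self, num_features, hidden_dim=64, num_classes=2):
        super().__init__()
        self.conv1 = GCNConv(num_features, hidden_dim)
        self.conv2 = GCNConv(hidden_dim, num_classes)

    def forward(self, x, edge_index):
        # Each layer aggregates information from a node's neighbours, so a
        # user's representation reflects the interactions encoded in the graph.
        h = F.relu(self.conv1(x, edge_index))
        return self.conv2(h, edge_index)  # per-user class logits
```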
10. Motivation
Like any ML system trained on historical data, GNNs are prone to learning
the biases present in such data and revealing them in their outputs.
11. ◎ Performed two user profiling tasks by executing a binary classification on
two real-world datasets with the best-performing GNNs in this context;
◎ Assessed disparate impact and disparate mistreatment for GNNs designed
for behavioural user profiling, considering four algorithmic fairness metrics;
◎ From an extensive set of experiments, derived three observations about the
analysed models, correlating their different user profiling paradigms with the
fairness metric scores to create a baseline for future assessments of
GNN-based models for user profiling.
Our contributions
12. ◎ RQ1: How do the different input types of the analysed GNNs and the way the
user models are constructed affect fairness?
◎ RQ2: To what extent can the user models produced by the analysed
state-of-the-art GNNs be defined as fair?
◎ RQ3: Are disparate impact metrics enough to assess the fairness of GNN-based
models in behavioural user profiling tasks or is disparate mistreatment needed
to fully assess the presence of unfairness?
Research questions
13. Analysed models
CatGCN
Weijian Chen, Fuli Feng, Qifan Wang, Xiangnan He, Chonggang Song, Guohui Ling, and Yongdong Zhang. CatGCN: Graph
Convolutional Networks with Categorical Node Features. IEEE Transactions on Knowledge and Data Engineering (2021).
14. Analysed models
RHGN
Qilong Yan, Yufeng Zhang, Qiang Liu, Shu Wu, and Liang Wang. Relation-aware Heterogeneous Graph for User Profiling.
In Proceedings of the 30th ACM International Conference on Information & Knowledge Management (CIKM’21).
15. Algorithmic fairness metrics adopted
Disparate impact
Refers to a form of indirect and often unintentional discrimination that occurs
when practices or systems appear to treat people the same way.
◎ Statistical parity (SP): ΔSP = |P(ŷ = 1 | s = 0) − P(ŷ = 1 | s = 1)|
◎ Equal opportunity (EO): ΔEO = |P(ŷ = 1 | y = 1, s = 0) − P(ŷ = 1 | y = 1, s = 1)|
◎ Overall accuracy equality (OAE): ΔOAE = |P(ŷ = 0 | y = 0, s = 0) + P(ŷ = 1 | y = 1, s = 0) − P(ŷ = 0 | y = 0, s = 1) − P(ŷ = 1 | y = 1, s = 1)|
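As a rough illustration (not the paper's evaluation code), these disparate impact metrics can be computed from a model's binary test predictions. The sketch below assumes hypothetical NumPy arrays y_true, y_pred and a binary sensitive attribute s:

```python
import numpy as np

def rate(y_pred, cond, value=1):
    # P(prediction == value | cond): share of predictions equal to `value`
    # among the samples selected by the boolean mask `cond`.
    return np.mean(y_pred[cond] == value) if cond.any() else 0.0

def delta_sp(y_pred, s):
    # Statistical parity: |P(y_hat=1 | s=0) - P(y_hat=1 | s=1)|
    return abs(rate(y_pred, s == 0) - rate(y_pred, s == 1))

def delta_eo(y_true, y_pred, s):
    # Equal opportunity: |P(y_hat=1 | y=1, s=0) - P(y_hat=1 | y=1, s=1)|
    return abs(rate(y_pred, (y_true == 1) & (s == 0))
               - rate(y_pred, (y_true == 1) & (s == 1)))

def delta_oae(y_true, y_pred, s):
    # Overall accuracy equality: difference between the groups' summed
    # true-negative and true-positive rates.
    def group_terms(group):
        return (rate(y_pred, (y_true == 0) & group, value=0)
                + rate(y_pred, (y_true == 1) & group, value=1))
    return abs(group_terms(s == 0) - group_terms(s == 1))
```

For all three metrics, a value close to 0 indicates similar treatment of the two demographic groups.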
16. Algorithmic fairness metrics adopted
Disparate mistreatment
Considers the misclassification rates of user groups with different values of
the sensitive attribute, rather than their correct predictions.
◎ Treatment equality (TE): ΔTE = |P(ŷ = 1 | y = 0, s = 0) / P(ŷ = 0 | y = 1, s = 0) − P(ŷ = 1 | y = 0, s = 1) / P(ŷ = 0 | y = 1, s = 1)|
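Treatment equality can be sketched in the same spirit; the toy function below assumes the same hypothetical arrays and that each group has at least one false negative, so the ratios are defined:

```python
import numpy as np

def delta_te(y_true, y_pred, s):
    # Treatment equality: absolute difference between the groups'
    # false-positive-rate / false-negative-rate ratios.
    def fp_fn_ratio(group):
        fpr = np.mean(y_pred[(y_true == 0) & group] == 1)  # P(y_hat=1 | y=0, group)
        fnr = np.mean(y_pred[(y_true == 1) & group] == 0)  # P(y_hat=0 | y=1, group)
        return fpr / fnr
    return abs(fp_fn_ratio(s == 0) - fp_fn_ratio(s == 1))
```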
17. Datasets
Alibaba contains click-through-rate data about ads displayed on Alibaba's Taobao platform, and was used for evaluation in both the original CatGCN and RHGN papers.
JD consists of users and items from the retail company of the same name, connected by click and purchase relationships; it was already used in the original RHGN paper.
18. ◎ For RQ1: Considered disparate impact metrics and compared their scores
between the two GNNs to measure models’ discrimination level and evaluate
the impact of the different user profiling paradigms on fairness.
◎ For RQ2: Contextualised the values of SP and EO with the results of FairGNN, a
recent framework focusing on predicting fair results for user profiling.
◎ For RQ3: Extended our fairness evaluation to consider TE and determine to
what extent the GNN models discriminate against users from the perspective of
disparate mistreatment, in comparison to disparate impact.
Experimental setting
19. Results and discussion
Observation 1. RHGN's ability to represent users through multiple interaction modelling yields better fairness scores than a model relying only on binary associations between users and items, such as CatGCN.
Observation 2. Even though RHGN proves to be a fairer model than CatGCN, a debiasing process is needed for both GNNs.
20. Results and discussion
Observation 3. In scenarios where the
correctness of a decision on the target label
w.r.t. the sensitive attributes is not well
defined, or where there is a high cost for
misclassified instances, a complete fairness
assessment should always take into account
disparate mistreatment evaluation.
21. ◎ Conducted, for the first time in the literature, a fairness assessment of GNN-based
behavioural user profiling models.
◎ Our analysis of two state-of-the-art models on real-world datasets, covering
four algorithmic fairness metrics, showed that directly modelling raw user
interactions with a platform hurts a demographic group, which gets misclassified
more often than its counterpart.
◎ In future work, we will explore other aspects of fairness analysis in this context,
e.g. multi-class and multi-group perspectives, as well as introduce novel
procedures for bias mitigation.
Conclusion and future directions
22. Thanks!
I am Erasmo Purificato
You can find me at:
https://erasmopurif.com
@erasmopurif11
erasmo.purificato@ovgu.de
Editor's Notes
CatGCN is a Graph Convolutional Network (GCN) model tailored for graph learning on categorical node features. This model enhances the initial node representation by integrating two types of explicit interaction modelling into its learning process: a local multiplication-based interaction on each pair of node features and a global addition-based interaction on an artificial feature graph. The proposed method shows the effectiveness of performing feature interaction modelling before graph convolution.
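For intuition only, a multiplication-based pairwise interaction over a user's categorical feature embeddings can be sketched with the factorisation-machine identity; the snippet below is a generic illustration of that idea, not CatGCN's exact operator:

```python
import torch

def pairwise_multiplicative_interaction(field_embeddings):
    # field_embeddings: [num_fields, dim] tensor, one embedding per categorical feature.
    # FM identity: sum_{i<j} e_i * e_j = 0.5 * ((sum_i e_i)^2 - sum_i e_i^2), element-wise.
    summed_then_squared = field_embeddings.sum(dim=0) ** 2
    squared_then_summed = (field_embeddings ** 2).sum(dim=0)
    return 0.5 * (summed_then_squared - squared_then_summed)
```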
RHGN is a Relation-aware Heterogeneous Graph Network designed to model multiple relations on a heterogeneous graph between different kinds of entities. The core parts of this model are a transformer-like multi-relation attention, used to learn the node importance and uncover the meta-relation significance on the graph, and a heterogeneous graph propagation network employed to gather information from multiple sources. This approach outperforms several GNN-based models on user profiling tasks.
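Again purely as intuition, relation-aware aggregation on a heterogeneous graph can be sketched as one learned transformation plus an attention weight per relation type; the toy module below (plain PyTorch, hypothetical inputs) is not RHGN's actual architecture:

```python
import torch
import torch.nn.functional as F

class RelationAwareAggregation(torch.nn.Module):
    # Toy relation-aware aggregation: each relation type has its own linear
    # transformation, and attention scores weight the relations' contributions.
    def __init__(self, dim, num_relations):
        super().__init__()
        self.rel_transform = torch.nn.ModuleList(
            [torch.nn.Linear(dim, dim) for _ in range(num_relations)])
        self.attn = torch.nn.Linear(dim, 1)

    def forward(self, user_vec, msgs_per_relation):
        # msgs_per_relation: list of [dim] tensors, e.g. the mean embedding of
        # the neighbours reachable through each relation type.
        msgs = torch.stack([t(m) for t, m in zip(self.rel_transform, msgs_per_relation)])
        weights = F.softmax(self.attn(msgs), dim=0)      # relation importance
        return user_vec + (weights * msgs).sum(dim=0)    # updated user model
```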
Disparate impact concerns situations where the model disproportionately discriminates against certain groups, even if the model does not explicitly employ the sensitive attribute to make predictions but rather relies on some proxy attributes. This is exactly what happens in the analysed GNNs, where the user models are created by aggregating information from neighbours, and the sensitive attribute is not explicitly taken into consideration during classification. The notion of disparate impact is beneficial when there is not a clear linkage in the training data between the predicted label and the sensitive attribute, i.e. it is hard to define the validity of a decision for a group member based on the historical data.
Statistical parity (or demographic parity) defines fairness as an equal probability for each group of being assigned to the positive class, i.e. predictions are independent of the sensitive attribute.
Equal opportunity requires that the probability of a subject in the positive class being classified with the positive outcome is equal for each group.
Overall accuracy equality defines fairness as the equal probability of a subject from either the positive or negative class being assigned to its respective class, i.e. each group should have the same prediction accuracy.
In a scenario where it is hard to define the correctness of a prediction related to sensitive attribute values, we argue that a complete fairness assessment should always include the perspective of disparate mistreatment.
Furthermore, the notion of disparate mistreatment is significant in contexts where misclassification costs depend on the group affected by the error.
Treatment equality requires the ratio of errors made by the classifier to be equal across different groups, i.e. each group should have the same ratio between false negatives (FN) and false positives (FP).
Here you can find my contacts. Thanks for your attention.