This document summarizes and compares techniques for thresholding weighted networks to uncover their core structure: global thresholding, dyadic thresholding, and ego-centric thresholding. The results show that ego-centric thresholding eliminated the most ties at a given significance level and made the most sense for the dataset of website browsing data. Global thresholding was the weakest of the three options. The choice between dyadic and ego-centric thresholding depends on which null model best fits the particular context.
What Counts As A Weak Tie?
1. WHAT COUNTS AS A WEAK TIE?
A Comparison of Thresholding Techniques to Analyze Weighted Networks
Subhayan Mukerjee and Sandra González-Bailón
68th Conference of the International Communication Association, Prague, 2018
3. TYPES OF THRESHOLDING
Global Thresholding
(Borge-Holthoefer and González-Bailón, 2017)
- Normalize weights to [0,1]
- Set a threshold τ ∈ [0,1]
- Remove edges with normalized weight ≤ τ
- Progressively increase τ
e.g. De Choudhury, Mason, Hofman, & Watts, 2010
Eguíluz, Chialvo, Cecchi, Baliki, & Apkarian, 2005
Allesina, Bodini, & Bondavalli, 2006
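The global thresholding steps listed above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `global_threshold`, the edge-list format `(u, v, weight)`, and the choice to normalize by dividing by the maximum weight are all assumptions, since the slides only say that weights are normalized to [0,1].

```python
def global_threshold(edges, tau):
    """Keep edges whose normalized weight is strictly greater than tau.

    Assumption: weights are positive and are normalized by dividing by
    the largest weight in the network, so normalized weights lie in (0, 1].
    """
    max_w = max(w for _, _, w in edges)
    return [(u, v, w) for u, v, w in edges if w / max_w > tau]


# Hypothetical toy network: (node, node, weight)
edges = [("a", "b", 10), ("a", "c", 4), ("b", "c", 1), ("c", "d", 8)]

# Progressively increase tau and record how many edges survive each step.
surviving = {tau: len(global_threshold(edges, tau))
             for tau in (0.0, 0.25, 0.5, 0.75)}
```

Because a single cutoff is applied to every edge, the edge between `b` and `c` is dropped at the first increase of τ regardless of how important it is locally to either node.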
Dyadic Thresholding
(Ronen, Gonçalves, Hu, Vespignani, Pinker &
Hidalgo, 2014)
- Is a tie between two nodes
stronger than what is expected
by chance?
$$\phi_{ij} = \frac{D_{ij} N - A_i A_j}{\sqrt{A_i A_j (N - A_i)(N - A_j)}}$$
$$t = \frac{\phi_{ij} \sqrt{\max(A_i, A_j) - 2}}{\sqrt{1 - \phi_{ij}^2}}$$
e.g. Majó-Vázquez, Cardenal, & González-Bailón, 2017
Mukerjee, Majó-Vázquez, & González-Bailón, 2018
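A minimal Python sketch of the dyadic test follows. It assumes that D_ij is the number of users visiting both sites i and j (cross-visiting), that A_i and A_j are the audience reach of each site, and that N is the total number of users; the slides do not define these symbols explicitly. The function names and the cutoff `t_crit = 2.58` (roughly the two-sided 1% critical value for large samples) are illustrative choices, not taken from the source.

```python
import math


def phi_coefficient(d_ij, a_i, a_j, n):
    """Phi correlation between visiting site i and visiting site j."""
    numerator = d_ij * n - a_i * a_j
    denominator = math.sqrt(a_i * a_j * (n - a_i) * (n - a_j))
    return numerator / denominator


def t_statistic(phi, a_i, a_j):
    """t-statistic for phi, using max(a_i, a_j) - 2 degrees of freedom."""
    if abs(phi) >= 1.0:
        # Perfect correlation: the denominator vanishes.
        return math.copysign(math.inf, phi)
    return phi * math.sqrt(max(a_i, a_j) - 2) / math.sqrt(1.0 - phi ** 2)


def keep_tie(d_ij, a_i, a_j, n, t_crit=2.58):
    """Keep the tie only if it is stronger than expected by chance."""
    phi = phi_coefficient(d_ij, a_i, a_j, n)
    return t_statistic(phi, a_i, a_j) > t_crit


# Hypothetical numbers: 1000 users, reach 200 and 300, 100 shared visitors.
phi = phi_coefficient(100, 200, 300, 1000)
t = t_statistic(phi, 200, 300)
```

If the two sites were visited independently, the expected overlap would be 200 × 300 / 1000 = 60 users, which gives φ = 0. The 100 shared visitors in the example exceed that baseline, so φ is positive.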
Ego-centric thresholding (disparity filter)
(Serrano, Boguñá, & Vespignani, 2009)
- Eliminates edges without destroying the multi-scale nature of real-world networks
- Compares normalized weights to a simulated network built by randomly assigning normalized weights from a uniform distribution
$$s_i = \sum_j w_{ij}$$
$$p_{ij} = \frac{w_{ij}}{s_i}$$
$$\alpha_{ij} = (1 - p_{ij})^{k-1}$$
e.g. Ahn, Ahnert, Bagrow, & Barabási, 2011
Del Vicario, Zollo, Caldarelli, Scala, & Quattrociocchi, 2017
Olson & Neal, 2015
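The disparity filter can be sketched as follows for an undirected weighted edge list. This is an illustration under stated assumptions rather than the authors' code: the function name `disparity_filter` and the edge format are hypothetical, an edge is kept if it is significant from the point of view of at least one of its two endpoints, and a node with a single edge is treated as unable to make that edge significant on its own.

```python
from collections import defaultdict


def disparity_filter(edges, alpha):
    """Keep edges whose alpha_ij is below the significance level alpha."""
    strength = defaultdict(float)  # s_i: sum of weights around node i
    degree = defaultdict(int)      # k: number of edges around node i
    for u, v, w in edges:
        for node in (u, v):
            strength[node] += w
            degree[node] += 1

    def alpha_ij(node, w):
        k = degree[node]
        if k < 2:
            # Assumption: a degree-1 node cannot test its only edge.
            return 1.0
        p_ij = w / strength[node]
        return (1.0 - p_ij) ** (k - 1)

    kept = []
    for u, v, w in edges:
        # Assumption: significance at either endpoint is enough to keep the edge.
        if min(alpha_ij(u, w), alpha_ij(v, w)) < alpha:
            kept.append((u, v, w))
    return kept


# Hypothetical hub "h" with one dominant tie and three weak ones.
edges = [("h", "a", 90), ("h", "b", 5), ("h", "c", 3), ("h", "d", 2)]
backbone = disparity_filter(edges, alpha=0.05)
```

For the hub in this toy example, the tie to `a` carries 90% of the hub's strength, so its α_ij is (1 − 0.9)³ = 0.001 and it survives, while the three weak ties have α_ij well above 0.05 and are removed.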
4. DATA
- Obtained from comScore
- Website browsing data in the US and the UK
- Two important statistics
- Audience reach (number of users visiting each website)
- Cross-visiting (number of users visiting each pair of sites)
7. CONCLUSIONS
1. The weight distribution greatly affects the results
2. Ego-centric thresholding eliminates the largest number of ties for a given significance level
3. In the context of this dataset, ego-centric thresholding makes the most sense (local disparities are high owing to the skew of degree and the heterogeneity of reach)
4. Global thresholding is the weakest of the options
5. Finally, the choice between dyadic and ego-centric thresholding depends on which null model makes more sense in a particular context.
Global thresholding is a simple and intuitive mechanism with widespread application, from network models of the human brain (Eguíluz, Chialvo, Cecchi, Baliki, & Apkarian, 2005) to ecological networks (Allesina, Bodini, & Bondavalli, 2006), but it has a few disadvantages. It penalizes the least connected nodes and the ones with the weakest connections, and it ignores the local structural features of the network.
Dyadic thresholding has been used in global language networks and in networks of online information consumption.
Ego-centric thresholding: in the null model, the normalized weights of a node with degree k are generated as follows. k − 1 pins are placed uniformly at random on the interval [0, 1], which divides it into k subintervals. The length of each subinterval represents the normalized weight of one link in the null model. For a given normalized weight $p_{ij}$, the p-value $\alpha_{ij}$ under this null model is
$\alpha_{ij} = (1 - p_{ij})^{k-1}$. That is, $\alpha_{ij}$ is the probability of observing a normalized weight larger than or equal to $p_{ij}$ under the null model. Given a significance level $\alpha$ between 0 and 1, any edge whose $\alpha_{ij}$ is larger than $\alpha$ is filtered out. The disparity filter has been used in the analysis of food flavor networks, in mapping social dynamics on Facebook, and in mapping user interests in social media.
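The pin construction of the null model can be checked numerically. The sketch below is only an illustration: the function name `simulate_alpha`, the number of trials, and the seed are arbitrary choices. It drops k − 1 uniform pins on [0, 1], measures the length of one resulting subinterval, and estimates how often that length is at least p, which should approach (1 − p)^(k−1).

```python
import random


def simulate_alpha(p, k, trials=100_000, seed=0):
    """Estimate P(null-model normalized weight >= p) for a node of degree k."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # k - 1 pins split [0, 1] into k subintervals.
        pins = sorted(rng.random() for _ in range(k - 1))
        bounds = [0.0] + pins + [1.0]
        gaps = [b - a for a, b in zip(bounds, bounds[1:])]
        # All subintervals share the same distribution, so inspect the first.
        if gaps[0] >= p:
            hits += 1
    return hits / trials


estimate = simulate_alpha(p=0.3, k=5)
exact = (1 - 0.3) ** (5 - 1)  # closed form from the null model
```

With these settings the simulated frequency and the closed-form value should agree to within sampling error, which is on the order of 0.001 for 100,000 trials.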