NS-CUK Seminar:H.B.Kim, Review on "Asymmetric transitivity preserving graph embedding", KDD 2016

Ho-Beom Kim
Network Science Lab
Dept. of Mathematics
The Catholic University of Korea
E-mail: hobeom2001@catholic.ac.kr
2023 / 06 / 26
OU, Mingdong, et al.

2
 Introduction
• Contributions
 Methodology
 Related work
 Experiments
 Results
 Discussion
 Conclusion

3
1. Introduction
Problem Statements
• Nowadays, more and more applications are based on larger and larger networks and graphs.
• Most of the existing graph embedding methods target on undirected graphs
• We can not straightforwardly apply undirected graph embedding methods on directed grpahs.

4
1. Introduction
Contributions
• Propose a high-order proximity preserved embedding (HOPE) method to solve the challenging
problem of asymmetric transitivity in directed graph embedding
• Derive a general form covering multiple commonly used high-order proximities, enabling the scalable
solution of HOPE with generalized SVD
• Provide an upper bound on the approximation error of HOPE
• Extensive experiments are conducted to verify the usefulness and generality of the learned embedding
in various applications

5
1. Introduction
Framework of Asymmetric Transitivity Preserving Graph Embedding

6
2. Related Work
Graph Embedding
• Propose a high-order proximity preserved embedding (HOPE) method to solve the challenging
problem of asymmetric transitivity in directed graph embedding
• Derive a general form covering multiple commonly used high-order proximities, enabling the scalable
solution of HOPE with generalized SVD
• Provide an upper bound on the approximation error of HOPE
• Extensive experiments are conducted to verify the usefulness and generality of the learned embedding
in various applications

7
3. High-Order Proximity Preserved Embedding
Notations
• 𝐴 ∶ 𝑎𝑑𝑗𝑎𝑐𝑒𝑛𝑐𝑦 𝑚𝑎𝑡𝑟𝑖𝑥
• 𝑠𝑖𝑗 ∶ 𝑝𝑟𝑜𝑥𝑖𝑚𝑖𝑡𝑦 𝑏𝑒𝑡𝑤𝑒𝑒𝑛 𝑣𝑖, 𝑣𝑗
• 𝑈 = 𝑈𝑠, 𝑈𝑡 : 𝑒𝑚𝑏𝑒𝑑𝑑𝑖𝑛𝑔 𝑚𝑎𝑡𝑟𝑖𝑥, 𝑈𝑠 ∶ 𝑠𝑜𝑢𝑟𝑐𝑒 𝑒𝑚𝑏𝑒𝑑𝑑𝑖𝑛𝑔 𝑣𝑒𝑐𝑡𝑜𝑟𝑠, 𝑈𝑡: 𝑡𝑎𝑟𝑔𝑒𝑡 𝑒𝑚𝑏𝑒𝑑𝑑𝑖𝑛𝑔 𝑣𝑒𝑐𝑡𝑜𝑟𝑠
• 𝐾 ∶ 𝑒𝑚𝑏𝑒𝑑𝑑𝑖𝑛𝑔 𝑑𝑖𝑚𝑒𝑛𝑠𝑖𝑜𝑛𝑠

8
Problem Definition
• m𝑖𝑛 ∥ 𝑆 − 𝑈𝑠
∙ 𝑈𝑡𝑇
∥𝐹
2
: loss function

9
High order proximities
• 𝑆 = 𝑀𝑔
−1
∙ 𝑀𝑙
• Katz Index : An ensemble of all paths, which is a weighted summation over the path set between two
vertexes
• Weight of a path : exponential function of its length
• 𝑆𝐾𝑎𝑡𝑧
= 𝑙=1
∞
𝛽 ∙ 𝐴𝑙
= 𝛽 ∙ 𝐴 ∙ 𝑆𝐾𝑎𝑡𝑧
+ 𝛽 ∙ 𝐴
• 𝛽 : decay parameter
= (𝐼 − 𝛽 ∙ A)
−1
∙ 𝛽 ∙A
• 𝑀𝑔 = 𝐼 − 𝛽 ∙ A
• 𝑀𝑙 = 𝛽 ∙ A

10
• 𝑆 = 𝑀𝑔
−1
∙ 𝑀𝑙
• 𝑀𝑔, 𝑀𝑙 : polynomial matrices
• Katz Index : An ensemble of all paths, which is a weighted summation over the path set between two
vertexes
• Weight of a path : exponential function of its length
• 𝑆𝐾𝑎𝑡𝑧 = 𝑙=1
∞
𝛽 ∙ 𝐴𝑙 = 𝛽 ∙ 𝐴 ∙ 𝑆𝐾𝑎𝑡𝑧 + 𝛽 ∙ 𝐴
• 𝛽 : decay parameter
= (𝐼 − 𝛽 ∙ A)
−1
∙ 𝛽 ∙A
• 𝑀𝑔 = 𝐼 − 𝛽 ∙ A
• 𝑀𝑙 = 𝛽 ∙ A

11
• Rooted PageRank (RPR)
• 𝑆𝑖𝑗
𝑅𝑃𝑅
∶ 𝑡ℎ𝑒 𝑝𝑟𝑜𝑏𝑎𝑏𝑖𝑙𝑖𝑡𝑦 𝑡ℎ𝑎𝑡 𝑎 𝑟𝑎𝑛𝑑𝑜𝑚 𝑤𝑎𝑙𝑘 𝑓𝑟𝑜𝑚 𝑛𝑜𝑑𝑒 𝑣𝑖 𝑤𝑖𝑙𝑙 𝑙𝑜𝑐𝑎𝑡𝑒 𝑎𝑡 𝑣𝑗𝑖𝑛 𝑡ℎ𝑒 𝑠𝑡𝑒𝑎𝑑𝑦 𝑠𝑡𝑎𝑡𝑒
• 𝛼 : jumps to one of the neighbor of current node with probability , jumps back probability
• 𝑃 ∶ 𝑝𝑟𝑜𝑏𝑎𝑏𝑖𝑙𝑖𝑡𝑦 𝑡𝑟𝑎𝑛𝑠𝑖𝑡𝑖𝑜𝑛 𝑚𝑎𝑡𝑟𝑖𝑥 𝑠𝑎𝑡𝑖𝑠𝑓𝑦𝑖𝑛𝑔 𝑖=1
𝑁
𝑃𝑖𝑗 = 1
• 𝑆𝑖𝑗
𝑅𝑃𝑅
= 1 − 𝛼 ∙ 𝐼 − 𝛼𝑃 −1
• 𝑀𝑔 = 𝐼 − 𝛼𝑃
• 𝑀𝑙 = (1 − 𝛼) ∙ I

12
• Common Neighbors (CN)
• 𝑆𝑖𝑗
𝐶𝑁
∶ 𝐶𝑜𝑢𝑛𝑡𝑠 𝑡ℎ𝑒 𝑛𝑢𝑚𝑏𝑒𝑟 𝑜𝑓 𝑣𝑏𝑒𝑟𝑡𝑒𝑥𝑒𝑠 𝑐𝑜𝑛𝑛𝑒𝑐𝑡𝑖𝑛𝑔 𝑡𝑜 𝑏𝑜𝑡ℎ 𝑣𝑖𝑎𝑛𝑑 𝑣𝑗
: The number of vertexes which is target of an edge from 𝑣𝑖 and the source of an edge to 𝑣𝑗
• 𝑆𝐶𝑁
= 𝐴2
• 𝑀𝑔 = 𝐼
• 𝑀𝑙 = 𝐴2

13
• Adamic-Adar (AA)
• AA is a variant of common neighbors.
• AA assigns each neighbor a weight, that is the reciprocal of the degree of the neighbor
• The more vertexes one vertex connected to, the less important it is on evaluating the proximity
between a pair of vertex
• 𝑆𝐴𝐴
= 𝐴 ∙D ∙A
• D : diagonal matrix
• 𝐷𝑖𝑖 = 1/ 𝑗(𝐴𝑖𝑗 + 𝐴𝑗𝑖)
• 𝑀𝑔 = 𝐼
• 𝑀𝑙 = 𝐴 ∙ D ∙ A

14
Approximation of High-Order Proximity
• Objective Equation aims to find an optimal rank-K approximation of the proximity matrix S
• 𝑠 = 𝑖=1
𝑁
𝜎𝑖𝑉𝑖
𝑆
𝑉𝑖
𝑡𝑇
• Make the SVD on S very expensive, the solution is not feasible for large scale graphs

15
Approximation of High-Order Proximity
• Adopt a partial generalized SVD algorithm (JDGSVD) : an iterative Jacobi-Davidson type algorithm which is
very scalable when k<< N

16
Approximation Error
• {𝜎𝑖} 𝑎𝑟𝑒 𝑡ℎ𝑒 𝑠𝑖𝑛𝑔𝑢𝑙𝑎𝑟 𝑣𝑎𝑙𝑢𝑒𝑠 𝑜𝑓 𝑆 𝑖𝑛 𝑑𝑒𝑠𝑐𝑒𝑛𝑑 𝑜𝑟𝑑𝑒𝑟
• The lower the rank of S is, the smaller the error is

17
4. Experiments
Datasets
• Synthetic Data (Syn)
• Generate the synthetic data by the forest fire model
• Can generate powerlaw graphs
• As the time complexity of computation of some high-order proximities (e.g. Katz, RPR) is too
high, we generate small-scaled synthetic data to allow for the computation of high- order
proximities
• Randomly generate the synthetic datasets
• Cora
• Citation network of academic papers
• Vertexes : academic papers, Directed edges : the citation relationship between papers

18
4. Experiments
Datasets
• Twitter Social Network (SN-Twitter)
• A subnetwork of Twitter
• Vertexes : users of Twitter, Directed edges : following relationships between users
• Tencent Weibo Social Network (SN-Tweibo)
• Contains a subnetwork of the social network in Tencent Weibo, a Twitter-style social network in
China
• Vertexes : Users, Directed edges : following relationship between users

19
4. Experiments
Baseline Methods
• LINE : Preserves the first-order and second-order proximity between vertexes.
• DeepWalk : Randomly walks on the graph, and assumes a pair of vertexes similar if they are close in
the random path. Then, the embedding is learned to preserve these pairwise similarities in the
embedding
• PPE(Partial Proximity Embedding) : First selects a small subset of vertexes as landmarks, and learns
the embedding by approximating the proximity between vertexes and landmarks
• Common Neighbors : Rank the links by the number of common neighbors. Use it for link prediction
and vertex recommendation
• Adamic-Adar : Rank the links by the Adamic-Adar values Use it for link prediction and vertex
recommendation

20
4. Experiments
Baseline Methods

21
4. Experiments
Evaluation Metrics

22
4. Experiments
High-order Proximity Approximation

23
4. Experiments
Graph Reconstruction

24
4. Experiments

25
4. Experiments

26
4. Experiments
Conclusion
• we aim to preserve asymmetric transitivity in directed graphs, and propose to preserve asymmetric
transitivity by approximating high-order proximities.
• We propose a scalable approximation algorithm , called High- Order Proximity preserved Embedding
(HOPE).
• we first derive a general formulation of a class of high-order proximity measurements, then apply
generalized SVD to the general formulation, whose time complexity is linear with the size of graph.
• The empirical study demon- strates the superiority of asymmetric transitivity and our proposed
algorithm, HOPE.
• Our future direction is to de- velop a nonlinear model to better capture the complex struc- ture of
directed graphs.

NS-CUK Seminar:H.B.Kim, Review on "Asymmetric transitivity preserving graph embedding", KDD 2016

Recommended

Recommended

More Related Content

Similar to NS-CUK Seminar:H.B.Kim, Review on "Asymmetric transitivity preserving graph embedding", KDD 2016

Similar to NS-CUK Seminar:H.B.Kim, Review on "Asymmetric transitivity preserving graph embedding", KDD 2016 (20)

More from ssuser4b1f48

More from ssuser4b1f48 (20)

Recently uploaded

Recently uploaded (20)

NS-CUK Seminar:H.B.Kim, Review on "Asymmetric transitivity preserving graph embedding", KDD 2016

Editor's Notes