1. Jin-Woo Jeong
Network Science Lab
Dept. of Mathematics
The Catholic University of Korea
E-mail: zeus0208b@catholic.ac.kr
Mingdong Ou, Peng Cui, Jian Pei, Ziwei Zhang, Wenwu Zhu
2. 1
INTRODUCTION
• Motivation
• Introduction
HIGH-ORDER PROXIMITY PRESERVED EMBEDDING
• Problem Definition
• Notation
• High order proximities
• Approximation of High-Order Proximity
EXPERIMENTS
• Experiments Setting
• High-Order Proximity Approximation
• Graph Reconstruction
• Link Prediction
• Vertex Recommendation
CONCLUSION
Q/A
3. 2
INTRODUCTION
Motivation
Most existing graph embedding methods target undirected graphs.
Undirected graph embedding methods cannot be applied directly to directed graphs, because directed graphs
have a fundamentally different characteristic: asymmetric transitivity.
Preserving the asymmetric transitivity of directed graphs in a vector space is much more challenging.
4. 3
INTRODUCTION
Introduction
In this paper, we tackle the challenging problem of asymmetric transitivity preserving graph embedding.
Our main idea is to learn two embedding vectors for each node, a source vector and a target vector, to
capture asymmetric edges, as illustrated in Figure 2.
• We propose a high-order proximity preserved embedding (HOPE) method.
• We derive a general form covering multiple commonly used high-order proximities, enabling the scalable solution of HOPE with generalized SVD.
• We provide an upper bound on the approximation error of HOPE.
• Extensive experiments are conducted to verify the usefulness and generality of the learned embedding in various applications.
5. 4
HIGH-ORDER PROXIMITY PRESERVED EMBEDDING
Notations
• $G = \{V, E\}$
• $V = \{v_1, \cdots, v_i, \cdots, v_N\}$, where $N$ is the number of vertices
• $E$ is the directed edge set. $e_{ij} = (v_i, v_j) \in E$ represents a directed edge from $v_i$ to $v_j$.
• $A$ is the adjacency matrix
• $S$ is a high-order proximity matrix, where $S_{ij}$ is the proximity between $v_i$ and $v_j$
• $U = [U^s, U^t]$ is the embedding matrix, where the $i$-th row, $u_i$, is the embedding vector of $v_i$
• $U^s, U^t \in \mathbb{R}^{N \times K}$ are the source embedding vectors and target embedding vectors respectively, where $K$ is the embedding dimension.
6. 5
HIGH-ORDER PROXIMITY PRESERVED EMBEDDING
Problem Definition
As high-order proximities are derived from asymmetric transitivity, we propose to preserve the asymmetric
transitivity by approximating high-order proximity. Formally, we adopt the L2-norm below as the loss
function, which needs to be minimized:
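Written out with the notation above, the loss can be sketched as follows. This is a reconstruction that assumes the norm is the Frobenius norm of the residual between $S$ and the product of the source and target embedding matrices, which is the form a rank-$K$ approximation of $S$ would take:

```latex
\min_{U^s,\,U^t} \; \left\lVert S - U^s \, {U^t}^{\top} \right\rVert_F^2 \tag{1}
```

Under this form, the predicted proximity from $v_i$ to $v_j$ is the inner product of the source vector of $v_i$ with the target vector of $v_j$, so the score for $(i, j)$ need not equal the score for $(j, i)$.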
7. 6
HIGH-ORDER PROXIMITY PRESERVED EMBEDDING
High order proximities
Many high-order proximity measurements on graphs can reflect asymmetric transitivity. Moreover, we
found that many of them share a general formulation, which facilitates the approximation of these
proximities, that is:
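A sketch of that general formulation, assuming $M_g$ and $M_l$ denote polynomials of the adjacency matrix $A$ or of a normalized variant of it (the symbol names are assumptions, since the formula itself is not in the text here):

```latex
S = M_g^{-1} \, M_l
```

For example, Rooted PageRank fits this pattern with $M_g = I - \alpha P$ and $M_l = (1 - \alpha) I$, and the Katz index fits it with $M_g = I - \beta A$ and $M_l = \beta A$ for a decay parameter $\beta$.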
Global proximities
Local proximities
9. 8
HIGH-ORDER PROXIMITY PRESERVED EMBEDDING
High order proximities
Rooted PageRank (RPR)
• $\alpha \in [0, 1)$: probability of randomly walking to a neighbor
• $P$: probability transition matrix satisfying $\sum_{i=1}^{N} P_{ij} = 1$.
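To make the two parameters concrete, here is a minimal dense sketch of Rooted PageRank in Python with NumPy. It assumes the closed form $S^{RPR} = (1 - \alpha)(I - \alpha P)^{-1}$ and builds $P$ by column-normalizing the adjacency matrix so that each column sums to 1, as stated above. The helper names `transition_matrix` and `rooted_pagerank` and the toy graph are illustrative assumptions, not part of the original slides.

```python
import numpy as np

def transition_matrix(A):
    """Column-normalize A so every nonzero column sums to 1 (assumed convention)."""
    A = np.asarray(A, dtype=float)
    col_sums = A.sum(axis=0)
    # Leave all-zero columns untouched instead of dividing by zero.
    safe = np.where(col_sums > 0, col_sums, 1.0)
    return A / safe

def rooted_pagerank(A, alpha=0.5):
    """Dense RPR proximity: S = (1 - alpha) * (I - alpha * P)^{-1}."""
    P = transition_matrix(A)
    n = P.shape[0]
    M_g = np.eye(n) - alpha * P          # the part that gets inverted
    M_l = (1.0 - alpha) * np.eye(n)      # the part that does not
    # Solve M_g S = M_l rather than forming the inverse explicitly.
    return np.linalg.solve(M_g, M_l)

# Toy directed graph with edges 0->1, 0->2, 1->2, 2->0 (hypothetical data).
A = np.array([[0, 1, 1],
              [0, 0, 1],
              [1, 0, 0]])
S = rooted_pagerank(A, alpha=0.5)
```

The resulting `S` is not symmetric, which is the property the embedding is meant to preserve. A dense solve like this is only practical for small graphs.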
12. 11
HIGH-ORDER PROXIMITY PRESERVED EMBEDDING
Approximation of High-Order Proximity
The objective in Equation (1) aims to find an optimal rank-K approximation of the proximity matrix 𝑆.
The direct solution is to perform SVD (Singular Value Decomposition) on 𝑆 and use the largest K singular values
and the corresponding singular vectors to construct the optimal embedding vectors.
This solution is not feasible for large-scale graphs because of its time complexity.
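The direct SVD solution described above can be sketched as follows. This is the naive dense approach, not the scalable generalized SVD that HOPE uses. It assumes the embeddings are formed by scaling the top-K left and right singular vectors by the square roots of the singular values. The function name `svd_embedding` and the random matrix standing in for $S$ are illustrative assumptions.

```python
import numpy as np

def svd_embedding(S, K):
    """Rank-K source/target embeddings from a dense SVD of S."""
    U, sigma, Vt = np.linalg.svd(S, full_matrices=False)
    root = np.sqrt(sigma[:K])
    U_s = U[:, :K] * root        # source vectors, one row per vertex
    U_t = Vt[:K, :].T * root     # target vectors, one row per vertex
    return U_s, U_t, sigma

# Random asymmetric matrix as a stand-in for a proximity matrix (hypothetical data).
rng = np.random.default_rng(0)
S = rng.random((6, 6))
U_s, U_t, sigma = svd_embedding(S, K=2)

# Squared Frobenius error of the rank-K reconstruction.
err = np.linalg.norm(S - U_s @ U_t.T, "fro") ** 2
```

By the Eckart-Young theorem, `err` equals the sum of the squared singular values that were discarded. A dense SVD of an N×N matrix scales roughly cubically in N, and $S$ itself is typically dense even when the adjacency matrix is sparse, so this approach breaks down on large graphs.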
15. 14
EXPERIMENTS
Experiments Setting
Datasets
• Synthetic Data (Syn): a small generated dataset.
• Cora: a citation network of academic papers.
Vertexes: academic papers
Directed Edges: the citation relationships between papers
• Twitter Social Network (SN-Twitter): a subnetwork of Twitter.
Vertexes: users of Twitter
Directed Edges: following relationships between users
• Tencent Weibo Social Network (SN-TWeibo): a subnetwork of the social network in Tencent Weibo.
Vertexes: users
Directed Edges: following relationships between users
The datasets are listed in order of size, from small to large.
17. 16
EXPERIMENTS
Experiments Setting
Evaluation Metrics
• RMSE : used to evaluate the approximation error of the proximity approximation algorithms.
• NRMSE (Normalized RMSE) : used to evaluate the relative error of the proximity approximation algorithms.
• Precision@k : used to evaluate the performance of link prediction
• MAP : used to evaluate the performance of vertex recommendation
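The four metrics above can be sketched in Python as follows. The exact definitions used in the experiments are not given here, so this assumes common conventions: RMSE averages over all matrix entries, NRMSE normalizes the Frobenius error by the Frobenius norm of $S$, Precision@k is the fraction of the top-k ranked items that are relevant, and MAP is the mean of per-query average precision. All function names are illustrative.

```python
import numpy as np

def rmse(S, S_hat):
    """Root mean squared error over all entries of the proximity matrix."""
    S, S_hat = np.asarray(S, dtype=float), np.asarray(S_hat, dtype=float)
    return np.sqrt(np.mean((S - S_hat) ** 2))

def nrmse(S, S_hat):
    """Frobenius error relative to the Frobenius norm of S (assumed normalization)."""
    S, S_hat = np.asarray(S, dtype=float), np.asarray(S_hat, dtype=float)
    return np.linalg.norm(S - S_hat) / np.linalg.norm(S)

def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k ranked items that are relevant."""
    relevant = set(relevant)
    hits = sum(1 for item in ranked[:k] if item in relevant)
    return hits / k

def average_precision(ranked, relevant):
    """Mean of the precision values at each rank where a relevant item appears."""
    relevant = set(relevant)
    hits, total = 0, 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0

def mean_average_precision(rankings, relevants):
    """Average precision averaged over queries (for example, over vertices)."""
    return float(np.mean([average_precision(r, rel)
                          for r, rel in zip(rankings, relevants)]))
```

For link prediction, `ranked` would be candidate edges sorted by predicted proximity and `relevant` the held-out edges. For vertex recommendation, each query is a vertex and `ranked` is its list of recommended vertices.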
21. 20
CONCLUSION
Conclusion
We propose a scalable approximation algorithm called High-Order Proximity preserved Embedding
(HOPE). In this algorithm, we first derive a general formulation of a class of high-order proximity
measurements, then apply generalized SVD to the general formulation, whose time complexity is linear
in the size of the graph.
The empirical study demonstrates the superiority of asymmetric transitivity and our proposed algorithm,
HOPE.