Link Prediction in (Partially)
Aligned Heterogeneous Social
Networks
By: Sina Sajadmanesh
Advisor: Dr. Hamid Reza Rabiee
2 DMLLink Prediction DMLDMLLink Prediction2
Outline
Introduction
 Problem Formulation
 Applications
Link Prediction in Homogeneous Networks
Link Prediction in Heterogeneous Networks
Link Prediction in Aligned Networks
Future Works
3 DMLLink Prediction DMLDMLLink Prediction3
Introduction
Problem
 Based on a snapshot of
network, predicting the set
of potential links to be
formed in the future is
formally defined as Link
Prediction Problem.
 First proposed by Liben-
Novel and Klienberg in
CIKM 2003
4 DMLLink Prediction DML4
Applications
Social networks and E-commerce
 Recommender systems
 Friend recommendation
 Bioinformatics
 Prioritization of candidate disease genes
 Drug discovery
 Security
 Identify missing links between criminals
 Controlling computer viruses
5 DMLLink Prediction DMLDMLLink Prediction5
Outline
Introduction
Link Prediction in Homogeneous Networks
 Unsupervised methods
 Supervised methods
Link Prediction in Heterogeneous Networks
Link Prediction in Aligned Networks
Future Works
6 DMLLink Prediction DMLDMLLink Prediction6
Link Prediction in Homogeneous Networks
Homogeneous Network
𝐺 = (𝑉, 𝐸)
 If V contains one single type nodes and E
contains one single type of links, then G is
a homogeneous network
7 DMLLink Prediction DMLDMLLink Prediction7
Link Prediction in Homogeneous Networks
Unsupervised methods
 Measuring the closeness among nodes
 Assuming that close nodes are more likely
to be connected
Unsupervised Link Predicators
 Local neighbor based predicators
 Path based predicators
 Random-Walk based predicators
8 DMLLink Prediction DMLDMLLink Prediction8
Unsupervised Methods
Local Neighbor based Link Predicators
 Preferential Attachment
𝑃𝐴 𝑢, 𝑣 = Γ(𝑢) Γ(𝑣)
user 𝑢 neighbor Γ(𝑢)
9 DMLLink Prediction DMLDMLLink Prediction9
Unsupervised Methods
Local Neighbor based Link Predicators
 Common Neighbor
𝐶𝑁 𝑢, 𝑣 = Γ(𝑢) ∩ Γ(𝑣)
user 𝑢 neighbor Γ(𝑢)
10 DMLLink Prediction DMLDMLLink Prediction10
Unsupervised Methods
Local Neighbor based Link Predicators
 Jaccard Coefficient
𝐽𝐶 𝑢, 𝑣 =
Γ(𝑢) ∩ Γ(𝑣)
Γ(𝑢) ∪ Γ(𝑣)
 Adamic/Adar
𝐴𝐴 𝑢, 𝑣 =
𝑤∈(Γ(𝑢)∩Γ(𝑣))
1
log( Γ(𝑤) )
 Resource Allocation
𝑅𝐴 𝑢, 𝑣 =
𝑤∈(Γ(𝑢)∩Γ(𝑣))
1
Γ(𝑤)
11 DMLLink Prediction DMLDMLLink Prediction11
Unsupervised Methods
Path based Link Predicators
 Shortest Path
𝑆𝑃 𝑢, 𝑣 = min{ 𝑃𝑢↝𝑣 }
 Katz Score
𝐾𝑎𝑡𝑧 𝑢, 𝑣 =
𝑙=1
∞
𝛽 𝑙
𝑃𝑢↝𝑣
𝑙
= (𝐼 − 𝛽𝐴)−1−𝐼
12 DMLLink Prediction DMLDMLLink Prediction12
Unsupervised Methods
Random-Walk based Link Predicators
 Hitting Time
𝐻𝑇 𝑢, 𝑣 = 1 +
𝑤∈Γ(𝑢)
𝑃𝑢,𝑤 𝐻𝑇(𝑤, 𝑣)
 Commute Time
𝐶𝑇 𝑢, 𝑣 = 𝐻𝑇 𝑢, 𝑣 + 𝐻𝑇(𝑣, 𝑢)
13 DMLLink Prediction DMLDMLLink Prediction13
Link Prediction in Homogeneous Networks
Other Unsupervised Methods
 Matrix Factorization methods
 Maximum Likelihood Methods
 Probabilistic Methods
14 DMLLink Prediction DMLDMLLink Prediction14
Link Prediction in Homogeneous Networks
Supervised Link Prediction
 Learning a binary classifier that will
predict whether a link exists between a
given pair of nodes or not.
 First proposed by Hassan et. al in SDM
2006
15 DMLLink Prediction DMLDMLLink Prediction15
Supervised Link Prediction
Dataset
 We need two snapshot of the network for
training
Positive Samples
 Links that are missing in former snapshot,
but are formed in latter snapshot
Negative Samples
 Links that are missing in both snapshots
16 DMLLink Prediction DMLDMLLink Prediction16
Supervised Link Prediction
Features
 Topological Features
 Attribute-based Features
Classifiers
 SVM
 Decision Trees
 Multilayer Perceptron
 KNN
 Naïve Bayes
17 DMLLink Prediction DMLDMLLink Prediction17
Outline
Introduction
Link Prediction in Homogeneous Networks
Link Prediction in Heterogeneous Networks
 Relationship Prediction
 Collective Link Prediction
Link Prediction in Aligned Networks
Future Works
18 DMLLink Prediction DMLDMLLink Prediction18
Link Prediction in Heterogeneous Networks
Heterogeneous Network
𝐺 = (𝑉, 𝐸)
 𝑉 = 𝑖 𝑉𝑖 is the sets of various kinds of
nodes and 𝑉𝑖 is the 𝑖 𝑡ℎ kind of nodes in G
 𝐸 = 𝑗 𝐸𝑗 is the sets of various types of
links and 𝐸𝑗 is the 𝑗𝑡ℎ kind of links in G
19 DMLLink Prediction DMLDMLLink Prediction19
Link Prediction in Heterogeneous Networks
Heterogeneous Network Schema
20 DMLLink Prediction DMLDMLLink Prediction20
Link Prediction in Heterogeneous Networks
From Link Prediction to Relationship Prediction
 A relationship between two objects could be
a composition of two or more links
• E.g. two authors have a co-author relationship if
they have co-written a paper
 Need to redesign topological features in
heterogeneous networks
21 DMLLink Prediction DMLDMLLink Prediction21
Supervised Relationship Prediction
Feature Extraction
 Heterogeneous Features
• Based on heterogeneous structure of network
 Meta-Path based Features
• Uses the concepts of meta-paths
• Different meta paths represent different semantic
meanings
• Number of path instanced of a meta-path Φ
22 DMLLink Prediction DMLDMLLink Prediction22
Supervised Relationship Prediction
Case Study
 Social Link Prediction
23 DMLLink Prediction DMLDMLLink Prediction23
Supervised Relationship Prediction
Heterogeneous Features
 Social Features
• Common Neighbor
• Jaccard Coefficient
• Adamic/Adar
 Spatial Features
• Common Locations
• Jaccard Coefficient of Common Locations
• Average Geographic distance of locations
• …
24 DMLLink Prediction DMLDMLLink Prediction24
Supervised Relationship Prediction
Heterogeneous Features
 Temporal Features
• Let T(u) be 24 hour activity vector of user u
• Inner product of T(u) and T(v)
• Cosine similarity of T(u) and T(v)
 Text Content Features
• Let w(u) be the bag-of-words vector of user u
weighted by TF-IDF
• Inner product of w(u) and w(v)
• Cosine similarity of w(u) and w(v)
25 DMLLink Prediction DMLDMLLink Prediction25
Supervised Relationship Prediction
Meta-Path based Features
 Φ1: Follower of Follower
• UUU
 Φ2: Common Out Neighbor
• UUU
 Φ3: Common In Neighbor
• UUU
 Φ4: Common Words
• UPWPU
 Φ5: Common Timestamps
• UPTPU
 Φ6: Common Location Check-ins
• UPLPU
26 DMLLink Prediction DMLDMLLink Prediction26
Link Prediction in Heterogeneous Networks
Collective Link Prediction
 Conventional link
prediction approaches
assume that links are
independent identically
distributed (i.i.d).
 But in heterogeneous
networks, different type
of links are correlated
and mutually influential.
27 DMLLink Prediction DMLDMLLink Prediction27
Link Prediction in Heterogeneous Networks
Input Social Network
28 DMLLink Prediction DMLDMLLink Prediction28
Link Prediction in Heterogeneous Networks
Independent Social and Location Link Prediction
29 DMLLink Prediction DMLDMLLink Prediction29
Link Prediction in Heterogeneous Networks
Traditional Link Prediction
𝑌𝑠 = 𝑎𝑟𝑔 max
𝑌𝑠
𝑃(𝑦 𝐿 𝑠 = 𝑌𝑠)
𝑌𝑙 = 𝑎𝑟𝑔 max
𝑌 𝑙
𝑃(𝑦 𝐿𝑙 = 𝑌𝑙)
 𝐿 𝑠 and 𝐿𝑙 are the sets of potential social
and location links
 𝑃 𝑦 𝐿 𝑠 = 𝑌𝑠 is the probability scores
achieved when links in 𝐿 𝑠 are assigned
with Labels 𝑌𝑠
 𝑌𝑠 and 𝑌𝑙 are the sets of optimal labels
30 DMLLink Prediction DMLDMLLink Prediction30
Link Prediction in Heterogeneous Networks
Collective Link Prediction
𝑌𝑠, 𝑌𝑙 = 𝑎𝑟𝑔 max
𝑌𝑠,𝑌 𝑙
𝑃(𝑦 𝐿 𝑠 = 𝑌𝑠|𝑦 𝐿𝑙 = 𝑌𝑙)
× 𝑃(𝑦 𝐿𝑙 = 𝑌𝑙|𝑦 𝐿 𝑠 = 𝑌𝑠)
𝑌𝑠
(𝑡)
= 𝑎𝑟𝑔 max
𝑌𝑠
𝑃(𝑦 𝐿 𝑠 = 𝑌𝑠 |
𝑦 𝐿 𝑠 = 𝑌𝑠
𝑡−1
, 𝑦 𝐿𝑙 = 𝑌𝑙
𝑡−1
)
𝑌𝑙
(𝑡)
= 𝑎𝑟𝑔 max
𝑌 𝑙
𝑃(𝑦 𝐿𝑙 = 𝑌𝑙 |
𝑦 𝐿 𝑠 = 𝑌𝑠
𝑡
, 𝑦 𝐿𝑙 = 𝑌𝑙
𝑡−1
)
IterativeSolution
31 DMLLink Prediction DMLDMLLink Prediction31
Outline
Introduction
Link Prediction in Homogeneous Networks
Link Prediction in Heterogeneous Networks
Link Prediction in Aligned Networks
 Link Transfer
 Anchor Link Inference
Future Works
32 DMLLink Prediction DMLDMLLink Prediction32
Link Prediction in Aligned Networks
Multi Aligned Heterogeneous Social Networks
33 DMLLink Prediction DMLDMLLink Prediction33
Link Prediction in Aligned Networks
Link Transfer
 Information Sparsity
• New Network Problem
• Cold Start Problem
 Need to transfer knowledge from another
domain
• Transfer Learning
• Use aligned network as a source
 Context
• Fully aligned networks
• Partially aligned networks
34 DMLLink Prediction DMLDMLLink Prediction34
Link Prediction in Aligned Networks
Link Transfer across Fully Aligned Networks
 Prediction using only the target network:
𝑃 𝑦 𝑢 𝑡, 𝑣 𝑡 = 1|𝐺 𝑡 = 𝑃 𝑦 𝑢 𝑡, 𝑣 𝑡 = 1|𝑥 𝑢 𝑡, 𝑣 𝑡
 Prediction using source and target network:
𝑃 𝑦 𝑢 𝑡, 𝑣 𝑡 = 1|𝐺 𝑡, 𝐺 𝑠
= 𝑃 𝑦 𝑢 𝑡
, 𝑣 𝑡
= 1| 𝑥 𝑢 𝑡
, 𝑣 𝑡 𝑇
, 𝑥 𝑢 𝑠
, 𝑣 𝑠 𝑇
, 𝑦 𝑢 𝑠
, 𝑣 𝑠 𝑇
35 DMLLink Prediction DMLDMLLink Prediction35
Link Prediction in Aligned Networks
Link Transfer across Partially Aligned Networks
 Solution: Inter-Network Meta-Paths
 Let 𝛾 𝑈 𝑡, 𝑈 𝑠 be the anchor meta-path
 Ψ1: 𝛾 𝑈 𝑡, 𝑈 𝑠 − Φ 𝑈 𝑠, 𝑈 𝑠 − 𝛾 𝑈 𝑠, 𝑈 𝑡
 Ψ2: Φ 𝑈 𝑡, 𝑈 𝑡 − 𝛾 𝑈 𝑡, 𝑈 𝑠 − Φ 𝑈 𝑠, 𝑈 𝑠 − 𝛾 𝑈 𝑠, 𝑈 𝑡
 Ψ3: 𝛾 𝑈 𝑡, 𝑈 𝑠 − Φ 𝑈 𝑠, 𝑈 𝑠 − 𝛾 𝑈 𝑠, 𝑈 𝑡 − Φ 𝑈 𝑡, 𝑈 𝑡
 Ψ3: Φ 𝑈 𝑡
, 𝑈 𝑡
− 𝛾 𝑈 𝑡
, 𝑈 𝑠
− Φ 𝑈 𝑠
, 𝑈 𝑠
−
𝛾 𝑈 𝑠, 𝑈 𝑡 − Φ 𝑈 𝑡, 𝑈 𝑡
36 DMLLink Prediction DMLDMLLink Prediction36
Link Prediction in Aligned Networks
Anchor Link Inference
37 DMLLink Prediction DMLDMLLink Prediction37
Anchor Link Inference
Supervised Method
 Social Features
• Extended Common Neighbors
• Extended Jaccard Coefficient
• Extended Adamic/Adar
 Spatial Features
 Temporal Features
 Text Content Features
38 DMLLink Prediction DMLDMLLink Prediction38
Anchor Link Inference
Inference w.r.t One-to-One Constraint
 Predicted Scores
39 DMLLink Prediction DMLDMLLink Prediction39
Anchor Link Inference
Inference w.r.t One-to-One Constraint
 Conventional Link Prediction
40 DMLLink Prediction DMLDMLLink Prediction40
Anchor Link Inference
Inference w.r.t One-to-One Constraint
 Max sum of scores
41 DMLLink Prediction DMLDMLLink Prediction41
Anchor Link Inference
Inference w.r.t One-to-One Constraint
 Stable Matching
42 DMLLink Prediction DMLDMLLink Prediction42
Outline
Introduction
Link Prediction in Homogeneous Networks
Link Prediction in Heterogeneous Networks
Link Prediction in Aligned Networks
Future Works
43 DMLLink Prediction DMLDMLLink Prediction43
Future Works
Anchor Link Formation Prediction
 Predicting whether a user in the source network
will join the target network in the future
 This depends on the amount of influence he/she
receives from the target network
 An influence model can be learned using the
training data
 Positive-Unlabeled learning can improve the
prediction performance
44 DMLLink Prediction DMLDMLLink Prediction44
References
[1] L. Lü and T. Zhou, "Link prediction in complex networks: A survey," Physica A:
Statistical Mechanics and its Applications, vol. 390, pp. 1150-1170, 2011.
[2] M. Al Hasan, V. Chaoji, S. Salem, and M. Zaki, "Link prediction using supervised
learning," in SDM’06: Workshop on Link Analysis, Counter-terrorism and Security,
2006.
[3] J. Zhang and S. Y. Philip, "Link Prediction across Heterogeneous Social
Networks: A Survey," 2014.
[4] Y. Sun, R. Barber, M. Gupta, C. C. Aggarwal, and J. Han, "Co-author relationship
prediction in heterogeneous bibliographic networks," in Advances in Social
Networks Analysis and Mining (ASONAM), 2011 International Conference on, 2011,
pp. 121-128.
[5] X. Kong, J. Zhang, and P. S. Yu, "Inferring anchor links across multiple
heterogeneous social networks," in Proceedings of the 22nd ACM international
conference on Conference on information & knowledge management, 2013, pp. 179-
188.
Q&A

Link Prediction in (Partially) Aligned Heterogeneous Social Networks

  • 1.
    Link Prediction in(Partially) Aligned Heterogeneous Social Networks By: Sina Sajadmanesh Advisor: Dr. Hamid Reza Rabiee
  • 2.
    2 DMLLink PredictionDMLDMLLink Prediction2 Outline Introduction  Problem Formulation  Applications Link Prediction in Homogeneous Networks Link Prediction in Heterogeneous Networks Link Prediction in Aligned Networks Future Works
  • 3.
    3 DMLLink PredictionDMLDMLLink Prediction3 Introduction Problem  Based on a snapshot of network, predicting the set of potential links to be formed in the future is formally defined as Link Prediction Problem.  First proposed by Liben- Novel and Klienberg in CIKM 2003
  • 4.
    4 DMLLink PredictionDML4 Applications Social networks and E-commerce  Recommender systems  Friend recommendation  Bioinformatics  Prioritization of candidate disease genes  Drug discovery  Security  Identify missing links between criminals  Controlling computer viruses
  • 5.
    5 DMLLink PredictionDMLDMLLink Prediction5 Outline Introduction Link Prediction in Homogeneous Networks  Unsupervised methods  Supervised methods Link Prediction in Heterogeneous Networks Link Prediction in Aligned Networks Future Works
  • 6.
    6 DMLLink PredictionDMLDMLLink Prediction6 Link Prediction in Homogeneous Networks Homogeneous Network 𝐺 = (𝑉, 𝐸)  If V contains one single type nodes and E contains one single type of links, then G is a homogeneous network
  • 7.
    7 DMLLink PredictionDMLDMLLink Prediction7 Link Prediction in Homogeneous Networks Unsupervised methods  Measuring the closeness among nodes  Assuming that close nodes are more likely to be connected Unsupervised Link Predicators  Local neighbor based predicators  Path based predicators  Random-Walk based predicators
  • 8.
    8 DMLLink PredictionDMLDMLLink Prediction8 Unsupervised Methods Local Neighbor based Link Predicators  Preferential Attachment 𝑃𝐴 𝑢, 𝑣 = Γ(𝑢) Γ(𝑣) user 𝑢 neighbor Γ(𝑢)
  • 9.
    9 DMLLink PredictionDMLDMLLink Prediction9 Unsupervised Methods Local Neighbor based Link Predicators  Common Neighbor 𝐶𝑁 𝑢, 𝑣 = Γ(𝑢) ∩ Γ(𝑣) user 𝑢 neighbor Γ(𝑢)
  • 10.
    10 DMLLink PredictionDMLDMLLink Prediction10 Unsupervised Methods Local Neighbor based Link Predicators  Jaccard Coefficient 𝐽𝐶 𝑢, 𝑣 = Γ(𝑢) ∩ Γ(𝑣) Γ(𝑢) ∪ Γ(𝑣)  Adamic/Adar 𝐴𝐴 𝑢, 𝑣 = 𝑤∈(Γ(𝑢)∩Γ(𝑣)) 1 log( Γ(𝑤) )  Resource Allocation 𝑅𝐴 𝑢, 𝑣 = 𝑤∈(Γ(𝑢)∩Γ(𝑣)) 1 Γ(𝑤)
  • 11.
    11 DMLLink PredictionDMLDMLLink Prediction11 Unsupervised Methods Path based Link Predicators  Shortest Path 𝑆𝑃 𝑢, 𝑣 = min{ 𝑃𝑢↝𝑣 }  Katz Score 𝐾𝑎𝑡𝑧 𝑢, 𝑣 = 𝑙=1 ∞ 𝛽 𝑙 𝑃𝑢↝𝑣 𝑙 = (𝐼 − 𝛽𝐴)−1−𝐼
  • 12.
    12 DMLLink PredictionDMLDMLLink Prediction12 Unsupervised Methods Random-Walk based Link Predicators  Hitting Time 𝐻𝑇 𝑢, 𝑣 = 1 + 𝑤∈Γ(𝑢) 𝑃𝑢,𝑤 𝐻𝑇(𝑤, 𝑣)  Commute Time 𝐶𝑇 𝑢, 𝑣 = 𝐻𝑇 𝑢, 𝑣 + 𝐻𝑇(𝑣, 𝑢)
  • 13.
    13 DMLLink PredictionDMLDMLLink Prediction13 Link Prediction in Homogeneous Networks Other Unsupervised Methods  Matrix Factorization methods  Maximum Likelihood Methods  Probabilistic Methods
  • 14.
    14 DMLLink PredictionDMLDMLLink Prediction14 Link Prediction in Homogeneous Networks Supervised Link Prediction  Learning a binary classifier that will predict whether a link exists between a given pair of nodes or not.  First proposed by Hassan et. al in SDM 2006
  • 15.
    15 DMLLink PredictionDMLDMLLink Prediction15 Supervised Link Prediction Dataset  We need two snapshot of the network for training Positive Samples  Links that are missing in former snapshot, but are formed in latter snapshot Negative Samples  Links that are missing in both snapshots
  • 16.
    16 DMLLink PredictionDMLDMLLink Prediction16 Supervised Link Prediction Features  Topological Features  Attribute-based Features Classifiers  SVM  Decision Trees  Multilayer Perceptron  KNN  Naïve Bayes
  • 17.
    17 DMLLink PredictionDMLDMLLink Prediction17 Outline Introduction Link Prediction in Homogeneous Networks Link Prediction in Heterogeneous Networks  Relationship Prediction  Collective Link Prediction Link Prediction in Aligned Networks Future Works
  • 18.
    18 DMLLink PredictionDMLDMLLink Prediction18 Link Prediction in Heterogeneous Networks Heterogeneous Network 𝐺 = (𝑉, 𝐸)  𝑉 = 𝑖 𝑉𝑖 is the sets of various kinds of nodes and 𝑉𝑖 is the 𝑖 𝑡ℎ kind of nodes in G  𝐸 = 𝑗 𝐸𝑗 is the sets of various types of links and 𝐸𝑗 is the 𝑗𝑡ℎ kind of links in G
  • 19.
    19 DMLLink PredictionDMLDMLLink Prediction19 Link Prediction in Heterogeneous Networks Heterogeneous Network Schema
  • 20.
    20 DMLLink PredictionDMLDMLLink Prediction20 Link Prediction in Heterogeneous Networks From Link Prediction to Relationship Prediction  A relationship between two objects could be a composition of two or more links • E.g. two authors have a co-author relationship if they have co-written a paper  Need to redesign topological features in heterogeneous networks
  • 21.
    21 DMLLink PredictionDMLDMLLink Prediction21 Supervised Relationship Prediction Feature Extraction  Heterogeneous Features • Based on heterogeneous structure of network  Meta-Path based Features • Uses the concepts of meta-paths • Different meta paths represent different semantic meanings • Number of path instanced of a meta-path Φ
  • 22.
    22 DMLLink PredictionDMLDMLLink Prediction22 Supervised Relationship Prediction Case Study  Social Link Prediction
  • 23.
    23 DMLLink PredictionDMLDMLLink Prediction23 Supervised Relationship Prediction Heterogeneous Features  Social Features • Common Neighbor • Jaccard Coefficient • Adamic/Adar  Spatial Features • Common Locations • Jaccard Coefficient of Common Locations • Average Geographic distance of locations • …
  • 24.
    24 DMLLink PredictionDMLDMLLink Prediction24 Supervised Relationship Prediction Heterogeneous Features  Temporal Features • Let T(u) be 24 hour activity vector of user u • Inner product of T(u) and T(v) • Cosine similarity of T(u) and T(v)  Text Content Features • Let w(u) be the bag-of-words vector of user u weighted by TF-IDF • Inner product of w(u) and w(v) • Cosine similarity of w(u) and w(v)
  • 25.
    25 DMLLink PredictionDMLDMLLink Prediction25 Supervised Relationship Prediction Meta-Path based Features  Φ1: Follower of Follower • UUU  Φ2: Common Out Neighbor • UUU  Φ3: Common In Neighbor • UUU  Φ4: Common Words • UPWPU  Φ5: Common Timestamps • UPTPU  Φ6: Common Location Check-ins • UPLPU
  • 26.
    26 DMLLink PredictionDMLDMLLink Prediction26 Link Prediction in Heterogeneous Networks Collective Link Prediction  Conventional link prediction approaches assume that links are independent identically distributed (i.i.d).  But in heterogeneous networks, different type of links are correlated and mutually influential.
  • 27.
    27 DMLLink PredictionDMLDMLLink Prediction27 Link Prediction in Heterogeneous Networks Input Social Network
  • 28.
    28 DMLLink PredictionDMLDMLLink Prediction28 Link Prediction in Heterogeneous Networks Independent Social and Location Link Prediction
  • 29.
    29 DMLLink PredictionDMLDMLLink Prediction29 Link Prediction in Heterogeneous Networks Traditional Link Prediction 𝑌𝑠 = 𝑎𝑟𝑔 max 𝑌𝑠 𝑃(𝑦 𝐿 𝑠 = 𝑌𝑠) 𝑌𝑙 = 𝑎𝑟𝑔 max 𝑌 𝑙 𝑃(𝑦 𝐿𝑙 = 𝑌𝑙)  𝐿 𝑠 and 𝐿𝑙 are the sets of potential social and location links  𝑃 𝑦 𝐿 𝑠 = 𝑌𝑠 is the probability scores achieved when links in 𝐿 𝑠 are assigned with Labels 𝑌𝑠  𝑌𝑠 and 𝑌𝑙 are the sets of optimal labels
  • 30.
    30 DMLLink PredictionDMLDMLLink Prediction30 Link Prediction in Heterogeneous Networks Collective Link Prediction 𝑌𝑠, 𝑌𝑙 = 𝑎𝑟𝑔 max 𝑌𝑠,𝑌 𝑙 𝑃(𝑦 𝐿 𝑠 = 𝑌𝑠|𝑦 𝐿𝑙 = 𝑌𝑙) × 𝑃(𝑦 𝐿𝑙 = 𝑌𝑙|𝑦 𝐿 𝑠 = 𝑌𝑠) 𝑌𝑠 (𝑡) = 𝑎𝑟𝑔 max 𝑌𝑠 𝑃(𝑦 𝐿 𝑠 = 𝑌𝑠 | 𝑦 𝐿 𝑠 = 𝑌𝑠 𝑡−1 , 𝑦 𝐿𝑙 = 𝑌𝑙 𝑡−1 ) 𝑌𝑙 (𝑡) = 𝑎𝑟𝑔 max 𝑌 𝑙 𝑃(𝑦 𝐿𝑙 = 𝑌𝑙 | 𝑦 𝐿 𝑠 = 𝑌𝑠 𝑡 , 𝑦 𝐿𝑙 = 𝑌𝑙 𝑡−1 ) IterativeSolution
  • 31.
    31 DMLLink PredictionDMLDMLLink Prediction31 Outline Introduction Link Prediction in Homogeneous Networks Link Prediction in Heterogeneous Networks Link Prediction in Aligned Networks  Link Transfer  Anchor Link Inference Future Works
  • 32.
    32 DMLLink PredictionDMLDMLLink Prediction32 Link Prediction in Aligned Networks Multi Aligned Heterogeneous Social Networks
  • 33.
    33 DMLLink PredictionDMLDMLLink Prediction33 Link Prediction in Aligned Networks Link Transfer  Information Sparsity • New Network Problem • Cold Start Problem  Need to transfer knowledge from another domain • Transfer Learning • Use aligned network as a source  Context • Fully aligned networks • Partially aligned networks
  • 34.
    34 DMLLink PredictionDMLDMLLink Prediction34 Link Prediction in Aligned Networks Link Transfer across Fully Aligned Networks  Prediction using only the target network: 𝑃 𝑦 𝑢 𝑡, 𝑣 𝑡 = 1|𝐺 𝑡 = 𝑃 𝑦 𝑢 𝑡, 𝑣 𝑡 = 1|𝑥 𝑢 𝑡, 𝑣 𝑡  Prediction using source and target network: 𝑃 𝑦 𝑢 𝑡, 𝑣 𝑡 = 1|𝐺 𝑡, 𝐺 𝑠 = 𝑃 𝑦 𝑢 𝑡 , 𝑣 𝑡 = 1| 𝑥 𝑢 𝑡 , 𝑣 𝑡 𝑇 , 𝑥 𝑢 𝑠 , 𝑣 𝑠 𝑇 , 𝑦 𝑢 𝑠 , 𝑣 𝑠 𝑇
  • 35.
    35 DMLLink PredictionDMLDMLLink Prediction35 Link Prediction in Aligned Networks Link Transfer across Partially Aligned Networks  Solution: Inter-Network Meta-Paths  Let 𝛾 𝑈 𝑡, 𝑈 𝑠 be the anchor meta-path  Ψ1: 𝛾 𝑈 𝑡, 𝑈 𝑠 − Φ 𝑈 𝑠, 𝑈 𝑠 − 𝛾 𝑈 𝑠, 𝑈 𝑡  Ψ2: Φ 𝑈 𝑡, 𝑈 𝑡 − 𝛾 𝑈 𝑡, 𝑈 𝑠 − Φ 𝑈 𝑠, 𝑈 𝑠 − 𝛾 𝑈 𝑠, 𝑈 𝑡  Ψ3: 𝛾 𝑈 𝑡, 𝑈 𝑠 − Φ 𝑈 𝑠, 𝑈 𝑠 − 𝛾 𝑈 𝑠, 𝑈 𝑡 − Φ 𝑈 𝑡, 𝑈 𝑡  Ψ3: Φ 𝑈 𝑡 , 𝑈 𝑡 − 𝛾 𝑈 𝑡 , 𝑈 𝑠 − Φ 𝑈 𝑠 , 𝑈 𝑠 − 𝛾 𝑈 𝑠, 𝑈 𝑡 − Φ 𝑈 𝑡, 𝑈 𝑡
  • 36.
    36 DMLLink PredictionDMLDMLLink Prediction36 Link Prediction in Aligned Networks Anchor Link Inference
  • 37.
    37 DMLLink PredictionDMLDMLLink Prediction37 Anchor Link Inference Supervised Method  Social Features • Extended Common Neighbors • Extended Jaccard Coefficient • Extended Adamic/Adar  Spatial Features  Temporal Features  Text Content Features
  • 38.
    38 DMLLink PredictionDMLDMLLink Prediction38 Anchor Link Inference Inference w.r.t One-to-One Constraint  Predicted Scores
  • 39.
    39 DMLLink PredictionDMLDMLLink Prediction39 Anchor Link Inference Inference w.r.t One-to-One Constraint  Conventional Link Prediction
  • 40.
    40 DMLLink PredictionDMLDMLLink Prediction40 Anchor Link Inference Inference w.r.t One-to-One Constraint  Max sum of scores
  • 41.
    41 DMLLink PredictionDMLDMLLink Prediction41 Anchor Link Inference Inference w.r.t One-to-One Constraint  Stable Matching
  • 42.
    42 DMLLink PredictionDMLDMLLink Prediction42 Outline Introduction Link Prediction in Homogeneous Networks Link Prediction in Heterogeneous Networks Link Prediction in Aligned Networks Future Works
  • 43.
    43 DMLLink PredictionDMLDMLLink Prediction43 Future Works Anchor Link Formation Prediction  Predicting whether a user in the source network will join the target network in the future  This depends on the amount of influence he/she receives from the target network  An influence model can be learned using the training data  Positive-Unlabeled learning can improve the prediction performance
  • 44.
    44 DMLLink PredictionDMLDMLLink Prediction44 References [1] L. Lü and T. Zhou, "Link prediction in complex networks: A survey," Physica A: Statistical Mechanics and its Applications, vol. 390, pp. 1150-1170, 2011. [2] M. Al Hasan, V. Chaoji, S. Salem, and M. Zaki, "Link prediction using supervised learning," in SDM’06: Workshop on Link Analysis, Counter-terrorism and Security, 2006. [3] J. Zhang and S. Y. Philip, "Link Prediction across Heterogeneous Social Networks: A Survey," 2014. [4] Y. Sun, R. Barber, M. Gupta, C. C. Aggarwal, and J. Han, "Co-author relationship prediction in heterogeneous bibliographic networks," in Advances in Social Networks Analysis and Mining (ASONAM), 2011 International Conference on, 2011, pp. 121-128. [5] X. Kong, J. Zhang, and P. S. Yu, "Inferring anchor links across multiple heterogeneous social networks," in Proceedings of the 22nd ACM international conference on Conference on information & knowledge management, 2013, pp. 179- 188.
  • 45.