SlideShare a Scribd company logo
1 of 45
Link Prediction in (Partially)
Aligned Heterogeneous Social
Networks
By: Sina Sajadmanesh
Advisor: Dr. Hamid Reza Rabiee
2 DMLLink Prediction DMLDMLLink Prediction2
Outline
Introduction
 Problem Formulation
 Applications
Link Prediction in Homogeneous Networks
Link Prediction in Heterogeneous Networks
Link Prediction in Aligned Networks
Future Works
3 DMLLink Prediction DMLDMLLink Prediction3
Introduction
Problem
 Based on a snapshot of
network, predicting the set
of potential links to be
formed in the future is
formally defined as Link
Prediction Problem.
 First proposed by Liben-
Novel and Klienberg in
CIKM 2003
4 DMLLink Prediction DML4
Applications
Social networks and E-commerce
 Recommender systems
 Friend recommendation
 Bioinformatics
 Prioritization of candidate disease genes
 Drug discovery
 Security
 Identify missing links between criminals
 Controlling computer viruses
5 DMLLink Prediction DMLDMLLink Prediction5
Outline
Introduction
Link Prediction in Homogeneous Networks
 Unsupervised methods
 Supervised methods
Link Prediction in Heterogeneous Networks
Link Prediction in Aligned Networks
Future Works
6 DMLLink Prediction DMLDMLLink Prediction6
Link Prediction in Homogeneous Networks
Homogeneous Network
𝐺 = (𝑉, 𝐸)
 If V contains one single type nodes and E
contains one single type of links, then G is
a homogeneous network
7 DMLLink Prediction DMLDMLLink Prediction7
Link Prediction in Homogeneous Networks
Unsupervised methods
 Measuring the closeness among nodes
 Assuming that close nodes are more likely
to be connected
Unsupervised Link Predicators
 Local neighbor based predicators
 Path based predicators
 Random-Walk based predicators
8 DMLLink Prediction DMLDMLLink Prediction8
Unsupervised Methods
Local Neighbor based Link Predicators
 Preferential Attachment
𝑃𝐴 𝑢, 𝑣 = Γ(𝑢) Γ(𝑣)
user 𝑢 neighbor Γ(𝑢)
9 DMLLink Prediction DMLDMLLink Prediction9
Unsupervised Methods
Local Neighbor based Link Predicators
 Common Neighbor
𝐶𝑁 𝑢, 𝑣 = Γ(𝑢) ∩ Γ(𝑣)
user 𝑢 neighbor Γ(𝑢)
10 DMLLink Prediction DMLDMLLink Prediction10
Unsupervised Methods
Local Neighbor based Link Predicators
 Jaccard Coefficient
𝐽𝐶 𝑢, 𝑣 =
Γ(𝑢) ∩ Γ(𝑣)
Γ(𝑢) ∪ Γ(𝑣)
 Adamic/Adar
𝐴𝐴 𝑢, 𝑣 =
𝑤∈(Γ(𝑢)∩Γ(𝑣))
1
log( Γ(𝑤) )
 Resource Allocation
𝑅𝐴 𝑢, 𝑣 =
𝑤∈(Γ(𝑢)∩Γ(𝑣))
1
Γ(𝑤)
11 DMLLink Prediction DMLDMLLink Prediction11
Unsupervised Methods
Path based Link Predicators
 Shortest Path
𝑆𝑃 𝑢, 𝑣 = min{ 𝑃𝑢↝𝑣 }
 Katz Score
𝐾𝑎𝑡𝑧 𝑢, 𝑣 =
𝑙=1
∞
𝛽 𝑙
𝑃𝑢↝𝑣
𝑙
= (𝐼 − 𝛽𝐴)−1−𝐼
12 DMLLink Prediction DMLDMLLink Prediction12
Unsupervised Methods
Random-Walk based Link Predicators
 Hitting Time
𝐻𝑇 𝑢, 𝑣 = 1 +
𝑤∈Γ(𝑢)
𝑃𝑢,𝑤 𝐻𝑇(𝑤, 𝑣)
 Commute Time
𝐶𝑇 𝑢, 𝑣 = 𝐻𝑇 𝑢, 𝑣 + 𝐻𝑇(𝑣, 𝑢)
13 DMLLink Prediction DMLDMLLink Prediction13
Link Prediction in Homogeneous Networks
Other Unsupervised Methods
 Matrix Factorization methods
 Maximum Likelihood Methods
 Probabilistic Methods
14 DMLLink Prediction DMLDMLLink Prediction14
Link Prediction in Homogeneous Networks
Supervised Link Prediction
 Learning a binary classifier that will
predict whether a link exists between a
given pair of nodes or not.
 First proposed by Hassan et. al in SDM
2006
15 DMLLink Prediction DMLDMLLink Prediction15
Supervised Link Prediction
Dataset
 We need two snapshot of the network for
training
Positive Samples
 Links that are missing in former snapshot,
but are formed in latter snapshot
Negative Samples
 Links that are missing in both snapshots
16 DMLLink Prediction DMLDMLLink Prediction16
Supervised Link Prediction
Features
 Topological Features
 Attribute-based Features
Classifiers
 SVM
 Decision Trees
 Multilayer Perceptron
 KNN
 Naïve Bayes
17 DMLLink Prediction DMLDMLLink Prediction17
Outline
Introduction
Link Prediction in Homogeneous Networks
Link Prediction in Heterogeneous Networks
 Relationship Prediction
 Collective Link Prediction
Link Prediction in Aligned Networks
Future Works
18 DMLLink Prediction DMLDMLLink Prediction18
Link Prediction in Heterogeneous Networks
Heterogeneous Network
𝐺 = (𝑉, 𝐸)
 𝑉 = 𝑖 𝑉𝑖 is the sets of various kinds of
nodes and 𝑉𝑖 is the 𝑖 𝑡ℎ kind of nodes in G
 𝐸 = 𝑗 𝐸𝑗 is the sets of various types of
links and 𝐸𝑗 is the 𝑗𝑡ℎ kind of links in G
19 DMLLink Prediction DMLDMLLink Prediction19
Link Prediction in Heterogeneous Networks
Heterogeneous Network Schema
20 DMLLink Prediction DMLDMLLink Prediction20
Link Prediction in Heterogeneous Networks
From Link Prediction to Relationship Prediction
 A relationship between two objects could be
a composition of two or more links
• E.g. two authors have a co-author relationship if
they have co-written a paper
 Need to redesign topological features in
heterogeneous networks
21 DMLLink Prediction DMLDMLLink Prediction21
Supervised Relationship Prediction
Feature Extraction
 Heterogeneous Features
• Based on heterogeneous structure of network
 Meta-Path based Features
• Uses the concepts of meta-paths
• Different meta paths represent different semantic
meanings
• Number of path instanced of a meta-path Φ
22 DMLLink Prediction DMLDMLLink Prediction22
Supervised Relationship Prediction
Case Study
 Social Link Prediction
23 DMLLink Prediction DMLDMLLink Prediction23
Supervised Relationship Prediction
Heterogeneous Features
 Social Features
• Common Neighbor
• Jaccard Coefficient
• Adamic/Adar
 Spatial Features
• Common Locations
• Jaccard Coefficient of Common Locations
• Average Geographic distance of locations
• …
24 DMLLink Prediction DMLDMLLink Prediction24
Supervised Relationship Prediction
Heterogeneous Features
 Temporal Features
• Let T(u) be 24 hour activity vector of user u
• Inner product of T(u) and T(v)
• Cosine similarity of T(u) and T(v)
 Text Content Features
• Let w(u) be the bag-of-words vector of user u
weighted by TF-IDF
• Inner product of w(u) and w(v)
• Cosine similarity of w(u) and w(v)
25 DMLLink Prediction DMLDMLLink Prediction25
Supervised Relationship Prediction
Meta-Path based Features
 Φ1: Follower of Follower
• UUU
 Φ2: Common Out Neighbor
• UUU
 Φ3: Common In Neighbor
• UUU
 Φ4: Common Words
• UPWPU
 Φ5: Common Timestamps
• UPTPU
 Φ6: Common Location Check-ins
• UPLPU
26 DMLLink Prediction DMLDMLLink Prediction26
Link Prediction in Heterogeneous Networks
Collective Link Prediction
 Conventional link
prediction approaches
assume that links are
independent identically
distributed (i.i.d).
 But in heterogeneous
networks, different type
of links are correlated
and mutually influential.
27 DMLLink Prediction DMLDMLLink Prediction27
Link Prediction in Heterogeneous Networks
Input Social Network
28 DMLLink Prediction DMLDMLLink Prediction28
Link Prediction in Heterogeneous Networks
Independent Social and Location Link Prediction
29 DMLLink Prediction DMLDMLLink Prediction29
Link Prediction in Heterogeneous Networks
Traditional Link Prediction
𝑌𝑠 = 𝑎𝑟𝑔 max
𝑌𝑠
𝑃(𝑦 𝐿 𝑠 = 𝑌𝑠)
𝑌𝑙 = 𝑎𝑟𝑔 max
𝑌 𝑙
𝑃(𝑦 𝐿𝑙 = 𝑌𝑙)
 𝐿 𝑠 and 𝐿𝑙 are the sets of potential social
and location links
 𝑃 𝑦 𝐿 𝑠 = 𝑌𝑠 is the probability scores
achieved when links in 𝐿 𝑠 are assigned
with Labels 𝑌𝑠
 𝑌𝑠 and 𝑌𝑙 are the sets of optimal labels
30 DMLLink Prediction DMLDMLLink Prediction30
Link Prediction in Heterogeneous Networks
Collective Link Prediction
𝑌𝑠, 𝑌𝑙 = 𝑎𝑟𝑔 max
𝑌𝑠,𝑌 𝑙
𝑃(𝑦 𝐿 𝑠 = 𝑌𝑠|𝑦 𝐿𝑙 = 𝑌𝑙)
× 𝑃(𝑦 𝐿𝑙 = 𝑌𝑙|𝑦 𝐿 𝑠 = 𝑌𝑠)
𝑌𝑠
(𝑡)
= 𝑎𝑟𝑔 max
𝑌𝑠
𝑃(𝑦 𝐿 𝑠 = 𝑌𝑠 |
𝑦 𝐿 𝑠 = 𝑌𝑠
𝑡−1
, 𝑦 𝐿𝑙 = 𝑌𝑙
𝑡−1
)
𝑌𝑙
(𝑡)
= 𝑎𝑟𝑔 max
𝑌 𝑙
𝑃(𝑦 𝐿𝑙 = 𝑌𝑙 |
𝑦 𝐿 𝑠 = 𝑌𝑠
𝑡
, 𝑦 𝐿𝑙 = 𝑌𝑙
𝑡−1
)
IterativeSolution
31 DMLLink Prediction DMLDMLLink Prediction31
Outline
Introduction
Link Prediction in Homogeneous Networks
Link Prediction in Heterogeneous Networks
Link Prediction in Aligned Networks
 Link Transfer
 Anchor Link Inference
Future Works
32 DMLLink Prediction DMLDMLLink Prediction32
Link Prediction in Aligned Networks
Multi Aligned Heterogeneous Social Networks
33 DMLLink Prediction DMLDMLLink Prediction33
Link Prediction in Aligned Networks
Link Transfer
 Information Sparsity
• New Network Problem
• Cold Start Problem
 Need to transfer knowledge from another
domain
• Transfer Learning
• Use aligned network as a source
 Context
• Fully aligned networks
• Partially aligned networks
34 DMLLink Prediction DMLDMLLink Prediction34
Link Prediction in Aligned Networks
Link Transfer across Fully Aligned Networks
 Prediction using only the target network:
𝑃 𝑦 𝑢 𝑡, 𝑣 𝑡 = 1|𝐺 𝑡 = 𝑃 𝑦 𝑢 𝑡, 𝑣 𝑡 = 1|𝑥 𝑢 𝑡, 𝑣 𝑡
 Prediction using source and target network:
𝑃 𝑦 𝑢 𝑡, 𝑣 𝑡 = 1|𝐺 𝑡, 𝐺 𝑠
= 𝑃 𝑦 𝑢 𝑡
, 𝑣 𝑡
= 1| 𝑥 𝑢 𝑡
, 𝑣 𝑡 𝑇
, 𝑥 𝑢 𝑠
, 𝑣 𝑠 𝑇
, 𝑦 𝑢 𝑠
, 𝑣 𝑠 𝑇
35 DMLLink Prediction DMLDMLLink Prediction35
Link Prediction in Aligned Networks
Link Transfer across Partially Aligned Networks
 Solution: Inter-Network Meta-Paths
 Let 𝛾 𝑈 𝑡, 𝑈 𝑠 be the anchor meta-path
 Ψ1: 𝛾 𝑈 𝑡, 𝑈 𝑠 − Φ 𝑈 𝑠, 𝑈 𝑠 − 𝛾 𝑈 𝑠, 𝑈 𝑡
 Ψ2: Φ 𝑈 𝑡, 𝑈 𝑡 − 𝛾 𝑈 𝑡, 𝑈 𝑠 − Φ 𝑈 𝑠, 𝑈 𝑠 − 𝛾 𝑈 𝑠, 𝑈 𝑡
 Ψ3: 𝛾 𝑈 𝑡, 𝑈 𝑠 − Φ 𝑈 𝑠, 𝑈 𝑠 − 𝛾 𝑈 𝑠, 𝑈 𝑡 − Φ 𝑈 𝑡, 𝑈 𝑡
 Ψ3: Φ 𝑈 𝑡
, 𝑈 𝑡
− 𝛾 𝑈 𝑡
, 𝑈 𝑠
− Φ 𝑈 𝑠
, 𝑈 𝑠
−
𝛾 𝑈 𝑠, 𝑈 𝑡 − Φ 𝑈 𝑡, 𝑈 𝑡
36 DMLLink Prediction DMLDMLLink Prediction36
Link Prediction in Aligned Networks
Anchor Link Inference
37 DMLLink Prediction DMLDMLLink Prediction37
Anchor Link Inference
Supervised Method
 Social Features
• Extended Common Neighbors
• Extended Jaccard Coefficient
• Extended Adamic/Adar
 Spatial Features
 Temporal Features
 Text Content Features
38 DMLLink Prediction DMLDMLLink Prediction38
Anchor Link Inference
Inference w.r.t One-to-One Constraint
 Predicted Scores
39 DMLLink Prediction DMLDMLLink Prediction39
Anchor Link Inference
Inference w.r.t One-to-One Constraint
 Conventional Link Prediction
40 DMLLink Prediction DMLDMLLink Prediction40
Anchor Link Inference
Inference w.r.t One-to-One Constraint
 Max sum of scores
41 DMLLink Prediction DMLDMLLink Prediction41
Anchor Link Inference
Inference w.r.t One-to-One Constraint
 Stable Matching
42 DMLLink Prediction DMLDMLLink Prediction42
Outline
Introduction
Link Prediction in Homogeneous Networks
Link Prediction in Heterogeneous Networks
Link Prediction in Aligned Networks
Future Works
43 DMLLink Prediction DMLDMLLink Prediction43
Future Works
Anchor Link Formation Prediction
 Predicting whether a user in the source network
will join the target network in the future
 This depends on the amount of influence he/she
receives from the target network
 An influence model can be learned using the
training data
 Positive-Unlabeled learning can improve the
prediction performance
44 DMLLink Prediction DMLDMLLink Prediction44
References
[1] L. Lü and T. Zhou, "Link prediction in complex networks: A survey," Physica A:
Statistical Mechanics and its Applications, vol. 390, pp. 1150-1170, 2011.
[2] M. Al Hasan, V. Chaoji, S. Salem, and M. Zaki, "Link prediction using supervised
learning," in SDM’06: Workshop on Link Analysis, Counter-terrorism and Security,
2006.
[3] J. Zhang and S. Y. Philip, "Link Prediction across Heterogeneous Social
Networks: A Survey," 2014.
[4] Y. Sun, R. Barber, M. Gupta, C. C. Aggarwal, and J. Han, "Co-author relationship
prediction in heterogeneous bibliographic networks," in Advances in Social
Networks Analysis and Mining (ASONAM), 2011 International Conference on, 2011,
pp. 121-128.
[5] X. Kong, J. Zhang, and P. S. Yu, "Inferring anchor links across multiple
heterogeneous social networks," in Proceedings of the 22nd ACM international
conference on Conference on information & knowledge management, 2013, pp. 179-
188.
Q&A

More Related Content

What's hot

Scale-Free Networks to Search in Unstructured Peer-To-Peer Networks
Scale-Free Networks to Search in Unstructured Peer-To-Peer NetworksScale-Free Networks to Search in Unstructured Peer-To-Peer Networks
Scale-Free Networks to Search in Unstructured Peer-To-Peer NetworksIOSR Journals
 
Finding important nodes in social networks based on modified pagerank
Finding important nodes in social networks based on modified pagerankFinding important nodes in social networks based on modified pagerank
Finding important nodes in social networks based on modified pagerankcsandit
 
An Improved PageRank Algorithm for Multilayer Networks
An Improved PageRank Algorithm for Multilayer NetworksAn Improved PageRank Algorithm for Multilayer Networks
An Improved PageRank Algorithm for Multilayer NetworksSubhajit Sahu
 
Identifying Most Relevant Node Path To Increase Connection Probability In Gra...
Identifying Most Relevant Node Path To Increase Connection Probability In Gra...Identifying Most Relevant Node Path To Increase Connection Probability In Gra...
Identifying Most Relevant Node Path To Increase Connection Probability In Gra...CSCJournals
 
UTILIZING XAI TECHNIQUE TO IMPROVE AUTOENCODER BASED MODEL FOR COMPUTER NETWO...
UTILIZING XAI TECHNIQUE TO IMPROVE AUTOENCODER BASED MODEL FOR COMPUTER NETWO...UTILIZING XAI TECHNIQUE TO IMPROVE AUTOENCODER BASED MODEL FOR COMPUTER NETWO...
UTILIZING XAI TECHNIQUE TO IMPROVE AUTOENCODER BASED MODEL FOR COMPUTER NETWO...IJCNCJournal
 
Social network analysis
Social network analysisSocial network analysis
Social network analysisSohom Ghosh
 
Privacy Preserving Reputation Calculation in P2P Systems with Homomorphic Enc...
Privacy Preserving Reputation Calculation in P2P Systems with Homomorphic Enc...Privacy Preserving Reputation Calculation in P2P Systems with Homomorphic Enc...
Privacy Preserving Reputation Calculation in P2P Systems with Homomorphic Enc...IJCNCJournal
 
Clustering Algorithms for Data Stream
Clustering Algorithms for Data StreamClustering Algorithms for Data Stream
Clustering Algorithms for Data StreamIRJET Journal
 
Sub-Graph Finding Information over Nebula Networks
Sub-Graph Finding Information over Nebula NetworksSub-Graph Finding Information over Nebula Networks
Sub-Graph Finding Information over Nebula Networksijceronline
 
Link Prediction in Evolving Networks Base on Information Propagation Screenshots
Link Prediction in Evolving Networks Base on Information Propagation ScreenshotsLink Prediction in Evolving Networks Base on Information Propagation Screenshots
Link Prediction in Evolving Networks Base on Information Propagation ScreenshotsVenkat Projects
 
Maxwell W Libbrecht - pomegranate: fast and flexible probabilistic modeling i...
Maxwell W Libbrecht - pomegranate: fast and flexible probabilistic modeling i...Maxwell W Libbrecht - pomegranate: fast and flexible probabilistic modeling i...
Maxwell W Libbrecht - pomegranate: fast and flexible probabilistic modeling i...PyData
 
Network embedding
Network embeddingNetwork embedding
Network embeddingSOYEON KIM
 
A Survey of Source Authentication Schemes for Multicast transfer in Adhoc Net...
A Survey of Source Authentication Schemes for Multicast transfer in Adhoc Net...A Survey of Source Authentication Schemes for Multicast transfer in Adhoc Net...
A Survey of Source Authentication Schemes for Multicast transfer in Adhoc Net...ijsrd.com
 
A Proposed Algorithm to Detect the Largest Community Based On Depth Level
A Proposed Algorithm to Detect the Largest Community Based On Depth LevelA Proposed Algorithm to Detect the Largest Community Based On Depth Level
A Proposed Algorithm to Detect the Largest Community Based On Depth LevelEswar Publications
 

What's hot (16)

Scale-Free Networks to Search in Unstructured Peer-To-Peer Networks
Scale-Free Networks to Search in Unstructured Peer-To-Peer NetworksScale-Free Networks to Search in Unstructured Peer-To-Peer Networks
Scale-Free Networks to Search in Unstructured Peer-To-Peer Networks
 
Finding important nodes in social networks based on modified pagerank
Finding important nodes in social networks based on modified pagerankFinding important nodes in social networks based on modified pagerank
Finding important nodes in social networks based on modified pagerank
 
An Improved PageRank Algorithm for Multilayer Networks
An Improved PageRank Algorithm for Multilayer NetworksAn Improved PageRank Algorithm for Multilayer Networks
An Improved PageRank Algorithm for Multilayer Networks
 
Identifying Most Relevant Node Path To Increase Connection Probability In Gra...
Identifying Most Relevant Node Path To Increase Connection Probability In Gra...Identifying Most Relevant Node Path To Increase Connection Probability In Gra...
Identifying Most Relevant Node Path To Increase Connection Probability In Gra...
 
UTILIZING XAI TECHNIQUE TO IMPROVE AUTOENCODER BASED MODEL FOR COMPUTER NETWO...
UTILIZING XAI TECHNIQUE TO IMPROVE AUTOENCODER BASED MODEL FOR COMPUTER NETWO...UTILIZING XAI TECHNIQUE TO IMPROVE AUTOENCODER BASED MODEL FOR COMPUTER NETWO...
UTILIZING XAI TECHNIQUE TO IMPROVE AUTOENCODER BASED MODEL FOR COMPUTER NETWO...
 
Network Flow
Network FlowNetwork Flow
Network Flow
 
Social network analysis
Social network analysisSocial network analysis
Social network analysis
 
Deepwalk vs Node2vec
Deepwalk vs Node2vecDeepwalk vs Node2vec
Deepwalk vs Node2vec
 
Privacy Preserving Reputation Calculation in P2P Systems with Homomorphic Enc...
Privacy Preserving Reputation Calculation in P2P Systems with Homomorphic Enc...Privacy Preserving Reputation Calculation in P2P Systems with Homomorphic Enc...
Privacy Preserving Reputation Calculation in P2P Systems with Homomorphic Enc...
 
Clustering Algorithms for Data Stream
Clustering Algorithms for Data StreamClustering Algorithms for Data Stream
Clustering Algorithms for Data Stream
 
Sub-Graph Finding Information over Nebula Networks
Sub-Graph Finding Information over Nebula NetworksSub-Graph Finding Information over Nebula Networks
Sub-Graph Finding Information over Nebula Networks
 
Link Prediction in Evolving Networks Base on Information Propagation Screenshots
Link Prediction in Evolving Networks Base on Information Propagation ScreenshotsLink Prediction in Evolving Networks Base on Information Propagation Screenshots
Link Prediction in Evolving Networks Base on Information Propagation Screenshots
 
Maxwell W Libbrecht - pomegranate: fast and flexible probabilistic modeling i...
Maxwell W Libbrecht - pomegranate: fast and flexible probabilistic modeling i...Maxwell W Libbrecht - pomegranate: fast and flexible probabilistic modeling i...
Maxwell W Libbrecht - pomegranate: fast and flexible probabilistic modeling i...
 
Network embedding
Network embeddingNetwork embedding
Network embedding
 
A Survey of Source Authentication Schemes for Multicast transfer in Adhoc Net...
A Survey of Source Authentication Schemes for Multicast transfer in Adhoc Net...A Survey of Source Authentication Schemes for Multicast transfer in Adhoc Net...
A Survey of Source Authentication Schemes for Multicast transfer in Adhoc Net...
 
A Proposed Algorithm to Detect the Largest Community Based On Depth Level
A Proposed Algorithm to Detect the Largest Community Based On Depth LevelA Proposed Algorithm to Detect the Largest Community Based On Depth Level
A Proposed Algorithm to Detect the Largest Community Based On Depth Level
 

Similar to Link Prediction in Aligned Heterogeneous Social Networks

2013 KDD conference presentation--"Multi-Label Relational Neighbor Classifica...
2013 KDD conference presentation--"Multi-Label Relational Neighbor Classifica...2013 KDD conference presentation--"Multi-Label Relational Neighbor Classifica...
2013 KDD conference presentation--"Multi-Label Relational Neighbor Classifica...Xi Wang
 
LPCNN: convolutional neural network for link prediction based on network stru...
LPCNN: convolutional neural network for link prediction based on network stru...LPCNN: convolutional neural network for link prediction based on network stru...
LPCNN: convolutional neural network for link prediction based on network stru...TELKOMNIKA JOURNAL
 
Chapter 10 link prediction
Chapter 10 link predictionChapter 10 link prediction
Chapter 10 link predictionAbanobZakaria1
 
MSCX2023_Sergio Gomez_PartI
MSCX2023_Sergio Gomez_PartIMSCX2023_Sergio Gomez_PartI
MSCX2023_Sergio Gomez_PartImscx
 
Clique-based Network Clustering
Clique-based Network ClusteringClique-based Network Clustering
Clique-based Network ClusteringGuang Ouyang
 
Towards controlling evolutionary dynamics through network geometry: some very...
Towards controlling evolutionary dynamics through network geometry: some very...Towards controlling evolutionary dynamics through network geometry: some very...
Towards controlling evolutionary dynamics through network geometry: some very...Kolja Kleineberg
 
Jeffrey xu yu large graph processing
Jeffrey xu yu large graph processingJeffrey xu yu large graph processing
Jeffrey xu yu large graph processingjins0618
 
Graph Representation Learning
Graph Representation LearningGraph Representation Learning
Graph Representation LearningJure Leskovec
 
Overlapping community detection in Large-Scale Networks using BigCLAM model b...
Overlapping community detection in Large-Scale Networks using BigCLAM model b...Overlapping community detection in Large-Scale Networks using BigCLAM model b...
Overlapping community detection in Large-Scale Networks using BigCLAM model b...Thang Nguyen
 
Using Set Cover to Optimize a Large-Scale Low Latency Distributed Graph
Using Set Cover to Optimize a Large-Scale Low Latency Distributed GraphUsing Set Cover to Optimize a Large-Scale Low Latency Distributed Graph
Using Set Cover to Optimize a Large-Scale Low Latency Distributed GraphRui Wang
 
RNN and its applications
RNN and its applicationsRNN and its applications
RNN and its applicationsSungjoon Choi
 
Social Network Analysis: What It Is, Why We Should Care, and What We Can Lear...
Social Network Analysis: What It Is, Why We Should Care, and What We Can Lear...Social Network Analysis: What It Is, Why We Should Care, and What We Can Lear...
Social Network Analysis: What It Is, Why We Should Care, and What We Can Lear...Xiaohan Zeng
 
Centrality Prediction in Mobile Social Networks
Centrality Prediction in Mobile Social NetworksCentrality Prediction in Mobile Social Networks
Centrality Prediction in Mobile Social NetworksIJERA Editor
 
NS-CUK Seminar: H.E.Lee, Review on "Structural Deep Embedding for Hyper-Netw...
NS-CUK Seminar: H.E.Lee,  Review on "Structural Deep Embedding for Hyper-Netw...NS-CUK Seminar: H.E.Lee,  Review on "Structural Deep Embedding for Hyper-Netw...
NS-CUK Seminar: H.E.Lee, Review on "Structural Deep Embedding for Hyper-Netw...ssuser4b1f48
 
IRJET- Link Prediction in Social Networks
IRJET- Link Prediction in Social NetworksIRJET- Link Prediction in Social Networks
IRJET- Link Prediction in Social NetworksIRJET Journal
 
How Graphs are Changing AI
How Graphs are Changing AIHow Graphs are Changing AI
How Graphs are Changing AINeo4j
 
Mining co-expression network
Mining co-expression networkMining co-expression network
Mining co-expression networktuxette
 
Synthetic Data Generation using exponential random Graph modeling
Synthetic Data Generation using exponential random Graph modelingSynthetic Data Generation using exponential random Graph modeling
Synthetic Data Generation using exponential random Graph modelingGraph-TA
 
Microtask Crowdsourcing Applications for Linked Data
Microtask Crowdsourcing Applications for Linked DataMicrotask Crowdsourcing Applications for Linked Data
Microtask Crowdsourcing Applications for Linked DataEUCLID project
 
NS-CUK Joint Journal Club : S.T.Nguyen, Review on "Graph Neural Networks for ...
NS-CUK Joint Journal Club : S.T.Nguyen, Review on "Graph Neural Networks for ...NS-CUK Joint Journal Club : S.T.Nguyen, Review on "Graph Neural Networks for ...
NS-CUK Joint Journal Club : S.T.Nguyen, Review on "Graph Neural Networks for ...ssuser4b1f48
 

Similar to Link Prediction in Aligned Heterogeneous Social Networks (20)

2013 KDD conference presentation--"Multi-Label Relational Neighbor Classifica...
2013 KDD conference presentation--"Multi-Label Relational Neighbor Classifica...2013 KDD conference presentation--"Multi-Label Relational Neighbor Classifica...
2013 KDD conference presentation--"Multi-Label Relational Neighbor Classifica...
 
LPCNN: convolutional neural network for link prediction based on network stru...
LPCNN: convolutional neural network for link prediction based on network stru...LPCNN: convolutional neural network for link prediction based on network stru...
LPCNN: convolutional neural network for link prediction based on network stru...
 
Chapter 10 link prediction
Chapter 10 link predictionChapter 10 link prediction
Chapter 10 link prediction
 
MSCX2023_Sergio Gomez_PartI
MSCX2023_Sergio Gomez_PartIMSCX2023_Sergio Gomez_PartI
MSCX2023_Sergio Gomez_PartI
 
Clique-based Network Clustering
Clique-based Network ClusteringClique-based Network Clustering
Clique-based Network Clustering
 
Towards controlling evolutionary dynamics through network geometry: some very...
Towards controlling evolutionary dynamics through network geometry: some very...Towards controlling evolutionary dynamics through network geometry: some very...
Towards controlling evolutionary dynamics through network geometry: some very...
 
Jeffrey xu yu large graph processing
Jeffrey xu yu large graph processingJeffrey xu yu large graph processing
Jeffrey xu yu large graph processing
 
Graph Representation Learning
Graph Representation LearningGraph Representation Learning
Graph Representation Learning
 
Overlapping community detection in Large-Scale Networks using BigCLAM model b...
Overlapping community detection in Large-Scale Networks using BigCLAM model b...Overlapping community detection in Large-Scale Networks using BigCLAM model b...
Overlapping community detection in Large-Scale Networks using BigCLAM model b...
 
Using Set Cover to Optimize a Large-Scale Low Latency Distributed Graph
Using Set Cover to Optimize a Large-Scale Low Latency Distributed GraphUsing Set Cover to Optimize a Large-Scale Low Latency Distributed Graph
Using Set Cover to Optimize a Large-Scale Low Latency Distributed Graph
 
RNN and its applications
RNN and its applicationsRNN and its applications
RNN and its applications
 
Social Network Analysis: What It Is, Why We Should Care, and What We Can Lear...
Social Network Analysis: What It Is, Why We Should Care, and What We Can Lear...Social Network Analysis: What It Is, Why We Should Care, and What We Can Lear...
Social Network Analysis: What It Is, Why We Should Care, and What We Can Lear...
 
Centrality Prediction in Mobile Social Networks
Centrality Prediction in Mobile Social NetworksCentrality Prediction in Mobile Social Networks
Centrality Prediction in Mobile Social Networks
 
NS-CUK Seminar: H.E.Lee, Review on "Structural Deep Embedding for Hyper-Netw...
NS-CUK Seminar: H.E.Lee,  Review on "Structural Deep Embedding for Hyper-Netw...NS-CUK Seminar: H.E.Lee,  Review on "Structural Deep Embedding for Hyper-Netw...
NS-CUK Seminar: H.E.Lee, Review on "Structural Deep Embedding for Hyper-Netw...
 
IRJET- Link Prediction in Social Networks
IRJET- Link Prediction in Social NetworksIRJET- Link Prediction in Social Networks
IRJET- Link Prediction in Social Networks
 
How Graphs are Changing AI
How Graphs are Changing AIHow Graphs are Changing AI
How Graphs are Changing AI
 
Mining co-expression network
Mining co-expression networkMining co-expression network
Mining co-expression network
 
Synthetic Data Generation using exponential random Graph modeling
Synthetic Data Generation using exponential random Graph modelingSynthetic Data Generation using exponential random Graph modeling
Synthetic Data Generation using exponential random Graph modeling
 
Microtask Crowdsourcing Applications for Linked Data
Microtask Crowdsourcing Applications for Linked DataMicrotask Crowdsourcing Applications for Linked Data
Microtask Crowdsourcing Applications for Linked Data
 
NS-CUK Joint Journal Club : S.T.Nguyen, Review on "Graph Neural Networks for ...
NS-CUK Joint Journal Club : S.T.Nguyen, Review on "Graph Neural Networks for ...NS-CUK Joint Journal Club : S.T.Nguyen, Review on "Graph Neural Networks for ...
NS-CUK Joint Journal Club : S.T.Nguyen, Review on "Graph Neural Networks for ...
 

Link Prediction in Aligned Heterogeneous Social Networks

  • 1. Link Prediction in (Partially) Aligned Heterogeneous Social Networks By: Sina Sajadmanesh Advisor: Dr. Hamid Reza Rabiee
  • 2. 2 DMLLink Prediction DMLDMLLink Prediction2 Outline Introduction  Problem Formulation  Applications Link Prediction in Homogeneous Networks Link Prediction in Heterogeneous Networks Link Prediction in Aligned Networks Future Works
  • 3. 3 DMLLink Prediction DMLDMLLink Prediction3 Introduction Problem  Based on a snapshot of network, predicting the set of potential links to be formed in the future is formally defined as Link Prediction Problem.  First proposed by Liben- Novel and Klienberg in CIKM 2003
  • 4. 4 DMLLink Prediction DML4 Applications Social networks and E-commerce  Recommender systems  Friend recommendation  Bioinformatics  Prioritization of candidate disease genes  Drug discovery  Security  Identify missing links between criminals  Controlling computer viruses
  • 5. 5 DMLLink Prediction DMLDMLLink Prediction5 Outline Introduction Link Prediction in Homogeneous Networks  Unsupervised methods  Supervised methods Link Prediction in Heterogeneous Networks Link Prediction in Aligned Networks Future Works
  • 6. 6 DMLLink Prediction DMLDMLLink Prediction6 Link Prediction in Homogeneous Networks Homogeneous Network 𝐺 = (𝑉, 𝐸)  If V contains one single type nodes and E contains one single type of links, then G is a homogeneous network
  • 7. 7 DMLLink Prediction DMLDMLLink Prediction7 Link Prediction in Homogeneous Networks Unsupervised methods  Measuring the closeness among nodes  Assuming that close nodes are more likely to be connected Unsupervised Link Predicators  Local neighbor based predicators  Path based predicators  Random-Walk based predicators
  • 8. 8 DMLLink Prediction DMLDMLLink Prediction8 Unsupervised Methods Local Neighbor based Link Predicators  Preferential Attachment 𝑃𝐴 𝑢, 𝑣 = Γ(𝑢) Γ(𝑣) user 𝑢 neighbor Γ(𝑢)
  • 9. 9 DMLLink Prediction DMLDMLLink Prediction9 Unsupervised Methods Local Neighbor based Link Predicators  Common Neighbor 𝐶𝑁 𝑢, 𝑣 = Γ(𝑢) ∩ Γ(𝑣) user 𝑢 neighbor Γ(𝑢)
  • 10. 10 DMLLink Prediction DMLDMLLink Prediction10 Unsupervised Methods Local Neighbor based Link Predicators  Jaccard Coefficient 𝐽𝐶 𝑢, 𝑣 = Γ(𝑢) ∩ Γ(𝑣) Γ(𝑢) ∪ Γ(𝑣)  Adamic/Adar 𝐴𝐴 𝑢, 𝑣 = 𝑤∈(Γ(𝑢)∩Γ(𝑣)) 1 log( Γ(𝑤) )  Resource Allocation 𝑅𝐴 𝑢, 𝑣 = 𝑤∈(Γ(𝑢)∩Γ(𝑣)) 1 Γ(𝑤)
  • 11. 11 DMLLink Prediction DMLDMLLink Prediction11 Unsupervised Methods Path based Link Predicators  Shortest Path 𝑆𝑃 𝑢, 𝑣 = min{ 𝑃𝑢↝𝑣 }  Katz Score 𝐾𝑎𝑡𝑧 𝑢, 𝑣 = 𝑙=1 ∞ 𝛽 𝑙 𝑃𝑢↝𝑣 𝑙 = (𝐼 − 𝛽𝐴)−1−𝐼
  • 12. 12 DMLLink Prediction DMLDMLLink Prediction12 Unsupervised Methods Random-Walk based Link Predicators  Hitting Time 𝐻𝑇 𝑢, 𝑣 = 1 + 𝑤∈Γ(𝑢) 𝑃𝑢,𝑤 𝐻𝑇(𝑤, 𝑣)  Commute Time 𝐶𝑇 𝑢, 𝑣 = 𝐻𝑇 𝑢, 𝑣 + 𝐻𝑇(𝑣, 𝑢)
  • 13. 13 DMLLink Prediction DMLDMLLink Prediction13 Link Prediction in Homogeneous Networks Other Unsupervised Methods  Matrix Factorization methods  Maximum Likelihood Methods  Probabilistic Methods
  • 14. 14 DMLLink Prediction DMLDMLLink Prediction14 Link Prediction in Homogeneous Networks Supervised Link Prediction  Learning a binary classifier that will predict whether a link exists between a given pair of nodes or not.  First proposed by Hassan et. al in SDM 2006
  • 15. 15 DMLLink Prediction DMLDMLLink Prediction15 Supervised Link Prediction Dataset  We need two snapshot of the network for training Positive Samples  Links that are missing in former snapshot, but are formed in latter snapshot Negative Samples  Links that are missing in both snapshots
  • 16. 16 DMLLink Prediction DMLDMLLink Prediction16 Supervised Link Prediction Features  Topological Features  Attribute-based Features Classifiers  SVM  Decision Trees  Multilayer Perceptron  KNN  Naïve Bayes
  • 17. 17 DMLLink Prediction DMLDMLLink Prediction17 Outline Introduction Link Prediction in Homogeneous Networks Link Prediction in Heterogeneous Networks  Relationship Prediction  Collective Link Prediction Link Prediction in Aligned Networks Future Works
  • 18. 18 DMLLink Prediction DMLDMLLink Prediction18 Link Prediction in Heterogeneous Networks Heterogeneous Network 𝐺 = (𝑉, 𝐸)  𝑉 = 𝑖 𝑉𝑖 is the sets of various kinds of nodes and 𝑉𝑖 is the 𝑖 𝑡ℎ kind of nodes in G  𝐸 = 𝑗 𝐸𝑗 is the sets of various types of links and 𝐸𝑗 is the 𝑗𝑡ℎ kind of links in G
  • 19. 19 DMLLink Prediction DMLDMLLink Prediction19 Link Prediction in Heterogeneous Networks Heterogeneous Network Schema
  • 20. 20 DMLLink Prediction DMLDMLLink Prediction20 Link Prediction in Heterogeneous Networks From Link Prediction to Relationship Prediction  A relationship between two objects could be a composition of two or more links • E.g. two authors have a co-author relationship if they have co-written a paper  Need to redesign topological features in heterogeneous networks
  • 21. 21 DMLLink Prediction DMLDMLLink Prediction21 Supervised Relationship Prediction Feature Extraction  Heterogeneous Features • Based on heterogeneous structure of network  Meta-Path based Features • Uses the concepts of meta-paths • Different meta paths represent different semantic meanings • Number of path instanced of a meta-path Φ
  • 22. 22 DMLLink Prediction DMLDMLLink Prediction22 Supervised Relationship Prediction Case Study  Social Link Prediction
  • 23. 23 DMLLink Prediction DMLDMLLink Prediction23 Supervised Relationship Prediction Heterogeneous Features  Social Features • Common Neighbor • Jaccard Coefficient • Adamic/Adar  Spatial Features • Common Locations • Jaccard Coefficient of Common Locations • Average Geographic distance of locations • …
  • 24. 24 DMLLink Prediction DMLDMLLink Prediction24 Supervised Relationship Prediction Heterogeneous Features  Temporal Features • Let T(u) be 24 hour activity vector of user u • Inner product of T(u) and T(v) • Cosine similarity of T(u) and T(v)  Text Content Features • Let w(u) be the bag-of-words vector of user u weighted by TF-IDF • Inner product of w(u) and w(v) • Cosine similarity of w(u) and w(v)
  • 25. 25 DMLLink Prediction DMLDMLLink Prediction25 Supervised Relationship Prediction Meta-Path based Features  Φ1: Follower of Follower • UUU  Φ2: Common Out Neighbor • UUU  Φ3: Common In Neighbor • UUU  Φ4: Common Words • UPWPU  Φ5: Common Timestamps • UPTPU  Φ6: Common Location Check-ins • UPLPU
  • 26. 26 DMLLink Prediction DMLDMLLink Prediction26 Link Prediction in Heterogeneous Networks Collective Link Prediction  Conventional link prediction approaches assume that links are independent identically distributed (i.i.d).  But in heterogeneous networks, different type of links are correlated and mutually influential.
  • 27. 27 DMLLink Prediction DMLDMLLink Prediction27 Link Prediction in Heterogeneous Networks Input Social Network
  • 28. 28 DMLLink Prediction DMLDMLLink Prediction28 Link Prediction in Heterogeneous Networks Independent Social and Location Link Prediction
  • 29. 29 DMLLink Prediction DMLDMLLink Prediction29 Link Prediction in Heterogeneous Networks Traditional Link Prediction 𝑌𝑠 = 𝑎𝑟𝑔 max 𝑌𝑠 𝑃(𝑦 𝐿 𝑠 = 𝑌𝑠) 𝑌𝑙 = 𝑎𝑟𝑔 max 𝑌 𝑙 𝑃(𝑦 𝐿𝑙 = 𝑌𝑙)  𝐿 𝑠 and 𝐿𝑙 are the sets of potential social and location links  𝑃 𝑦 𝐿 𝑠 = 𝑌𝑠 is the probability scores achieved when links in 𝐿 𝑠 are assigned with Labels 𝑌𝑠  𝑌𝑠 and 𝑌𝑙 are the sets of optimal labels
  • 30. 30 DMLLink Prediction DMLDMLLink Prediction30 Link Prediction in Heterogeneous Networks Collective Link Prediction 𝑌𝑠, 𝑌𝑙 = 𝑎𝑟𝑔 max 𝑌𝑠,𝑌 𝑙 𝑃(𝑦 𝐿 𝑠 = 𝑌𝑠|𝑦 𝐿𝑙 = 𝑌𝑙) × 𝑃(𝑦 𝐿𝑙 = 𝑌𝑙|𝑦 𝐿 𝑠 = 𝑌𝑠) 𝑌𝑠 (𝑡) = 𝑎𝑟𝑔 max 𝑌𝑠 𝑃(𝑦 𝐿 𝑠 = 𝑌𝑠 | 𝑦 𝐿 𝑠 = 𝑌𝑠 𝑡−1 , 𝑦 𝐿𝑙 = 𝑌𝑙 𝑡−1 ) 𝑌𝑙 (𝑡) = 𝑎𝑟𝑔 max 𝑌 𝑙 𝑃(𝑦 𝐿𝑙 = 𝑌𝑙 | 𝑦 𝐿 𝑠 = 𝑌𝑠 𝑡 , 𝑦 𝐿𝑙 = 𝑌𝑙 𝑡−1 ) IterativeSolution
  • 31. 31 DMLLink Prediction DMLDMLLink Prediction31 Outline Introduction Link Prediction in Homogeneous Networks Link Prediction in Heterogeneous Networks Link Prediction in Aligned Networks  Link Transfer  Anchor Link Inference Future Works
  • 32. 32 DMLLink Prediction DMLDMLLink Prediction32 Link Prediction in Aligned Networks Multi Aligned Heterogeneous Social Networks
  • 33. 33 DMLLink Prediction DMLDMLLink Prediction33 Link Prediction in Aligned Networks Link Transfer  Information Sparsity • New Network Problem • Cold Start Problem  Need to transfer knowledge from another domain • Transfer Learning • Use aligned network as a source  Context • Fully aligned networks • Partially aligned networks
  • 34. 34 DMLLink Prediction DMLDMLLink Prediction34 Link Prediction in Aligned Networks Link Transfer across Fully Aligned Networks  Prediction using only the target network: 𝑃 𝑦 𝑢 𝑡, 𝑣 𝑡 = 1|𝐺 𝑡 = 𝑃 𝑦 𝑢 𝑡, 𝑣 𝑡 = 1|𝑥 𝑢 𝑡, 𝑣 𝑡  Prediction using source and target network: 𝑃 𝑦 𝑢 𝑡, 𝑣 𝑡 = 1|𝐺 𝑡, 𝐺 𝑠 = 𝑃 𝑦 𝑢 𝑡 , 𝑣 𝑡 = 1| 𝑥 𝑢 𝑡 , 𝑣 𝑡 𝑇 , 𝑥 𝑢 𝑠 , 𝑣 𝑠 𝑇 , 𝑦 𝑢 𝑠 , 𝑣 𝑠 𝑇
  • 35. 35 DMLLink Prediction DMLDMLLink Prediction35 Link Prediction in Aligned Networks Link Transfer across Partially Aligned Networks  Solution: Inter-Network Meta-Paths  Let 𝛾 𝑈 𝑡, 𝑈 𝑠 be the anchor meta-path  Ψ1: 𝛾 𝑈 𝑡, 𝑈 𝑠 − Φ 𝑈 𝑠, 𝑈 𝑠 − 𝛾 𝑈 𝑠, 𝑈 𝑡  Ψ2: Φ 𝑈 𝑡, 𝑈 𝑡 − 𝛾 𝑈 𝑡, 𝑈 𝑠 − Φ 𝑈 𝑠, 𝑈 𝑠 − 𝛾 𝑈 𝑠, 𝑈 𝑡  Ψ3: 𝛾 𝑈 𝑡, 𝑈 𝑠 − Φ 𝑈 𝑠, 𝑈 𝑠 − 𝛾 𝑈 𝑠, 𝑈 𝑡 − Φ 𝑈 𝑡, 𝑈 𝑡  Ψ3: Φ 𝑈 𝑡 , 𝑈 𝑡 − 𝛾 𝑈 𝑡 , 𝑈 𝑠 − Φ 𝑈 𝑠 , 𝑈 𝑠 − 𝛾 𝑈 𝑠, 𝑈 𝑡 − Φ 𝑈 𝑡, 𝑈 𝑡
  • 36. 36 DMLLink Prediction DMLDMLLink Prediction36 Link Prediction in Aligned Networks Anchor Link Inference
  • 37. 37 DMLLink Prediction DMLDMLLink Prediction37 Anchor Link Inference Supervised Method  Social Features • Extended Common Neighbors • Extended Jaccard Coefficient • Extended Adamic/Adar  Spatial Features  Temporal Features  Text Content Features
  • 38. 38 DMLLink Prediction DMLDMLLink Prediction38 Anchor Link Inference Inference w.r.t One-to-One Constraint  Predicted Scores
  • 39. 39 DMLLink Prediction DMLDMLLink Prediction39 Anchor Link Inference Inference w.r.t One-to-One Constraint  Conventional Link Prediction
  • 40. 40 DMLLink Prediction DMLDMLLink Prediction40 Anchor Link Inference Inference w.r.t One-to-One Constraint  Max sum of scores
  • 41. 41 DMLLink Prediction DMLDMLLink Prediction41 Anchor Link Inference Inference w.r.t One-to-One Constraint  Stable Matching
  • 42. 42 DMLLink Prediction DMLDMLLink Prediction42 Outline Introduction Link Prediction in Homogeneous Networks Link Prediction in Heterogeneous Networks Link Prediction in Aligned Networks Future Works
  • 43. 43 DMLLink Prediction DMLDMLLink Prediction43 Future Works Anchor Link Formation Prediction  Predicting whether a user in the source network will join the target network in the future  This depends on the amount of influence he/she receives from the target network  An influence model can be learned using the training data  Positive-Unlabeled learning can improve the prediction performance
  • 44. 44 DMLLink Prediction DMLDMLLink Prediction44 References [1] L. Lü and T. Zhou, "Link prediction in complex networks: A survey," Physica A: Statistical Mechanics and its Applications, vol. 390, pp. 1150-1170, 2011. [2] M. Al Hasan, V. Chaoji, S. Salem, and M. Zaki, "Link prediction using supervised learning," in SDM’06: Workshop on Link Analysis, Counter-terrorism and Security, 2006. [3] J. Zhang and S. Y. Philip, "Link Prediction across Heterogeneous Social Networks: A Survey," 2014. [4] Y. Sun, R. Barber, M. Gupta, C. C. Aggarwal, and J. Han, "Co-author relationship prediction in heterogeneous bibliographic networks," in Advances in Social Networks Analysis and Mining (ASONAM), 2011 International Conference on, 2011, pp. 121-128. [5] X. Kong, J. Zhang, and P. S. Yu, "Inferring anchor links across multiple heterogeneous social networks," in Proceedings of the 22nd ACM international conference on Conference on information & knowledge management, 2013, pp. 179- 188.
  • 45. Q&A