Representation Learning on Graphs with
Complex Structures
Prof. Dr. Philippe Cudré-Mauroux
eXascale Infolab, U. of Fribourg–Switzerland
DL4G-SDE @ WWW2019
San Francisco, May 13, 2019
Representation Learning on Graphs
■ Projecting nodes of a graph onto a vector space while preserving key
structural properties of the graph (e.g., topological proximity of the nodes)
8/5/192 WWW2019@San Francisco
DeepWalk¹
[Figure: graph → neural embedding techniques (e.g., word2vec) → one embedding vector per node]
0.19 0.32 1.89 1.21 0.87
0.67 0.45 1.76 1.42 0.98
1.32 0.77 1.11 1.29 1.31
…
¹ Perozzi, Bryan, Rami Al-Rfou, and Steven Skiena. "DeepWalk: Online learning of social representations." In Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 701-710. ACM, 2014.
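As a rough illustration of the DeepWalk idea cited on this slide, the sketch below generates truncated uniform random walks that could then be fed to a word2vec-style model as "sentences". The toy graph, the function names, and the parameter values are assumptions made for illustration; the embedding training step itself is omitted.

```python
import random

def random_walk(adj, start, length, rng):
    """Uniform truncated random walk over an adjacency-list graph."""
    walk = [start]
    while len(walk) < length:
        neighbors = adj[walk[-1]]
        if not neighbors:  # dead end: stop the walk early
            break
        walk.append(rng.choice(neighbors))
    return walk

def build_corpus(adj, walks_per_node, length, seed=0):
    """Start `walks_per_node` walks from every node, in shuffled order."""
    rng = random.Random(seed)
    corpus = []
    for _ in range(walks_per_node):
        nodes = list(adj)
        rng.shuffle(nodes)
        for node in nodes:
            corpus.append(random_walk(adj, node, length, rng))
    return corpus

# Hypothetical 5-node toy graph (undirected, stored as adjacency lists).
adj = {1: [2], 2: [1, 3], 3: [2, 4, 5], 4: [3], 5: [3]}
corpus = build_corpus(adj, walks_per_node=2, length=5)
```

Each walk in `corpus` plays the role of a sentence, and each node the role of a word, when training a SkipGram-style model.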
What if the graph at hand exhibits
a much more complex structure?
Outline
■ JUST: Embedding heterogeneous graphs without meta-paths [CIKM’18]
■ LBSN2Vec: Embedding heterogeneous hypergraphs from LBSNs [WWW’19]
■ NodeSketch: Highly-efficient graph embeddings via recursive sketching [KDD’19]
Heterogeneous Graphs
■ Heterogeneous Graphs contain multiple node types:
● Homogeneous edges: linking nodes from the same domain
● Heterogeneous edges: linking nodes across different domains
Meta-Paths in Heterogeneous Graphs
■ A meta-path is a sequence of node types encoding key composite relations among the
involved node types.
■ Meta-paths are used to guide random walks to redefine the neighborhood of a node.
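To make the idea of a meta-path-guided walk concrete, here is a minimal sketch in which each step may only move to a neighbor whose type matches the next slot of the meta-path. The tiny author/paper/venue graph, the type labels, and the function name are hypothetical, and the meta-path is assumed to be cyclic (its first and last types coincide).

```python
import random

def metapath_walk(adj, node_type, start, metapath, length, rng):
    """Random walk restricted to neighbors matching the next meta-path type."""
    if node_type[start] != metapath[0]:
        raise ValueError("start node must match the first meta-path type")
    walk = [start]
    pos = 0
    while len(walk) < length:
        # Advance along the meta-path; wrap around after the last slot,
        # skipping slot 0 because it duplicates the final type.
        pos = pos + 1 if pos + 1 < len(metapath) else 1
        wanted = metapath[pos]
        candidates = [n for n in adj[walk[-1]] if node_type[n] == wanted]
        if not candidates:  # no neighbor of the required type
            break
        walk.append(rng.choice(candidates))
    return walk

# Hypothetical bibliographic graph: authors (A), papers (P), venues (V).
node_type = {"a1": "A", "a2": "A", "p1": "P", "p2": "P", "v1": "V"}
adj = {
    "a1": ["p1"],
    "a2": ["p1", "p2"],
    "p1": ["a1", "a2", "v1"],
    "p2": ["a2", "v1"],
    "v1": ["p1", "p2"],
}
walk = metapath_walk(adj, node_type, "a1", ["A", "P", "A"], 5, random.Random(0))
```

With the meta-path A-P-A the walk never visits the venue node, which shows how the chosen meta-path redefines a node's neighborhood.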
Metapath2vec¹
[Figure: meta-path-guided random walks → neural embedding techniques (e.g., word2vec) → one embedding vector per node]
¹ Yuxiao Dong, Nitesh V. Chawla, and Ananthram Swami. 2017. metapath2vec: Scalable representation learning for heterogeneous networks. In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 135–144.
Challenges with Meta-Paths
■ The choice of meta-paths strongly affects the quality of the learnt node embeddings for a specific task.
■ How to select meta-paths?
● The choice is graph-specific and depends heavily on prior knowledge from domain experts.
● Strategies to combine a set of meta-paths can be complex and computationally expensive.
Are meta-paths necessary?
JUST: Embedding Heterogeneous Graphs without Meta-Paths
■ Random walks with JUmp and STay strategies that probabilistically control the walk.
■ Two steps to balance the random walk:
● Step I: Jump or stay?
−Objective: balance the number of heterogeneous and homogeneous edges traversed during random walks (stay with probability 𝝰, with exponential decay).
● Step II: If jump, where to jump?
−Objective: control the randomness in choosing a target domain (memory window to favor diversity).
■ Learn node embeddings with the SkipGram model.
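The two steps above can be sketched as follows. This is one possible reading of the slide, not the authors' implementation: it assumes the stay probability decays as 𝝰 raised to the number of nodes visited consecutively in the current domain, and that the memory window is a fixed-size queue of recently visited domains that the jump tries to avoid. The graph, the names, and the parameter values are hypothetical.

```python
import random
from collections import deque

def just_walk(adj, domain, start, length, alpha, memory, rng):
    """Jump-and-stay random walk over a heterogeneous graph (illustrative)."""
    walk = [start]
    recent = deque([domain[start]], maxlen=memory)  # memory window of domains
    stay_run = 1  # nodes visited consecutively in the current domain
    while len(walk) < length:
        cur = walk[-1]
        same = [n for n in adj[cur] if domain[n] == domain[cur]]
        other = [n for n in adj[cur] if domain[n] != domain[cur]]
        if not same and not other:
            break
        # Step I: jump or stay (assumed exponential decay alpha ** stay_run).
        if not other:
            stay = True
        elif not same:
            stay = False
        else:
            stay = rng.random() < alpha ** stay_run
        if stay:
            nxt = rng.choice(same)
            stay_run += 1
        else:
            # Step II: prefer target domains outside the memory window.
            targets = sorted({domain[n] for n in other})
            fresh = [d for d in targets if d not in recent]
            target = rng.choice(fresh or targets)
            nxt = rng.choice([n for n in other if domain[n] == target])
            recent.append(target)
            stay_run = 1
        walk.append(nxt)
    return walk

# Hypothetical graph with authors (A), papers (P), and a venue (V).
domain = {"a1": "A", "a2": "A", "p1": "P", "p2": "P", "v1": "V"}
adj = {
    "a1": ["a2", "p1"],
    "a2": ["a1", "p1", "p2"],
    "p1": ["a1", "a2", "v1"],
    "p2": ["a2", "v1"],
    "v1": ["p1", "p2"],
}
always_stay = just_walk(adj, domain, "a1", 6, 1.0, 2, random.Random(1))
always_jump = just_walk(adj, domain, "a1", 6, 0.0, 2, random.Random(1))
```

Setting `alpha` to 1.0 keeps the walk inside the author domain, while 0.0 forces a jump at every step; intermediate values trade off the two edge types without any meta-path.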
Results
JUST achieves state-of-the-art performance without using meta-paths.
Node classification results
Runtime Performance
■ End-to-end node embedding learning time, in seconds, for all random-walk-based methods.

Method                    DBLP    Movie   Foursquare
DeepWalk                   236      333          484
Metapath2vec (original)    965   19,200        2,248
Metapath2vec (ours)        290      408          550
Hin2vec                    904    1,301        1,801
JUST                       310      442          616

• Compared to DeepWalk and Metapath2vec (ours), JUST adds a minor overhead in learning time, but achieves better results in classification and clustering tasks.
• Compared to Hin2vec, JUST achieves a roughly 3x speedup in learning time, and achieves better results in most experiments.
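The ratios behind these two statements can be checked directly from the table. The short sketch below only restates the reported timings; the variable and function names are illustrative.

```python
# Learning times in seconds, copied from the table (DBLP, Movie, Foursquare).
times = {
    "DeepWalk": [236, 333, 484],
    "Hin2vec": [904, 1301, 1801],
    "JUST": [310, 442, 616],
}

def ratio(slower, faster):
    """Per-dataset ratio of the `slower` method's time to the `faster` one's."""
    return [round(s / f, 2) for s, f in zip(times[slower], times[faster])]

speedup_vs_hin2vec = ratio("Hin2vec", "JUST")
overhead_vs_deepwalk = ratio("JUST", "DeepWalk")
```

On all three datasets the Hin2vec/JUST ratio is a little over 2.9, and JUST takes roughly 1.3 times as long as DeepWalk.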
Social Relationships vs. Human Mobility
How to quantify the impact of social relationships and mobility on each other?
Location-Based Social Networks
■ A hypergraph with
● Four data domains
−Spatial: POI
−Temporal: time slot
−Semantic: activity category
−Social: user
● Two types of links
−Friendships
−Check-ins (hyperedges)
Hypergraph Embedding
[Figure: LBSN hypergraph → graph embedding with neural embedding techniques (e.g., SkipGram) → one embedding vector per node]
0.19 0.32 1.89 1.21 0.87
0.67 0.45 1.76 1.42 0.98
1.32 0.77 1.11 1.29 1.31
0.45 0.89 1.56 0.02 0.79
…
1. How to sample from an LBSN hypergraph?
2. How to preserve n-wise proximity from hyperedges?
1. Sample from a Hypergraph: Random Walk with Stay
■ Balancing the impact of social relationships and mobility on the learnt embeddings
Sample and learn from
• A check-in hyperedge with probability 𝛼
• A user-user pair with probability (1-𝛼)
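The 𝛼 trade-off can be illustrated with a simplified sampler. Note that this sketch replaces the random walk with independent draws, so it only shows how 𝛼 balances the two kinds of training items; the `CheckIn` record, the function name, and the sample data are hypothetical.

```python
import random
from collections import namedtuple

# A check-in hyperedge links one node from each of the four data domains.
CheckIn = namedtuple("CheckIn", ["user", "time_slot", "poi", "category"])

def sample_training_item(checkins, friendships, alpha, rng):
    """Draw a check-in hyperedge with probability alpha, else a user-user pair."""
    if rng.random() < alpha:
        return ("checkin", rng.choice(checkins))
    return ("friendship", rng.choice(friendships))

checkins = [
    CheckIn("alice", "mon-09", "cafe", "coffee"),
    CheckIn("bob", "sat-21", "arena", "concert"),
]
friendships = [("alice", "bob")]

rng = random.Random(0)
only_mobility = [sample_training_item(checkins, friendships, 1.0, rng) for _ in range(20)]
only_social = [sample_training_item(checkins, friendships, 0.0, rng) for _ in range(20)]
```

With 𝛼 = 1 only check-ins are sampled and with 𝛼 = 0 only friendships; values in between mix social and mobility data in the stated proportions.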
2. Learn from Hyperedges: Learning via Best-Fit-Line
■ Maximizing the similarity between the nodes of a hyperedge and their
best-fit-line under cosine similarity.
1. Compute the best-fit-line
2. Maximize the cosine similarity between each node and the best-fit-line
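One way to read these two steps: among all unit vectors, the one that maximizes the summed cosine similarity to a set of node vectors is the normalized sum of the unit-normalized node vectors. The sketch below computes that direction and the per-node cosine similarities that training would then increase. It is an illustration under that assumption rather than the authors' code, and it assumes non-zero vectors whose normalized sum is also non-zero.

```python
import math

def normalize(v):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def best_fit_line(vectors):
    """Direction maximizing the summed cosine similarity to all vectors."""
    units = [normalize(v) for v in vectors]
    summed = [sum(col) for col in zip(*units)]
    return normalize(summed)

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    return sum(a * b for a, b in zip(normalize(u), normalize(v)))

# Hypothetical 2-d embeddings of the nodes in one hyperedge.
hyperedge = [[1.0, 0.0], [0.0, 1.0]]
line = best_fit_line(hyperedge)
score = sum(cosine(line, v) for v in hyperedge)
```

Here `line` points along the diagonal, and `score` is larger than what either node vector alone would achieve as the reference direction.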
Task I: Friendship Prediction
■ Comparison with other graph embedding techniques
● (S) Social network only
● (S&M) Social and mobility through clique expansion
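Clique expansion, which the baselines use to consume hyperedges, turns every hyperedge into pairwise edges between all of its nodes. A minimal sketch follows; the node labels and the function name are illustrative.

```python
from itertools import combinations

def clique_expansion(hyperedges):
    """Replace each hyperedge by the set of all pairwise edges among its nodes."""
    edges = set()
    for hyperedge in hyperedges:
        # Sort so that each undirected edge has one canonical orientation.
        for u, v in combinations(sorted(set(hyperedge)), 2):
            edges.add((u, v))
    return edges

# One hypothetical check-in: user, time slot, POI, activity category.
checkin = ("user:alice", "time:mon-09", "poi:cafe", "category:coffee")
edges = clique_expansion([checkin])
```

A four-node check-in becomes six ordinary edges, after which any classical graph embedding technique can be applied, at the cost of no longer treating the check-in as a single unit.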
[Figure: friendship prediction results, including baselines using clique expansion; ↑ 32.95% on precision@10]
Task II: Location Prediction
■ Comparison with other graph embedding techniques
● (M) Mobility (Check-in) network only
● (S&M) Social and mobility through clique expansion
[Figure: location prediction results; ↑ 25.32% on accuracy@10]
Balancing the Impact of Social Relationships and Mobility Matters!
Asymmetric impact of mobility and social relationships on predicting each other:
• Friendship prediction: 80% social and 20% mobility data
• Location prediction: 60% social and 40% mobility data
Graph Embeddings
■ Graph-sampling based techniques
● Sample node pairs from a graph, and preserve node proximity from the node pairs
● Examples: DeepWalk, Node2Vec, LINE, SDNE and VERSE, etc.
● Efficiency bottleneck: A large number of node pairs -> significant computation resources (CPU time)
■ Factorization based techniques
● Factorize a (transformed, e.g., high-order) proximity/adjacency matrix of a graph
● Examples: GraRep, HOPE and NetMF, etc.
● Efficiency bottleneck: Large matrix factorization -> significant computation resources (both CPU time and
RAM)
■ Node proximity preserved using cosine similarity
● Efficiency bottleneck: computing cosine similarity is less efficient than computing Hamming similarity, for example.
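To show why the two measures differ in cost, the sketch below puts them side by side: cosine similarity needs floating-point multiplications and square roots over real-valued embeddings, whereas Hamming similarity only counts matching positions between two sketches. The function names and sample inputs are illustrative.

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity between two real-valued, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def hamming_similarity(s, t):
    """Fraction of positions at which two equal-length sketches agree."""
    return sum(a == b for a, b in zip(s, t)) / len(s)

cos = cosine_similarity([0.19, 0.32, 1.89], [0.67, 0.45, 1.76])
ham = hamming_similarity([2, 1, 1], [2, 3, 1])
```

The Hamming computation involves only equality checks on integers, which is the source of the efficiency gap mentioned above.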
Similarity-Preserving Hashing/Sketching
■ Efficient similarity approximation of high dimensional data
● Data-dependent hashing (learning-to-hash)
−Learning dataset-specific hashing functions
−Examples: spectral hashing, iterative quantization, etc.
−Efficient in similarity computation, but requires learning hashing functions
● Data-independent hashing/sketching (locality sensitive hashing)
−Hashing without involving any learning process from data
−Examples: minhash, consistent weighted sampling, etc.
−Efficient in both similarity approximation and hashing
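As a small example of data-independent sketching, here is a minimal minhash sketch for sets. Each sketch position keeps the item with the smallest pseudo-random hash value, so the fraction of matching positions between two sketches estimates the Jaccard similarity of the sets. The seeded `random.Random` stand-in for a hash family and all names are assumptions made for illustration.

```python
import random

def hash_value(seed, item):
    """Reproducible pseudo-random value for an (seed, item) pair."""
    return random.Random(f"{seed}-{item}").random()

def minhash_sketch(items, length):
    """Keep, for each of `length` hash functions, the item with minimal hash."""
    return [min(items, key=lambda x: hash_value(j, x)) for j in range(length)]

def hamming_similarity(s, t):
    """Fraction of positions at which two sketches agree."""
    return sum(a == b for a, b in zip(s, t)) / len(s)

same = hamming_similarity(
    minhash_sketch({"a", "b", "c"}, 16), minhash_sketch({"c", "b", "a"}, 16)
)
disjoint = hamming_similarity(
    minhash_sketch({"a", "b"}, 16), minhash_sketch({"x", "y"}, 16)
)
```

No hash function is learned from the data: identical sets always produce identical sketches, and disjoint sets never agree at any position.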
Can we sketch nodes in a graph as embeddings?
Preliminary: Consistent Weighted Sampling¹
■ Principled techniques for highly-efficient similarity approximation
[Figure: a vector V of dimension D = 5 (1.32 2.77 1.11 3.29 1.31) is mapped to a sketch S = S1 … Sj … SL using random hash functions hj, j = 1, …, L]
The min-max similarity between the original data can be approximated by the Hamming similarity between sketches.
¹ Dingqi Yang, Bin Li, Laura Rettig, Philippe Cudré-Mauroux, "D2HistoSketch: Discriminative and Dynamic Similarity-Preserving Sketching of Streaming Histograms," IEEE Transactions on Knowledge and Data Engineering (TKDE), 2018.
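The slide does not spell out the sampling formula, so the sketch below assumes one common formulation: for each hash function h_j, the sketch element is the index i minimizing −log h_j(i) / V_i, with h_j(i) a uniform value in (0, 1). The seeded `random.Random` stand-in for h_j, the function names, and the 0-based indices are assumptions made for illustration.

```python
import math
import random

def hash_uniform(j, i):
    """Hypothetical hash h_j(i): reproducible value in (0, 1) per (j, i)."""
    u = random.Random(f"{j}-{i}").random()
    return u if u > 0.0 else 1e-12  # guard against log(0)

def sketch(vector, length):
    """Sketch a non-negative, non-zero vector into `length` index samples."""
    support = [i for i, w in enumerate(vector) if w > 0]
    return [
        min(support, key=lambda i: -math.log(hash_uniform(j, i)) / vector[i])
        for j in range(length)
    ]

def min_max_similarity(a, b):
    """Sum of element-wise minima divided by sum of element-wise maxima."""
    return sum(map(min, a, b)) / sum(map(max, a, b))

def hamming_similarity(s, t):
    """Fraction of positions at which two sketches agree."""
    return sum(x == y for x, y in zip(s, t)) / len(s)

v = [1.32, 2.77, 1.11, 3.29, 1.31]
s_v = sketch(v, 64)
no_overlap = hamming_similarity(sketch([1, 1, 0, 0], 64), sketch([0, 0, 1, 1], 64))
```

Larger weights shrink the sampled value and are therefore picked more often; vectors with disjoint supports have min-max similarity 0 and their sketches never agree.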
Sketching the Adjacency Matrix?
■ Adjacency matrix vs. Self-Loop-Augmented (SLA) adjacency matrix
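The contrast between the two matrices can be illustrated on a small example. In the plain adjacency matrix, two adjacent nodes without common neighbors have zero min-max similarity between their rows, whereas adding self-loops makes each node appear in its own row and in its neighbors' rows. The path graph and the function names below are hypothetical.

```python
def self_loop_augment(adj_matrix):
    """Return a copy of the adjacency matrix with ones on the diagonal."""
    n = len(adj_matrix)
    return [[1 if i == j else adj_matrix[i][j] for j in range(n)] for i in range(n)]

def min_max_similarity(a, b):
    """Sum of element-wise minima divided by sum of element-wise maxima."""
    return sum(map(min, a, b)) / sum(map(max, a, b))

# Hypothetical path graph 0 - 1 - 2 - 3.
A = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
SLA = self_loop_augment(A)

plain = min_max_similarity(A[1], A[2])        # adjacent nodes 1 and 2
augmented = min_max_similarity(SLA[1], SLA[2])
```

For the adjacent nodes 1 and 2, the similarity is 0 on the plain matrix and 0.5 on the SLA matrix, so sketches of SLA rows can reflect direct links between nodes.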
NodeSketch: Low-Order Node Embeddings
[Figure: example graph with nodes 1 to 5]
NodeSketch: High-Order Node Embeddings
[Figure: recursive sketching on an example graph with nodes 1 to 5; sketch length L = 3, weight α = 0.2]
SLA adjacency matrix Ã:
1 1 0 0 0
1 1 1 0 0
0 1 1 1 1
0 0 1 1 0
0 0 1 0 1
(𝑘-1)-order node embeddings S(k − 1), one row per node:
2 1 1
2 3 1
2 3 4
4 3 4
5 3 5
Example for node r = 1, whose only neighbor is node n = 2 (n ∈ Γ(r)):
1. SLA adjacency vector Ṽ^r of node 1: 1 1 0 0 0
2. (𝑘-1)-order sketch S^n(k − 1) of node 2: 2 3 1
3. Sketch element distribution, (1/L) Σ_{j=1}^{L} 𝟙[S_j^n(k − 1) = i], i = 1, …, D: 0.33 0.33 0.33 0 0
4. Weight by α = 0.2 and merge to get the approximate 𝑘-order SLA adjacency vector Ṽ^r(k): 1.066 1.066 0.066 0 0
5. Sketch Ṽ^r(k) using Eq. 3 to obtain the 𝑘-order embedding of node 1
𝑘-order node embeddings S(k), one row per node:
2 1 3
2 3 4
2 3 4
2 3 4
4 3 5
Uniformity of the generated samples: the foundation of our recursive sketching process
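The merge step of this example can be reproduced in a few lines. The sketch below adds, for every neighbor, α times the distribution of its (k−1)-order sketch elements to the node's SLA adjacency vector; the function name is hypothetical, and the subsequent sketching of the merged vector is left out.

```python
def merge_neighbor_sketches(sla_vector, neighbor_sketches, alpha):
    """Approximate k-order SLA adjacency vector from neighbors' sketches."""
    merged = [float(x) for x in sla_vector]
    for neighbor_sketch in neighbor_sketches:
        share = alpha / len(neighbor_sketch)  # alpha * (1 / L) per element
        for element in neighbor_sketch:
            merged[element - 1] += share  # sketch elements are 1-based ids
    return merged

# Values from the slide: node 1 has SLA vector 1 1 0 0 0 and a single
# neighbor (node 2) whose (k-1)-order sketch is 2 3 1; alpha = 0.2.
merged = merge_neighbor_sketches([1, 1, 0, 0, 0], [[2, 3, 1]], alpha=0.2)
```

The first three entries come out as 1 + 0.2/3, 1 + 0.2/3, and 0.2/3, which the slide shows as 1.066, 1.066, and 0.066; because sketch elements are uniform samples, their empirical distribution stands in for the neighbor's own vector.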
Results: Node Classification Performance using Kernel SVM
[Figure: node classification results, grouped into classical graph embedding techniques (preserving cosine similarity), learning-to-hash techniques, and sketching techniques]
NodeSketch shows comparable performance to the best-performing state-of-the-art techniques.
Results: Runtime Performance
NodeSketch is highly efficient and significantly outperforms all baselines in runtime, showing a 9x-273x speedup.
Hamming similarity is also more efficient than cosine similarity (1.19x-1.68x speedup).
Take-Away Messages
■ JUST: Meta-path-free heterogeneous graph embedding can achieve state-of-the-art performance efficiently. [CIKM’18]
■ LBSN2Vec: Social relationships and mobility have an asymmetric impact on each other. [WWW’19]
■ NodeSketch: High-quality node embeddings can be generated via highly-efficient sketching techniques. [KDD’19]
[CIKM’18] Rana Hussein, Dingqi Yang, and Philippe Cudré-Mauroux. “Are Meta-Paths Necessary? Revisiting Heterogeneous Graph Embeddings.” CIKM’18.
[WWW’19] Dingqi Yang, Bingqing Qu, Jie Yang, and Philippe Cudré-Mauroux. “Revisiting User Mobility and Social Relationships in LBSNs: A Hypergraph Embedding Approach.” WWW’19.
[KDD’19] Dingqi Yang, Paolo Rosso, Bin Li, and Philippe Cudré-Mauroux. “NodeSketch: Highly-Efficient Graph Embeddings via Recursive Sketching.” KDD’19.
Future Plan for Representation Learning on Graphs
■ Attributed graph structure (e.g., property graphs)
■ Heterogeneous data structures (e.g., structured knowledge graph + unstructured text)
■ Dynamic graphs (e.g., streaming LBSN graphs)

More Related Content

What's hot

Protocols for wireless sensor networks
Protocols for wireless sensor networks Protocols for wireless sensor networks
Protocols for wireless sensor networks DEBABRATASINGH3
 
Graph Representation Learning
Graph Representation LearningGraph Representation Learning
Graph Representation LearningJure Leskovec
 
mobile ad-hoc network (MANET) and its applications
mobile ad-hoc network (MANET) and its applicationsmobile ad-hoc network (MANET) and its applications
mobile ad-hoc network (MANET) and its applicationsAman Gupta
 
The Kernel Trick
The Kernel TrickThe Kernel Trick
The Kernel TrickEdgar Marca
 
Load balancing in cloud computing.pptx
Load balancing in cloud computing.pptxLoad balancing in cloud computing.pptx
Load balancing in cloud computing.pptxHitesh Mohapatra
 
Slide #1:Introduction to Apache Storm
Slide #1:Introduction to Apache StormSlide #1:Introduction to Apache Storm
Slide #1:Introduction to Apache StormMd. Shamsur Rahim
 
Building Topology in NS3
Building Topology in NS3Building Topology in NS3
Building Topology in NS3Rahul Hada
 
Introduction to Hadoop and Hadoop component
Introduction to Hadoop and Hadoop component Introduction to Hadoop and Hadoop component
Introduction to Hadoop and Hadoop component rebeccatho
 
Wireless sensor network and its application
Wireless sensor network and its applicationWireless sensor network and its application
Wireless sensor network and its applicationRoma Vyas
 
Fisheye State Routing (FSR) - Protocol Overview
Fisheye State Routing (FSR) - Protocol OverviewFisheye State Routing (FSR) - Protocol Overview
Fisheye State Routing (FSR) - Protocol OverviewYoav Francis
 

What's hot (20)

Clique
Clique Clique
Clique
 
Protocols for wireless sensor networks
Protocols for wireless sensor networks Protocols for wireless sensor networks
Protocols for wireless sensor networks
 
Graph Representation Learning
Graph Representation LearningGraph Representation Learning
Graph Representation Learning
 
Light trees
Light treesLight trees
Light trees
 
Ad hoc networks
Ad hoc networksAd hoc networks
Ad hoc networks
 
mobile ad-hoc network (MANET) and its applications
mobile ad-hoc network (MANET) and its applicationsmobile ad-hoc network (MANET) and its applications
mobile ad-hoc network (MANET) and its applications
 
The Kernel Trick
The Kernel TrickThe Kernel Trick
The Kernel Trick
 
Leach protocol
Leach protocolLeach protocol
Leach protocol
 
Load balancing in cloud computing.pptx
Load balancing in cloud computing.pptxLoad balancing in cloud computing.pptx
Load balancing in cloud computing.pptx
 
Hive(ppt)
Hive(ppt)Hive(ppt)
Hive(ppt)
 
Map Reduce
Map ReduceMap Reduce
Map Reduce
 
Slide #1:Introduction to Apache Storm
Slide #1:Introduction to Apache StormSlide #1:Introduction to Apache Storm
Slide #1:Introduction to Apache Storm
 
Cloud computing ppts
Cloud computing pptsCloud computing ppts
Cloud computing ppts
 
Building Topology in NS3
Building Topology in NS3Building Topology in NS3
Building Topology in NS3
 
Distributed storage system
Distributed storage systemDistributed storage system
Distributed storage system
 
~Ns2~
~Ns2~~Ns2~
~Ns2~
 
Introduction to Hadoop and Hadoop component
Introduction to Hadoop and Hadoop component Introduction to Hadoop and Hadoop component
Introduction to Hadoop and Hadoop component
 
Wireless sensor network and its application
Wireless sensor network and its applicationWireless sensor network and its application
Wireless sensor network and its application
 
Birch1
Birch1Birch1
Birch1
 
Fisheye State Routing (FSR) - Protocol Overview
Fisheye State Routing (FSR) - Protocol OverviewFisheye State Routing (FSR) - Protocol Overview
Fisheye State Routing (FSR) - Protocol Overview
 

Similar to Representation Learning on Complex Graphs

High-Performance Graph Analysis and Modeling
High-Performance Graph Analysis and ModelingHigh-Performance Graph Analysis and Modeling
High-Performance Graph Analysis and ModelingNesreen K. Ahmed
 
A New Algorithm Model for Massive-Scale Streaming Graph Analysis
A New Algorithm Model for Massive-Scale Streaming Graph AnalysisA New Algorithm Model for Massive-Scale Streaming Graph Analysis
A New Algorithm Model for Massive-Scale Streaming Graph AnalysisJason Riedy
 
Ling liu part 01:big graph processing
Ling liu part 01:big graph processingLing liu part 01:big graph processing
Ling liu part 01:big graph processingjins0618
 
Euro30 2019 - Benchmarking tree approaches on street data
Euro30 2019 - Benchmarking tree approaches on street dataEuro30 2019 - Benchmarking tree approaches on street data
Euro30 2019 - Benchmarking tree approaches on street dataFabion Kauker
 
Scalable Graph Convolutional Network Based Link Prediction on a Distributed G...
Scalable Graph Convolutional Network Based Link Prediction on a Distributed G...Scalable Graph Convolutional Network Based Link Prediction on a Distributed G...
Scalable Graph Convolutional Network Based Link Prediction on a Distributed G...miyurud
 
20191107 deeplearningapproachesfornetworks
20191107 deeplearningapproachesfornetworks20191107 deeplearningapproachesfornetworks
20191107 deeplearningapproachesfornetworkstm1966
 
The Future is Big Graphs: A Community View on Graph Processing Systems
The Future is Big Graphs: A Community View on Graph Processing SystemsThe Future is Big Graphs: A Community View on Graph Processing Systems
The Future is Big Graphs: A Community View on Graph Processing SystemsNeo4j
 
Deep learning for 3 d point clouds presentation
Deep learning for 3 d point clouds presentationDeep learning for 3 d point clouds presentation
Deep learning for 3 d point clouds presentationVijaylaxmiNagurkar
 
Graph Neural Networks for Recommendations
Graph Neural Networks for RecommendationsGraph Neural Networks for Recommendations
Graph Neural Networks for RecommendationsWQ Fan
 
DyGraph: A Dynamic Graph Generator and Benchmark Suite : NOTES
DyGraph: A Dynamic Graph Generator and Benchmark Suite : NOTESDyGraph: A Dynamic Graph Generator and Benchmark Suite : NOTES
DyGraph: A Dynamic Graph Generator and Benchmark Suite : NOTESSubhajit Sahu
 
DDGK: Learning Graph Representations for Deep Divergence Graph Kernels
DDGK: Learning Graph Representations for Deep Divergence Graph KernelsDDGK: Learning Graph Representations for Deep Divergence Graph Kernels
DDGK: Learning Graph Representations for Deep Divergence Graph Kernelsivaderivader
 
On Integrating Information Visualization Techniques into Data Mining: A Revie...
On Integrating Information Visualization Techniques into Data Mining: A Revie...On Integrating Information Visualization Techniques into Data Mining: A Revie...
On Integrating Information Visualization Techniques into Data Mining: A Revie...Sushant Gautam
 
Laplacian-regularized Graph Bandits
Laplacian-regularized Graph BanditsLaplacian-regularized Graph Bandits
Laplacian-regularized Graph Banditslauratoni4
 
Skyline Query Processing using Filtering in Distributed Environment
Skyline Query Processing using Filtering in Distributed EnvironmentSkyline Query Processing using Filtering in Distributed Environment
Skyline Query Processing using Filtering in Distributed EnvironmentIJMER
 
Lens-based Focus+Context Visualization Techniques
Lens-based Focus+Context Visualization TechniquesLens-based Focus+Context Visualization Techniques
Lens-based Focus+Context Visualization TechniquesMatthias Trapp
 

Similar to Representation Learning on Complex Graphs (20)

Cikm 2018
Cikm 2018Cikm 2018
Cikm 2018
 
High-Performance Graph Analysis and Modeling
High-Performance Graph Analysis and ModelingHigh-Performance Graph Analysis and Modeling
High-Performance Graph Analysis and Modeling
 
A New Algorithm Model for Massive-Scale Streaming Graph Analysis
A New Algorithm Model for Massive-Scale Streaming Graph AnalysisA New Algorithm Model for Massive-Scale Streaming Graph Analysis
A New Algorithm Model for Massive-Scale Streaming Graph Analysis
 
Ling liu part 01:big graph processing
Ling liu part 01:big graph processingLing liu part 01:big graph processing
Ling liu part 01:big graph processing
 
PointNet
PointNetPointNet
PointNet
 
Euro30 2019 - Benchmarking tree approaches on street data
Euro30 2019 - Benchmarking tree approaches on street dataEuro30 2019 - Benchmarking tree approaches on street data
Euro30 2019 - Benchmarking tree approaches on street data
 
Scalable Graph Convolutional Network Based Link Prediction on a Distributed G...
Scalable Graph Convolutional Network Based Link Prediction on a Distributed G...Scalable Graph Convolutional Network Based Link Prediction on a Distributed G...
Scalable Graph Convolutional Network Based Link Prediction on a Distributed G...
 
20191107 deeplearningapproachesfornetworks
20191107 deeplearningapproachesfornetworks20191107 deeplearningapproachesfornetworks
20191107 deeplearningapproachesfornetworks
 
The Future is Big Graphs: A Community View on Graph Processing Systems
The Future is Big Graphs: A Community View on Graph Processing SystemsThe Future is Big Graphs: A Community View on Graph Processing Systems
The Future is Big Graphs: A Community View on Graph Processing Systems
 
Portfolio
PortfolioPortfolio
Portfolio
 
MapReduce Algorithm Design
MapReduce Algorithm DesignMapReduce Algorithm Design
MapReduce Algorithm Design
 
Deep learning for 3 d point clouds presentation
Deep learning for 3 d point clouds presentationDeep learning for 3 d point clouds presentation
Deep learning for 3 d point clouds presentation
 
Graph Neural Networks for Recommendations
Graph Neural Networks for RecommendationsGraph Neural Networks for Recommendations
Graph Neural Networks for Recommendations
 
Visual Network Narrations
Visual Network NarrationsVisual Network Narrations
Visual Network Narrations
 
DyGraph: A Dynamic Graph Generator and Benchmark Suite : NOTES
DyGraph: A Dynamic Graph Generator and Benchmark Suite : NOTESDyGraph: A Dynamic Graph Generator and Benchmark Suite : NOTES
DyGraph: A Dynamic Graph Generator and Benchmark Suite : NOTES
 
DDGK: Learning Graph Representations for Deep Divergence Graph Kernels
DDGK: Learning Graph Representations for Deep Divergence Graph KernelsDDGK: Learning Graph Representations for Deep Divergence Graph Kernels
DDGK: Learning Graph Representations for Deep Divergence Graph Kernels
 
On Integrating Information Visualization Techniques into Data Mining: A Revie...
On Integrating Information Visualization Techniques into Data Mining: A Revie...On Integrating Information Visualization Techniques into Data Mining: A Revie...
On Integrating Information Visualization Techniques into Data Mining: A Revie...
 
Laplacian-regularized Graph Bandits
Laplacian-regularized Graph BanditsLaplacian-regularized Graph Bandits
Laplacian-regularized Graph Bandits
 
Skyline Query Processing using Filtering in Distributed Environment
Skyline Query Processing using Filtering in Distributed EnvironmentSkyline Query Processing using Filtering in Distributed Environment
Skyline Query Processing using Filtering in Distributed Environment
 
Lens-based Focus+Context Visualization Techniques
Lens-based Focus+Context Visualization TechniquesLens-based Focus+Context Visualization Techniques
Lens-based Focus+Context Visualization Techniques
 

More from eXascale Infolab

Beyond Triplets: Hyper-Relational Knowledge Graph Embedding for Link Prediction
Beyond Triplets: Hyper-Relational Knowledge Graph Embedding for Link PredictionBeyond Triplets: Hyper-Relational Knowledge Graph Embedding for Link Prediction
Beyond Triplets: Hyper-Relational Knowledge Graph Embedding for Link PredictioneXascale Infolab
 
It Takes Two: Instrumenting the Interaction between In-Memory Databases and S...
It Takes Two: Instrumenting the Interaction between In-Memory Databases and S...It Takes Two: Instrumenting the Interaction between In-Memory Databases and S...
It Takes Two: Instrumenting the Interaction between In-Memory Databases and S...eXascale Infolab
 
A force directed approach for offline gps trajectory map
A force directed approach for offline gps trajectory mapA force directed approach for offline gps trajectory map
A force directed approach for offline gps trajectory mapeXascale Infolab
 
HistoSketch: Fast Similarity-Preserving Sketching of Streaming Histograms wit...
HistoSketch: Fast Similarity-Preserving Sketching of Streaming Histograms wit...HistoSketch: Fast Similarity-Preserving Sketching of Streaming Histograms wit...
HistoSketch: Fast Similarity-Preserving Sketching of Streaming Histograms wit...eXascale Infolab
 
SwissLink: High-Precision, Context-Free Entity Linking Exploiting Unambiguous...
SwissLink: High-Precision, Context-Free Entity Linking Exploiting Unambiguous...SwissLink: High-Precision, Context-Free Entity Linking Exploiting Unambiguous...
SwissLink: High-Precision, Context-Free Entity Linking Exploiting Unambiguous...eXascale Infolab
 
Dependency-Driven Analytics: A Compass for Uncharted Data Oceans
Dependency-Driven Analytics: A Compass for Uncharted Data OceansDependency-Driven Analytics: A Compass for Uncharted Data Oceans
Dependency-Driven Analytics: A Compass for Uncharted Data OceanseXascale Infolab
 
SANAPHOR: Ontology-based Coreference Resolution
SANAPHOR: Ontology-based Coreference ResolutionSANAPHOR: Ontology-based Coreference Resolution
SANAPHOR: Ontology-based Coreference ResolutioneXascale Infolab
 
Efficient, Scalable, and Provenance-Aware Management of Linked Data
Efficient, Scalable, and Provenance-Aware Management of Linked DataEfficient, Scalable, and Provenance-Aware Management of Linked Data
Efficient, Scalable, and Provenance-Aware Management of Linked DataeXascale Infolab
 
Entity-Centric Data Management
Entity-Centric Data ManagementEntity-Centric Data Management
Entity-Centric Data ManagementeXascale Infolab
 
LDOW2015 - Uduvudu: a Graph-Aware and Adaptive UI Engine for Linked Data
LDOW2015 - Uduvudu: a Graph-Aware and Adaptive UI Engine for Linked DataLDOW2015 - Uduvudu: a Graph-Aware and Adaptive UI Engine for Linked Data
LDOW2015 - Uduvudu: a Graph-Aware and Adaptive UI Engine for Linked DataeXascale Infolab
 
Executing Provenance-Enabled Queries over Web Data
Executing Provenance-Enabled Queries over Web DataExecuting Provenance-Enabled Queries over Web Data
Executing Provenance-Enabled Queries over Web DataeXascale Infolab
 
The Dynamics of Micro-Task Crowdsourcing
The Dynamics of Micro-Task CrowdsourcingThe Dynamics of Micro-Task Crowdsourcing
The Dynamics of Micro-Task CrowdsourcingeXascale Infolab
 
Fixing the Domain and Range of Properties in Linked Data by Context Disambigu...
Fixing the Domain and Range of Properties in Linked Data by Context Disambigu...Fixing the Domain and Range of Properties in Linked Data by Context Disambigu...
Fixing the Domain and Range of Properties in Linked Data by Context Disambigu...eXascale Infolab
 
CIKM14: Fixing grammatical errors by preposition ranking
CIKM14: Fixing grammatical errors by preposition rankingCIKM14: Fixing grammatical errors by preposition ranking
CIKM14: Fixing grammatical errors by preposition rankingeXascale Infolab
 
An Introduction to Big Data
An Introduction to Big DataAn Introduction to Big Data
An Introduction to Big DataeXascale Infolab
 
Internet Infrastructures for Big Data (Verisign's Distinguished Speaker Series)
Internet Infrastructures for Big Data (Verisign's Distinguished Speaker Series)Internet Infrastructures for Big Data (Verisign's Distinguished Speaker Series)
Internet Infrastructures for Big Data (Verisign's Distinguished Speaker Series)eXascale Infolab
 

More from eXascale Infolab (20)

Beyond Triplets: Hyper-Relational Knowledge Graph Embedding for Link Prediction
Beyond Triplets: Hyper-Relational Knowledge Graph Embedding for Link PredictionBeyond Triplets: Hyper-Relational Knowledge Graph Embedding for Link Prediction
Beyond Triplets: Hyper-Relational Knowledge Graph Embedding for Link Prediction
 
It Takes Two: Instrumenting the Interaction between In-Memory Databases and S...
It Takes Two: Instrumenting the Interaction between In-Memory Databases and S...It Takes Two: Instrumenting the Interaction between In-Memory Databases and S...
It Takes Two: Instrumenting the Interaction between In-Memory Databases and S...
 
A force directed approach for offline gps trajectory map
A force directed approach for offline gps trajectory mapA force directed approach for offline gps trajectory map
A force directed approach for offline gps trajectory map
 
HistoSketch: Fast Similarity-Preserving Sketching of Streaming Histograms wit...
HistoSketch: Fast Similarity-Preserving Sketching of Streaming Histograms wit...HistoSketch: Fast Similarity-Preserving Sketching of Streaming Histograms wit...
HistoSketch: Fast Similarity-Preserving Sketching of Streaming Histograms wit...
 
SwissLink: High-Precision, Context-Free Entity Linking Exploiting Unambiguous...
SwissLink: High-Precision, Context-Free Entity Linking Exploiting Unambiguous...SwissLink: High-Precision, Context-Free Entity Linking Exploiting Unambiguous...
SwissLink: High-Precision, Context-Free Entity Linking Exploiting Unambiguous...
 
Dependency-Driven Analytics: A Compass for Uncharted Data Oceans
Dependency-Driven Analytics: A Compass for Uncharted Data OceansDependency-Driven Analytics: A Compass for Uncharted Data Oceans
Dependency-Driven Analytics: A Compass for Uncharted Data Oceans
 
Crowd scheduling www2016
Crowd scheduling www2016Crowd scheduling www2016
Crowd scheduling www2016
 
SANAPHOR: Ontology-based Coreference Resolution
SANAPHOR: Ontology-based Coreference ResolutionSANAPHOR: Ontology-based Coreference Resolution
SANAPHOR: Ontology-based Coreference Resolution
 
Efficient, Scalable, and Provenance-Aware Management of Linked Data
Efficient, Scalable, and Provenance-Aware Management of Linked DataEfficient, Scalable, and Provenance-Aware Management of Linked Data
Efficient, Scalable, and Provenance-Aware Management of Linked Data
 
Entity-Centric Data Management
Entity-Centric Data ManagementEntity-Centric Data Management
Entity-Centric Data Management
 
SSSW 2015 Sense Making
SSSW 2015 Sense MakingSSSW 2015 Sense Making
SSSW 2015 Sense Making
 
LDOW2015 - Uduvudu: a Graph-Aware and Adaptive UI Engine for Linked Data
LDOW2015 - Uduvudu: a Graph-Aware and Adaptive UI Engine for Linked DataLDOW2015 - Uduvudu: a Graph-Aware and Adaptive UI Engine for Linked Data
LDOW2015 - Uduvudu: a Graph-Aware and Adaptive UI Engine for Linked Data
 
Executing Provenance-Enabled Queries over Web Data
Executing Provenance-Enabled Queries over Web DataExecuting Provenance-Enabled Queries over Web Data
Executing Provenance-Enabled Queries over Web Data
 
The Dynamics of Micro-Task Crowdsourcing
The Dynamics of Micro-Task CrowdsourcingThe Dynamics of Micro-Task Crowdsourcing
The Dynamics of Micro-Task Crowdsourcing
 
Fixing the Domain and Range of Properties in Linked Data by Context Disambigu...
Fixing the Domain and Range of Properties in Linked Data by Context Disambigu...Fixing the Domain and Range of Properties in Linked Data by Context Disambigu...
Fixing the Domain and Range of Properties in Linked Data by Context Disambigu...
 
CIKM14: Fixing grammatical errors by preposition ranking
CIKM14: Fixing grammatical errors by preposition rankingCIKM14: Fixing grammatical errors by preposition ranking
CIKM14: Fixing grammatical errors by preposition ranking
 
OLTP-Bench
OLTP-BenchOLTP-Bench
OLTP-Bench
 
An Introduction to Big Data
An Introduction to Big DataAn Introduction to Big Data
An Introduction to Big Data
 
Internet Infrastructures for Big Data (Verisign's Distinguished Speaker Series)
Internet Infrastructures for Big Data (Verisign's Distinguished Speaker Series)Internet Infrastructures for Big Data (Verisign's Distinguished Speaker Series)
Internet Infrastructures for Big Data (Verisign's Distinguished Speaker Series)
 
Hasler2014
Hasler2014Hasler2014
Hasler2014
 



Representation Learning on Complex Graphs

  • 1. Representation Learning on Graphs with Complex Structures. Prof. Dr. Philippe Cudré-Mauroux, eXascale Infolab, U. of Fribourg, Switzerland. DL4G-SDE @ WWW2019, San Francisco, May 13, 2019
  • 2. Representation Learning on Graphs ■ Projecting nodes of a graph onto a vector space while preserving key structural properties of the graph (e.g., topological proximity of the nodes). [Figure: DeepWalk [1]: graph → neural embedding techniques (e.g., word2vec) → one vector per node, e.g., (0.19, 0.32, 1.89, 1.21, 0.87)] 8/5/19 WWW2019@San Francisco [1] Bryan Perozzi, Rami Al-Rfou, and Steven Skiena. "DeepWalk: Online learning of social representations." In Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 701-710. ACM, 2014.
  • 3. What if the graph at hand exhibits a much more complex structure?
  • 4. Outline ■ JUST: Embedding heterogeneous graphs without meta-paths [CIKM’18] ■ LBSN2Vec: Embedding heterogeneous hypergraphs from LBSNs [WWW’19] ■ NodeSketch: Highly efficient graph embeddings via recursive sketching [KDD’19]
  • 5. Heterogeneous Graphs ■ Heterogeneous graphs contain multiple node types: ● Homogeneous edges: linking nodes from the same domain ● Heterogeneous edges: linking nodes across different domains
  • 6. Meta-Paths in Heterogeneous Graphs ■ A meta-path is a sequence of node types encoding key composite relations among the involved node types. ■ Meta-paths are used to guide random walks to redefine the neighborhood of a node. [Figure: Metapath2vec [1]: meta-path-guided random walks → neural embedding techniques (e.g., word2vec) → one vector per node] [1] Yuxiao Dong, Nitesh V. Chawla, and Ananthram Swami. "metapath2vec: Scalable representation learning for heterogeneous networks." In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 135–144. ACM, 2017.
  • 7. Challenges with Meta-Paths ■ The choice of meta-paths strongly affects the quality of the learnt node embeddings for a specific task. ■ How to select meta-paths? ● The selection is graph-specific and depends heavily on prior knowledge from domain experts. ● Strategies to combine a set of meta-paths can be complex and computationally expensive.
  • 8. Are meta-paths necessary?
  • 9. JUST: Embedding Heterogeneous Graphs without Meta-Paths ■ Random walk with JUmp and STay strategies to probabilistically control the random walk. ■ Two steps to balance the random walk: ● Step I: Jump or stay? − Objective: balance the number of heterogeneous and homogeneous edges traversed during random walks (stay with probability α, with exponential decay). ● Step II: If jump, where to jump? − Objective: control the randomness in choosing a target domain (memory window to favor diversity). ■ Learn node embeddings with the SkipGram model.
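
The two steps above can be sketched as a small walk generator. This is a minimal illustration in Python under stated assumptions, not the authors' implementation: the function name `just_walk`, its parameters, and the exact decay rule (stay with probability α raised to the number of nodes visited consecutively in the current domain) are assumptions made for the example; the slide itself only says "stay with probability α, exponential decay" and "memory window to favor diversity".

```python
import random
from collections import deque


def just_walk(adj, node_type, start, walk_length, alpha=0.5, memory=2, rng=None):
    """Generate one jump-and-stay walk (illustrative sketch).

    adj:       dict mapping node -> list of neighbor nodes
    node_type: dict mapping node -> domain label (e.g. "A" for author)
    alpha:     base stay probability (assumed decay: alpha ** run_length)
    memory:    size of the window of recently visited domains
    """
    rng = rng or random.Random()
    walk = [start]
    recent = deque([node_type[start]], maxlen=memory)
    run_length = 1  # nodes visited consecutively in the current domain

    while len(walk) < walk_length:
        cur = walk[-1]
        homo = [n for n in adj[cur] if node_type[n] == node_type[cur]]
        hetero = [n for n in adj[cur] if node_type[n] != node_type[cur]]
        if not homo and not hetero:
            break  # isolated node, nothing to walk to

        # Step I: jump or stay? Forced when only one kind of edge exists.
        if not hetero:
            stay = True
        elif not homo:
            stay = False
        else:
            stay = rng.random() < alpha ** run_length

        if stay:
            nxt = rng.choice(homo)
            run_length += 1
        else:
            # Step II: where to jump? Prefer domains outside the memory window.
            reachable = sorted({node_type[n] for n in hetero})
            fresh = [t for t in reachable if t not in recent]
            target = rng.choice(fresh or reachable)
            nxt = rng.choice([n for n in hetero if node_type[n] == target])
            recent.append(target)
            run_length = 1
        walk.append(nxt)
    return walk
```

The resulting walks would then be fed to a SkipGram model as if they were sentences, which is the step the last bullet of the slide refers to.
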
  • 10. Results ■ JUST achieves state-of-the-art performance without using meta-paths. [Figure: node classification results]
  • 11. Runtime Performance ■ End-to-end node embedding learning time, in seconds, for all random-walk-based methods:
      Method                     DBLP    Movie   Foursquare
      DeepWalk                    236      333          484
      Metapath2vec (original)     965   19,200        2,248
      Metapath2vec (ours)         290      408          550
      Hin2vec                     904    1,301        1,801
      JUST                        310      442          616
    ■ Compared to DeepWalk and Metapath2vec, JUST adds a minor overhead in learning time but achieves better results in classification and clustering tasks. ■ Compared to Hin2vec, JUST learns roughly 3x faster (e.g., 310 s vs. 904 s on DBLP) and achieves better results in most experiments.
  • 12. Outline ■ JUST: Embedding heterogeneous graphs without meta-paths [CIKM’18] ■ LBSN2Vec: Embedding heterogeneous hypergraphs from LBSNs [WWW’19] ■ NodeSketch: Highly efficient graph embeddings via recursive sketching [KDD’19]
  • 13. Social Relationships vs. Human Mobility
  • 14. How to quantify the impact of social relationships and mobility on each other?
  • 15. Location Based Social Networks ■ A hypergraph with ● Four data domains − Spatial: POI − Temporal: time slot − Semantic: activity category − Social: user ● Two types of links − Friendships − Check-ins (hyperedges)
  • 16. Hypergraph Embedding [Figure: LBSN hypergraph → graph embedding with neural embedding techniques (e.g., SkipGram) → one vector per node] 1. How to sample from an LBSN hypergraph? 2. How to preserve n-wise proximity from hyperedges?
  • 17. 1. Sample from a Hypergraph: Random Walk with Stay ■ Balancing the impact of social relationships and mobility on the learnt embeddings ■ Sample and learn from ● A check-in hyperedge with probability α ● A user-user pair with probability (1 − α)
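
The sampling rule on this slide amounts to a biased coin flip per training example. The snippet below is a minimal Python sketch of that rule only; `sample_training_example`, `checkins`, and `friend_pairs` are hypothetical names, and drawing user-user pairs from a precomputed list (rather than from a random walk on the friendship graph) is a simplification assumed for the example.

```python
import random


def sample_training_example(checkins, friend_pairs, alpha, rng=None):
    """Pick the next example to learn from (illustrative sketch).

    checkins:     list of check-in hyperedges, e.g. (user, time_slot, poi, category)
    friend_pairs: list of (user, user) pairs; assumed precomputed here
    alpha:        probability of learning from a check-in hyperedge
    """
    rng = rng or random.Random()
    if rng.random() < alpha:
        return "checkin", rng.choice(checkins)
    return "friendship", rng.choice(friend_pairs)
```

A larger α gives mobility data more weight in the learnt embeddings, and a smaller α gives social data more weight.
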
  • 18. 2. Learn from Hyperedges: Learning via Best-Fit-Line ■ Maximizing the similarity between the nodes of a hyperedge and their best-fit-line under cosine similarity. 1. Compute the best-fit-line 2. Maximize the cosine similarity between each node and the best-fit-line
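
One way to read the two steps above, assumed here rather than taken from the slides, is to define the best-fit-line as the unit direction b that maximizes the summed cosine similarity to the node vectors of a hyperedge. Under that definition b has a closed form: the normalized sum of the unit-normalized node vectors. The Python sketch below computes that direction and the resulting objective value; all function names are hypothetical, and the gradient update that would actually move the embeddings is left out.

```python
import math


def normalize(v):
    """Scale v to unit length (assumes v is not the zero vector)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def cosine(u, v):
    return sum(a * b for a, b in zip(normalize(u), normalize(v)))


def best_fit_line(vectors):
    """Unit direction maximizing the sum of cosine similarities to `vectors`.

    Assumed definition for this sketch: the normalized sum of the unit vectors.
    Requires that the unit vectors do not cancel out to zero.
    """
    total = [0.0] * len(vectors[0])
    for v in vectors:
        for i, x in enumerate(normalize(v)):
            total[i] += x
    return normalize(total)


def hyperedge_objective(vectors):
    """Summed cosine similarity between each node vector and the best-fit-line."""
    line = best_fit_line(vectors)
    return sum(cosine(v, line) for v in vectors)
```

For a check-in hyperedge, `vectors` would hold the four embeddings of the user, time slot, POI, and activity category; training would then adjust them to increase `hyperedge_objective`.
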
  • 19. Task I: Friendship Prediction ■ Comparison with other graph embedding techniques ● (S) Social network only ● (S&M) Social and mobility through clique expansion ■ Reported gain: ↑ 32.95% on precision@10
  • 20. Task II: Location Prediction ■ Comparison with other graph embedding techniques ● (M) Mobility (check-in) network only ● (S&M) Social and mobility through clique expansion ■ Reported gain: ↑ 25.32% on accuracy@10
  • 21. Balancing the Impact of Social Relationships and Mobility Matters! ■ Asymmetric impact of mobility and social relationships on predicting each other: ● Friendship prediction: 80% social and 20% mobility data ● Location prediction: 60% social and 40% mobility data
  • 22. Outline ■ JUST: Embedding heterogeneous graphs without meta-paths [CIKM’18] ■ LBSN2Vec: Embedding heterogeneous hypergraphs from LBSNs [WWW’19] ■ NodeSketch: Highly efficient graph embeddings via recursive sketching [KDD’19]
  • 23. Graph Embeddings ■ Graph-sampling based techniques ● Sample node pairs from a graph, and preserve node proximity from the node pairs ● Examples: DeepWalk, Node2Vec, LINE, SDNE, and VERSE ● Efficiency bottleneck: a large number of node pairs requires significant computation resources (CPU time) ■ Factorization based techniques ● Factorize a (transformed, e.g., high-order) proximity/adjacency matrix of a graph ● Examples: GraRep, HOPE, and NetMF ● Efficiency bottleneck: large matrix factorization requires significant computation resources (both CPU time and RAM) ■ Node proximity preserved using cosine similarity ● Efficiency bottleneck: cosine similarity is less efficient to compute than, for example, Hamming similarity.
  • 24. Similarity-Preserving Hashing/Sketching ■ Efficient similarity approximation of high-dimensional data ● Data-dependent hashing (learning-to-hash) − Learning dataset-specific hashing functions − Examples: spectral hashing, iterative quantization − Efficient in similarity computation, but requires learning hashing functions ● Data-independent hashing/sketching (locality sensitive hashing) − Hashing without involving any learning process from data − Examples: minhash, consistent weighted sampling − Efficient in both similarity approximation and hashing
  • 25. Can we sketch nodes in a graph as embeddings?
  • 26. Preliminary: Consistent Weighted Sampling [1] ■ Principled techniques for highly efficient similarity approximation ■ The min-max similarity between original data can be approximated by the Hamming similarity between sketches. [Figure: a vector V = (1.32, 2.77, 1.11, 3.29, 1.31) with D = 5 is mapped to a sketch S = (S_1, …, S_j, …, S_L) using random hash functions h_j, j = 1, …, L] [1] Dingqi Yang, Bin Li, Laura Rettig, and Philippe Cudré-Mauroux. "D2HistoSketch: Discriminative and Dynamic Similarity-Preserving Sketching of Streaming Histograms." IEEE Transactions on Knowledge and Data Engineering (TKDE), 2018.
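
The relation between the two similarities can be made concrete with a small experiment. The Python sketch below is an illustration under assumptions: the hash rule (each sketch element is the index i minimizing −log h_j(i) / V_i, with h_j(i) drawn uniformly from (0, 1]) is one simplified form of consistent weighted sampling chosen for the example, since the slide only names "random hash function h_j". The function names are hypothetical.

```python
import math
import random


def make_hashes(L, D, seed=0):
    """L random hash tables, each giving a value in (0, 1] for every dimension."""
    rng = random.Random(seed)
    return [[1.0 - rng.random() for _ in range(D)] for _ in range(L)]


def sketch(v, hashes):
    """Sketch of a non-negative vector v: one dimension index per hash table.

    Assumed rule: S_j = argmin over i with v[i] > 0 of -log(h_j(i)) / v[i].
    """
    result = []
    for h in hashes:
        best_i, best_val = None, math.inf
        for i, w in enumerate(v):
            if w <= 0:
                continue  # zero-weight dimensions can never be selected
            val = -math.log(h[i]) / w
            if val < best_val:
                best_i, best_val = i, val
        result.append(best_i)
    return result


def min_max_similarity(a, b):
    """Sum of element-wise minima divided by sum of element-wise maxima."""
    return sum(min(x, y) for x, y in zip(a, b)) / sum(max(x, y) for x, y in zip(a, b))


def hamming_similarity(s, t):
    """Fraction of positions at which two sketches agree."""
    return sum(x == y for x, y in zip(s, t)) / len(s)
```

Because the same hash tables are reused for every vector, similar vectors tend to select the same indices. Comparing `hamming_similarity(sketch(a, hashes), sketch(b, hashes))` against `min_max_similarity(a, b)` for growing L is a simple way to check empirically how close the approximation is for a given rule.
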
  • 27. Sketching the Adjacency Matrix? ■ Adjacency matrix vs. Self-Loop-Augmented (SLA) adjacency matrix
  • 28. NodeSketch: Low-Order Node Embeddings [Figure: example graph with nodes 1 to 5]
  • 29. NodeSketch: High-Order Node Embeddings ■ Uniformity of the generated samples: the foundation of our recursive sketching process. [Figure: worked example for node 1 of a 5-node graph, with weight α = 0.2. Inputs: the SLA adjacency matrix $\tilde{A}$ and the (k−1)-order node embeddings $S(k-1)$. For the neighbors $n \in \Gamma(r)$ (here node 2), the (k−1)-order sketches $S^n(k-1)$ give the sketch element distribution $\frac{1}{L}\sum_{j=1}^{L} \mathbb{1}[S_j^n(k-1) = i]$, $i = 1, \ldots, D$ (values shown: 0.33, 0.33, 0.33). Merging this distribution, weighted by α, with the SLA adjacency vector $\tilde{V}_r$ (values shown: 1, 1) gives the approximate k-order SLA adjacency vector $\tilde{V}_r(k)$ (values shown: 1.066, 1.066, 0.066), which is sketched using Eq. 3 to obtain the k-order embeddings $S(k)$.]
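
The merge-then-sketch recursion in the figure can be sketched in a few lines of Python. This is an illustration under assumptions, not the authors' code: the hash rule inside `sketch` is the same simplified consistent-weighted-sampling rule assumed for illustration (the slide's "Eq. 3" is not reproduced in the transcript), neighbors exclude the node itself, nodes are 0-indexed, and the parameter `rounds` counts merge steps applied after the low-order sketch instead of committing to a particular numbering of the order k. All function names are hypothetical.

```python
import math
import random


def make_hashes(L, D, seed=0):
    """L random hash tables with values in (0, 1], one value per dimension."""
    rng = random.Random(seed)
    return [[1.0 - rng.random() for _ in range(D)] for _ in range(L)]


def sketch(v, hashes):
    """Assumed rule: per hash table, the index i with v[i] > 0 minimizing -log(h[i]) / v[i]."""
    return [
        min((i for i, w in enumerate(v) if w > 0),
            key=lambda i: -math.log(h[i]) / v[i])
        for h in hashes
    ]


def sla_adjacency(edges, n):
    """Self-Loop-Augmented adjacency matrix of an undirected graph with n nodes."""
    A = [[0.0] * n for _ in range(n)]
    for u, v in edges:
        A[u][v] = A[v][u] = 1.0
    for i in range(n):
        A[i][i] = 1.0  # self-loop augmentation
    return A


def merged_vector(row, r, prev_sketches, alpha):
    """SLA row of node r plus alpha times its neighbors' sketch element distributions."""
    v = list(row)
    for n, w in enumerate(row):
        if n != r and w > 0:
            L = len(prev_sketches[n])
            for element in prev_sketches[n]:
                v[element] += alpha / L  # each sketch element contributes 1/L
    return v


def node_sketch(A, rounds, alpha, hashes):
    """Low-order sketches of the SLA rows, refined by `rounds` recursive merges."""
    sketches = [sketch(row, hashes) for row in A]
    for _ in range(rounds):
        sketches = [sketch(merged_vector(A[r], r, sketches, alpha), hashes)
                    for r in range(len(A))]
    return sketches
```

With α = 0.2, an SLA row starting (1, 1, 0, …), and a single neighbor whose sketch is spread evenly over three elements, `merged_vector` yields 1 + 0.2/3 ≈ 1.067 for the first two entries and 0.2/3 ≈ 0.067 for the third, in line with the values 1.066, 1.066, 0.066 shown in the figure.
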
  • 30. Results: Node Classification Performance using Kernel SVM ■ Baseline groups: classical graph embedding techniques (preserving cosine similarity), learning-to-hash techniques, and sketching techniques ■ NodeSketch shows comparable performance to the best-performing state-of-the-art techniques.
  • 31. Results: Runtime Performance ■ NodeSketch is highly efficient and significantly outperforms all baselines, showing a 9x-273x speedup. ■ Hamming similarity also shows improved efficiency (1.19x-1.68x speedup) over cosine similarity.
  • 32. Take-Away Messages ■ JUST: Meta-path-free heterogeneous graph embedding can achieve state-of-the-art performance efficiently. [CIKM’18] ■ LBSN2Vec: Social relationships and mobility have an asymmetric impact on each other. [WWW’19] ■ NodeSketch: High-quality node embeddings can be generated via highly efficient sketching techniques. [KDD’19] References: [CIKM’18] Rana Hussein, Dingqi Yang, and Philippe Cudré-Mauroux. "Are Meta-Paths Necessary?: Revisiting Heterogeneous Graph Embeddings." CIKM’18. [WWW’19] Dingqi Yang, Bingqing Qu, Jie Yang, and Philippe Cudré-Mauroux. "Revisiting User Mobility and Social Relationships in LBSNs: A Hypergraph Embedding Approach." WWW’19. [KDD’19] Dingqi Yang, Paolo Rosso, Bin Li, and Philippe Cudré-Mauroux. "NodeSketch: Highly-Efficient Graph Embeddings via Recursive Sketching." KDD’19.
  • 33. Future Plan for Representation Learning on Graphs ■ Attributed graph structure (e.g., property graphs) ■ Heterogeneous data structures (e.g., structured knowledge graph + unstructured text) ■ Dynamic graphs (e.g., streaming LBSN graphs) 4/29/19 Dingqi's job talk @ University of Luxembourg