SlideShare a Scribd company logo
1 of 37
Download to read offline
Higher-order graph clustering
Austin Benson
Stanford University
AMS Spring Western Sectional
Pullman, WA
April 22, 2017
Joint work with
David Gleich, Purdue
Jure Leskovec, Stanford
A C
B
A C
B 2
Unifying two themes of network analysis
Co-author network
1. Real-world networks have
modular organization.
[Newman04, Newman06]
2. Higher-order connectivity patterns, or
network motifs, drive complex systems.
[Milo+02, Yaveroğlu+14].
Edge-based clustering and
community detection sometimes
expose this structure.
Signed feed-forward loops in
genetic transcription.
[Mangan+03, Alon07]
Triangles in social networks.
[Rapoport53, Granovetter73]
Bi-directed length-2 paths in
brain networks.
[Sporns-Kötter04, Sporns+07]
A CA C
3
Motifs are the fundamental units of
complex networks.
We should look for structure
(clusters) in terms of them.
Network
Motif
Different motifs give different clusters.
4
Higher-order graph clustering is our technique
for finding clusters based on motifs
5
§ We will generalize spectral clustering, a classical technique to find
clusters or communities in a graph, to use motifs as the fundamental
unit to partition.
§ Based on a higher-order (motif-based) conductance metric that
generalizes the traditional conductance.
§ Comes with theoretical guarantees.
§ We’ll first briefly review how spectral clustering works.
§ Then we’ll see how to adapt it to work with network motifs.
§ Then we’ll see the impact of this approach on various real-world data.
Higher-order graph clustering
Main points and overview
Graph Laplacian Eigenvector
6
Background spectral clustering is a classic technique to
partition graphs by looking at eigenvectors
Cluster
[Fiedler73]
Conductance is one of the most important cluster quality scores [Schaeffer07]
used in Markov chain theory, spectral clustering, bioinformatics, vision, etc.
The conductance of a set of vertices S is the ratio of
edges leaving to total edges
small conductance ó good cluster
(edges leaving S)
(edge end points in S)
7
S
cut(S) = 7
vol(S) = 85
vol( ¯S) = 151
ffi(S) = 7/85
Background spectral clustering works based on conductance
S
(S) =
cut(S)
min(vol(S), vol(S))
Cheeger Inequality
Finding the best conductance set is NP-hard. L
§ Cheeger realized the eigenvalues of the Laplacian
provided surface area to volume bounds in manifolds.
§ Alon and Milman independently realized the same
thing for a graph (conductance)!
Laplacian
2
⇤/2  2  2 ⇤
0 = 1  2  ...  n  2
⇤ = set of smallest conductance
8
Background spectral clustering has theoretical guarantees
[Cheeger70, Alon-Milman85]
D = diag(Ae)
L = D−1/2
(A − D)D−1/2
Eigenvalues of the Laplacian L
We can find a set S that achieves the Cheeger
bound.
1. Compute the eigenvector z associated with λ2
and scale to f = D-1/2z
2. Sort the vertices by their values in f:
σ1, σ2, …, σn
3. Let Sr = {σ1, …, σr} and compute the
conductance of φ(Sr) of each Sr.
4. Pick the set Sm with minimum conductance.
9
Background the sweep cut algorithm realizes the guarantee
[Mihail89, Chung92]
0 20 40
0
0.2
0.4
0.6
0.8
1
S
i
φi
(Sm) 2
Sr
0 20 40
0
0.2
0.4
0.6
0.8
1
Si
φ
i
10
Background the sweep cut visualized
[Mihail89, Chung92]
Sr
(S) =
cut(S)
min(vol(S), vol(S))
We want to cluster with richer data
Motifs that may be directed, signed, colored, feature-valued, etc.
11
Spectral clustering is theoretically justified for finding
edge-based clusters in undirected, simple graphs.
Signed feed-forward loops in genetic
transcription [Mangan+03]
Gene X activates transcription in gene Y.
Gene X suppresses transcription in gene Z.
Gene Y suppresses transcription in gene Z.X
Y
Z
Benson-Gleich-Leskovec, Science, 2016.
12
Our contributions
§ A generalized conductance metric for motifs.
§ A new spectral clustering algorithm to minimize
the generalized conductance.
§ AND an associated motif Cheeger inequality
guarantee.
§ Naturally handles directed, signed, colored,
weighted, and combinations of motifs.
§ Scales to networks with billions of edges.
§ Applications in ecology, biology, and
transportation.
A
B
wijk
13
How do we find clusters
based on motifs?
Motif-based conductance
Need new notions of cut and volume
14
φM (S) =
cutM (S)
min(volM (S), volM (S))
(S) =
cut(S)
min(vol(S), vol(S))
vol(S) = #(edge end points in S)
cut(S) = #(edges cut) cutM (S) = #(motifs cut)
volM (S) = #(motif end points in S)
S
S
S
S S
M = triangle motif
Motif-based conductance
Motif M
M (S) =
motifs cut
motif volume
=
1
8
15
9
10
3
5
6
7
8
1
2
4
9
10
3
5
6
7
8
1
2
4
S
S
Problem Given a motif M and a graph G, we want to
find a set of nodes S that minimizes motif conductance
This is NP-hard. [Wagner-Wagner93]
Our solution Generalize spectral clustering for motifs
1. Form new weighted, undirected graph W(M) based on M and G
2. Compute Fiedler vector of Laplacian matrix of W(M) [Fiedler73, Alon-Milman85]
3. Use “sweep cut” procedure to output clusters [Mihail89, Chung92]
Theorem (motif Cheeger inequality)
resulting clusters will obtain near optimal motif conductance
16
Higher-order clustering
Motif-based spectral clustering
1
2
3
4
5
6
7
8
9
10
1
11
1
1
1
3
1
1
1
1
1
2
1
1
17
9
10
3
5
6
7
8
1
2
4
W(M)
ij = #{instances of motif M that contain nodes i and j}
motif M
Step 1. Given directed graph G and motif M, form a weighted graph W(M).
graph G
weighted graph W(M)
Key insight
Classical spectral clustering on
weighted graph W(M) finds clusters
of low motif conductance.
M (S) =
motifs cut
motif volume
=
1
10
1
2
3
4
5
6
7
8
9
10
1
11
1
1
1
3
1
1
1
1
1
2
1
1
18
Motif-based spectral clustering
Step 1. Given directed graph G and motif M, form a weighted graph W(M).
motif M
W(M)
ij = #{instances of motif M that contain nodes i and j}
W(M)
Motif-based spectral clustering
19
Step 2. Compute the eigenvector f(M) associated with λ2 of the
normalized Laplacian matrix of W(M)
1
2
3
4
5
6
7
8
9
10
-0.25
-0.2
-0.15
-0.1
-0.05
0
0.05
0.1
0.15
Takes roughly O(# edges) time.
D = diag(W(M)
e)
L(M)
= D−1/2
(D − W(M)
)D−1/2
L(M)
z = –2z
f (M)
= D−1/2
z
f(M)
Motif-based spectral clustering
20
Step 3 (motif sweep cut) [Mihail89,Chung92]
§ Sort nodes by values in f(M) → σ1, σ2, …σn.
§ Pick set Sr = {σ1, …, σr} with smallest motif conductance.
9
10
3
5
6
7
8
1
2
4
Motif Cheeger inequality
Theorem If the motif has three
nodes, then the sweep procedure
on the weighted graph finds a set
S of nodes for which
For 4+ nodes, need slightly different
notion of conductance.
21
M(G) = {instances of M in G}
Key Proof Step
cutM (S, G) =
X
{i,j,k}2M(G)
Indicator[xi , xj , xk not the same]
= 1
4 (x2
i + x2
j + x2
k xi xj xj xk xi xk )
= quadratic in x
M (S) 2 M
Applications
22
1. We do not know the motif of interest.
food webs and new applications
2. We know the motif of interest from domain knowledge.
yeast transcription regulation networks, connectome, social networks
3. We seek richer information from our data.
transportation networks and new applications
23
Application 1
We do not know the motif of interest.
Application 1 Food webs
Florida bay food web
§ Nodes are species
§ Edges represent carbon exchange
i → j if j eats i
§ Motifs represent energy flow patterns
24
http://marinebio.org/oceans/marine-zones/
M6M5
Application 1 Food webs
Which motif clusters the food web?
Our approach
§ Run motif spectral clustering for all 3-node
motifs as well as for just edges.
§ Examine the sweep profile to see which motif
gives the best clusters.
25
Application 1 Food webs
26
M6
M5
edge
Our finding motif M6
organizes the food web
into good clusters
M6M5
edge
B D
Micronutrient
sources
Pelagic fishes
and benthic
prey
Benthic macro-
invertebrates
Benthic Fishes
Motif M6 reveals
aquatic layers
A
61% accuracy vs.
48% with edge-
based methods
27
Application 1 Food webs
28
Application 2
We know the motif of interest
from domain knowledge.
Application 2 Yeast transcription regulation networks
§ Nodes are groups of genes
§ Edge i → j means i regulates transcription to j
§ Sign + / - denotes activation / suppression
§ Coherent feedforward loops encode biological function
[Mangan+03, Alon07]
A CC A
29
Clustering based on coherent
feedforward loops identifies
functions studied individually
by biologists [Mangan+03]
97% accuracy vs.
68–82% with edge-based methods
A
B
A CA C
A
B
30
Application 2 Yeast transcription regulation networks
A
B
31
Structure of the found modules
(all edge signs are positive)
Application 2 Yeast transcription regulation networks
32
Application 3
We seek richer information
from our data.
Application 3 Transportation networks
§ North American air
transport network.
§ Nodes are cites.
§ i → j if you can travel
from i to j in < 8 hours.
[Frey-Dueck07]
33
Accepted pending m
	
B
A
Important motifs from literature
[Rosvall+2014]
W
(M)
ij = #{bi-directional length-2 paths from i to j}
Weighted adjacency matrix already reveals hub-like structure
34
Application 3 Transportation networks
Top 8
U.S. hubs
East coast non-hubs
West coast non-hubs
Primary spectral coordinate
Atlanta, the top hub, is
next to Salina, a non-hub.
MOTIF SPECTRAL EMBEDDING EDGE SPECTRAL EMBEDDING
35
Application 3 Transportation networks
36
§Generalization of graph clustering to higher-order structures
(motifs) through a new objective (motif conductance).
§Generalizing old ideas from spectral graph theory admits a new
algorithm and a motif Cheeger inequality.
§Applications in ecology, biology, transportation.
Recap Higher-order graph clustering
# form W0 … W4
sc0 = spectral_cut(W0)
sc1 = spectral_cut(W1)
sc2 = spectral_cut(W2)
sc3 = spectral_cut(W3)
sc4 = spectral_cut(W4)
plt = x -> semilogx(
x.sweepcut_profile.
conductance)
plt(sc0)
plt(sc1)
plt(sc2)
plt(sc3)
plt(sc4)
Find me at…
http://stanford.edu/~arbenson
@austinbenson
arbenson@stanford.edu
Thanks!
Austin Benson
Stanford University
For fun reproduce some results from this talk!
Jupyter notebooks at bit.ly/higher-order-julia
Higher-order graph clustering
Paper Benson-Gleich-Leskovec, Higher-order organization of complex networks, Science, 2016
Code + data http://snap.stanford.edu/higher-order

More Related Content

What's hot

Limits of Local Algorithms for Randomly Generated Constraint Satisfaction Pro...
Limits of Local Algorithms for Randomly Generated Constraint Satisfaction Pro...Limits of Local Algorithms for Randomly Generated Constraint Satisfaction Pro...
Limits of Local Algorithms for Randomly Generated Constraint Satisfaction Pro...Yandex
 
A NEW GENERALIZATION OF EDGE OVERLAP TO WEIGHTED NETWORKS
A NEW GENERALIZATION OF EDGE OVERLAP TO WEIGHTED NETWORKSA NEW GENERALIZATION OF EDGE OVERLAP TO WEIGHTED NETWORKS
A NEW GENERALIZATION OF EDGE OVERLAP TO WEIGHTED NETWORKSijaia
 
Deep learning ensembles loss landscape
Deep learning ensembles loss landscapeDeep learning ensembles loss landscape
Deep learning ensembles loss landscapeDevansh16
 
Mean shift and Hierarchical clustering
Mean shift and Hierarchical clustering Mean shift and Hierarchical clustering
Mean shift and Hierarchical clustering Yan Xu
 
Fuzzy c means clustering protocol for wireless sensor networks
Fuzzy c means clustering protocol for wireless sensor networksFuzzy c means clustering protocol for wireless sensor networks
Fuzzy c means clustering protocol for wireless sensor networksmourya chandra
 
Hierarchical clustering
Hierarchical clusteringHierarchical clustering
Hierarchical clusteringishmecse13
 
International Journal of Computer Science and Security Volume (2) Issue (5)
International Journal of Computer Science and Security Volume (2) Issue (5)International Journal of Computer Science and Security Volume (2) Issue (5)
International Journal of Computer Science and Security Volume (2) Issue (5)CSCJournals
 
Finding Neighbors in Images Represented By Quadtree
Finding Neighbors in Images Represented By QuadtreeFinding Neighbors in Images Represented By Quadtree
Finding Neighbors in Images Represented By Quadtreeiosrjce
 
Hierarchical clustering techniques
Hierarchical clustering techniquesHierarchical clustering techniques
Hierarchical clustering techniquesMd Syed Ahamad
 
K mean-clustering algorithm
K mean-clustering algorithmK mean-clustering algorithm
K mean-clustering algorithmparry prabhu
 
The Hidden Geometry of Multiplex Networks @ Next Generation Network Analytics
The Hidden Geometry of Multiplex Networks @ Next Generation Network Analytics The Hidden Geometry of Multiplex Networks @ Next Generation Network Analytics
The Hidden Geometry of Multiplex Networks @ Next Generation Network Analytics Kolja Kleineberg
 

What's hot (20)

CLIM Program: Remote Sensing Workshop, Multilayer Modeling and Analysis of Co...
CLIM Program: Remote Sensing Workshop, Multilayer Modeling and Analysis of Co...CLIM Program: Remote Sensing Workshop, Multilayer Modeling and Analysis of Co...
CLIM Program: Remote Sensing Workshop, Multilayer Modeling and Analysis of Co...
 
Limits of Local Algorithms for Randomly Generated Constraint Satisfaction Pro...
Limits of Local Algorithms for Randomly Generated Constraint Satisfaction Pro...Limits of Local Algorithms for Randomly Generated Constraint Satisfaction Pro...
Limits of Local Algorithms for Randomly Generated Constraint Satisfaction Pro...
 
A NEW GENERALIZATION OF EDGE OVERLAP TO WEIGHTED NETWORKS
A NEW GENERALIZATION OF EDGE OVERLAP TO WEIGHTED NETWORKSA NEW GENERALIZATION OF EDGE OVERLAP TO WEIGHTED NETWORKS
A NEW GENERALIZATION OF EDGE OVERLAP TO WEIGHTED NETWORKS
 
CSMR11b.ppt
CSMR11b.pptCSMR11b.ppt
CSMR11b.ppt
 
Deep learning ensembles loss landscape
Deep learning ensembles loss landscapeDeep learning ensembles loss landscape
Deep learning ensembles loss landscape
 
Mean shift and Hierarchical clustering
Mean shift and Hierarchical clustering Mean shift and Hierarchical clustering
Mean shift and Hierarchical clustering
 
Fuzzy c means clustering protocol for wireless sensor networks
Fuzzy c means clustering protocol for wireless sensor networksFuzzy c means clustering protocol for wireless sensor networks
Fuzzy c means clustering protocol for wireless sensor networks
 
08 clustering
08 clustering08 clustering
08 clustering
 
26 spanning
26 spanning26 spanning
26 spanning
 
Hierarchical clustering
Hierarchical clusteringHierarchical clustering
Hierarchical clustering
 
International Journal of Computer Science and Security Volume (2) Issue (5)
International Journal of Computer Science and Security Volume (2) Issue (5)International Journal of Computer Science and Security Volume (2) Issue (5)
International Journal of Computer Science and Security Volume (2) Issue (5)
 
Finding Neighbors in Images Represented By Quadtree
Finding Neighbors in Images Represented By QuadtreeFinding Neighbors in Images Represented By Quadtree
Finding Neighbors in Images Represented By Quadtree
 
Hierarchical clustering techniques
Hierarchical clustering techniquesHierarchical clustering techniques
Hierarchical clustering techniques
 
Bistablecamnets
BistablecamnetsBistablecamnets
Bistablecamnets
 
H010223640
H010223640H010223640
H010223640
 
K mean-clustering algorithm
K mean-clustering algorithmK mean-clustering algorithm
K mean-clustering algorithm
 
Hierarchical Clustering
Hierarchical ClusteringHierarchical Clustering
Hierarchical Clustering
 
The Hidden Geometry of Multiplex Networks @ Next Generation Network Analytics
The Hidden Geometry of Multiplex Networks @ Next Generation Network Analytics The Hidden Geometry of Multiplex Networks @ Next Generation Network Analytics
The Hidden Geometry of Multiplex Networks @ Next Generation Network Analytics
 
Data miningpresentation
Data miningpresentationData miningpresentation
Data miningpresentation
 
Rough K Means - Numerical Example
Rough K Means - Numerical ExampleRough K Means - Numerical Example
Rough K Means - Numerical Example
 

Similar to Higher-order graph clustering at AMS Spring Western Sectional

Higher-order spectral graph clustering with motifs
Higher-order spectral graph clustering with motifsHigher-order spectral graph clustering with motifs
Higher-order spectral graph clustering with motifsAustin Benson
 
Higher-order organization of complex networks
Higher-order organization of complex networksHigher-order organization of complex networks
Higher-order organization of complex networksDavid Gleich
 
Spectral clustering with motifs and higher-order structures
Spectral clustering with motifs and higher-order structuresSpectral clustering with motifs and higher-order structures
Spectral clustering with motifs and higher-order structuresDavid Gleich
 
Investigation on the Pattern Synthesis of Subarray Weights for Low EMI Applic...
Investigation on the Pattern Synthesis of Subarray Weights for Low EMI Applic...Investigation on the Pattern Synthesis of Subarray Weights for Low EMI Applic...
Investigation on the Pattern Synthesis of Subarray Weights for Low EMI Applic...IOSRJECE
 
Suppression of grating lobes
Suppression of grating lobesSuppression of grating lobes
Suppression of grating lobeskavindrakrishna
 
Tensor Spectral Clustering
Tensor Spectral ClusteringTensor Spectral Clustering
Tensor Spectral ClusteringAustin Benson
 
Minimizing cost in distributed multiquery processing applications
Minimizing cost in distributed multiquery processing applicationsMinimizing cost in distributed multiquery processing applications
Minimizing cost in distributed multiquery processing applicationsLuis Galárraga
 
Wereszczynski Molecular Dynamics
Wereszczynski Molecular DynamicsWereszczynski Molecular Dynamics
Wereszczynski Molecular DynamicsSciCompIIT
 
Learning multifractal structure in large networks (Purdue ML Seminar)
Learning multifractal structure in large networks (Purdue ML Seminar)Learning multifractal structure in large networks (Purdue ML Seminar)
Learning multifractal structure in large networks (Purdue ML Seminar)Austin Benson
 
3 d active meshes for cell tracking
3 d active meshes for cell tracking3 d active meshes for cell tracking
3 d active meshes for cell trackingPrashant Pal
 
Computation of electromagnetic fields scattered from dielectric objects of un...
Computation of electromagnetic fields scattered from dielectric objects of un...Computation of electromagnetic fields scattered from dielectric objects of un...
Computation of electromagnetic fields scattered from dielectric objects of un...Alexander Litvinenko
 
Answers withexplanations
Answers withexplanationsAnswers withexplanations
Answers withexplanationsGopi Saiteja
 
11ClusAdvanced.ppt
11ClusAdvanced.ppt11ClusAdvanced.ppt
11ClusAdvanced.pptSueMiu
 
Chapter 11. Cluster Analysis Advanced Methods.ppt
Chapter 11. Cluster Analysis Advanced Methods.pptChapter 11. Cluster Analysis Advanced Methods.ppt
Chapter 11. Cluster Analysis Advanced Methods.pptSubrata Kumer Paul
 
International Journal of Engineering Research and Development
International Journal of Engineering Research and DevelopmentInternational Journal of Engineering Research and Development
International Journal of Engineering Research and DevelopmentIJERD Editor
 
Paper id 26201482
Paper id 26201482Paper id 26201482
Paper id 26201482IJRAT
 

Similar to Higher-order graph clustering at AMS Spring Western Sectional (20)

Higher-order spectral graph clustering with motifs
Higher-order spectral graph clustering with motifsHigher-order spectral graph clustering with motifs
Higher-order spectral graph clustering with motifs
 
Higher-order organization of complex networks
Higher-order organization of complex networksHigher-order organization of complex networks
Higher-order organization of complex networks
 
Spectral clustering with motifs and higher-order structures
Spectral clustering with motifs and higher-order structuresSpectral clustering with motifs and higher-order structures
Spectral clustering with motifs and higher-order structures
 
Investigation on the Pattern Synthesis of Subarray Weights for Low EMI Applic...
Investigation on the Pattern Synthesis of Subarray Weights for Low EMI Applic...Investigation on the Pattern Synthesis of Subarray Weights for Low EMI Applic...
Investigation on the Pattern Synthesis of Subarray Weights for Low EMI Applic...
 
Suppression of grating lobes
Suppression of grating lobesSuppression of grating lobes
Suppression of grating lobes
 
Tensor Spectral Clustering
Tensor Spectral ClusteringTensor Spectral Clustering
Tensor Spectral Clustering
 
Minimizing cost in distributed multiquery processing applications
Minimizing cost in distributed multiquery processing applicationsMinimizing cost in distributed multiquery processing applications
Minimizing cost in distributed multiquery processing applications
 
Wereszczynski Molecular Dynamics
Wereszczynski Molecular DynamicsWereszczynski Molecular Dynamics
Wereszczynski Molecular Dynamics
 
Csmr11b.ppt
Csmr11b.pptCsmr11b.ppt
Csmr11b.ppt
 
Learning multifractal structure in large networks (Purdue ML Seminar)
Learning multifractal structure in large networks (Purdue ML Seminar)Learning multifractal structure in large networks (Purdue ML Seminar)
Learning multifractal structure in large networks (Purdue ML Seminar)
 
2009 asilomar
2009 asilomar2009 asilomar
2009 asilomar
 
3 d active meshes for cell tracking
3 d active meshes for cell tracking3 d active meshes for cell tracking
3 d active meshes for cell tracking
 
Lausanne 2019 #4
Lausanne 2019 #4Lausanne 2019 #4
Lausanne 2019 #4
 
Computation of electromagnetic fields scattered from dielectric objects of un...
Computation of electromagnetic fields scattered from dielectric objects of un...Computation of electromagnetic fields scattered from dielectric objects of un...
Computation of electromagnetic fields scattered from dielectric objects of un...
 
Answers withexplanations
Answers withexplanationsAnswers withexplanations
Answers withexplanations
 
11 clusadvanced
11 clusadvanced11 clusadvanced
11 clusadvanced
 
11ClusAdvanced.ppt
11ClusAdvanced.ppt11ClusAdvanced.ppt
11ClusAdvanced.ppt
 
Chapter 11. Cluster Analysis Advanced Methods.ppt
Chapter 11. Cluster Analysis Advanced Methods.pptChapter 11. Cluster Analysis Advanced Methods.ppt
Chapter 11. Cluster Analysis Advanced Methods.ppt
 
International Journal of Engineering Research and Development
International Journal of Engineering Research and DevelopmentInternational Journal of Engineering Research and Development
International Journal of Engineering Research and Development
 
Paper id 26201482
Paper id 26201482Paper id 26201482
Paper id 26201482
 

More from Austin Benson

Hypergraph Cuts with General Splitting Functions (JMM)
Hypergraph Cuts with General Splitting Functions (JMM)Hypergraph Cuts with General Splitting Functions (JMM)
Hypergraph Cuts with General Splitting Functions (JMM)Austin Benson
 
Spectral embeddings and evolving networks
Spectral embeddings and evolving networksSpectral embeddings and evolving networks
Spectral embeddings and evolving networksAustin Benson
 
Computational Frameworks for Higher-order Network Data Analysis
Computational Frameworks for Higher-order Network Data AnalysisComputational Frameworks for Higher-order Network Data Analysis
Computational Frameworks for Higher-order Network Data AnalysisAustin Benson
 
Higher-order link prediction and other hypergraph modeling
Higher-order link prediction and other hypergraph modelingHigher-order link prediction and other hypergraph modeling
Higher-order link prediction and other hypergraph modelingAustin Benson
 
Hypergraph Cuts with General Splitting Functions
Hypergraph Cuts with General Splitting FunctionsHypergraph Cuts with General Splitting Functions
Hypergraph Cuts with General Splitting FunctionsAustin Benson
 
Hypergraph Cuts with General Splitting Functions
Hypergraph Cuts with General Splitting FunctionsHypergraph Cuts with General Splitting Functions
Hypergraph Cuts with General Splitting FunctionsAustin Benson
 
Higher-order link prediction
Higher-order link predictionHigher-order link prediction
Higher-order link predictionAustin Benson
 
Simplicial closure & higher-order link prediction
Simplicial closure & higher-order link predictionSimplicial closure & higher-order link prediction
Simplicial closure & higher-order link predictionAustin Benson
 
Three hypergraph eigenvector centralities
Three hypergraph eigenvector centralitiesThree hypergraph eigenvector centralities
Three hypergraph eigenvector centralitiesAustin Benson
 
Semi-supervised learning of edge flows
Semi-supervised learning of edge flowsSemi-supervised learning of edge flows
Semi-supervised learning of edge flowsAustin Benson
 
Choosing to grow a graph
Choosing to grow a graphChoosing to grow a graph
Choosing to grow a graphAustin Benson
 
Link prediction in networks with core-fringe structure
Link prediction in networks with core-fringe structureLink prediction in networks with core-fringe structure
Link prediction in networks with core-fringe structureAustin Benson
 
Higher-order Link Prediction GraphEx
Higher-order Link Prediction GraphExHigher-order Link Prediction GraphEx
Higher-order Link Prediction GraphExAustin Benson
 
Higher-order Link Prediction Syracuse
Higher-order Link Prediction SyracuseHigher-order Link Prediction Syracuse
Higher-order Link Prediction SyracuseAustin Benson
 
Random spatial network models for core-periphery structure
Random spatial network models for core-periphery structureRandom spatial network models for core-periphery structure
Random spatial network models for core-periphery structureAustin Benson
 
Random spatial network models for core-periphery structure.
Random spatial network models for core-periphery structure.Random spatial network models for core-periphery structure.
Random spatial network models for core-periphery structure.Austin Benson
 
Simplicial closure & higher-order link prediction
Simplicial closure & higher-order link predictionSimplicial closure & higher-order link prediction
Simplicial closure & higher-order link predictionAustin Benson
 
Simplicial closure and simplicial diffusions
Simplicial closure and simplicial diffusionsSimplicial closure and simplicial diffusions
Simplicial closure and simplicial diffusionsAustin Benson
 
Sampling methods for counting temporal motifs
Sampling methods for counting temporal motifsSampling methods for counting temporal motifs
Sampling methods for counting temporal motifsAustin Benson
 
Set prediction three ways
Set prediction three waysSet prediction three ways
Set prediction three waysAustin Benson
 

More from Austin Benson (20)

Hypergraph Cuts with General Splitting Functions (JMM)
Hypergraph Cuts with General Splitting Functions (JMM)Hypergraph Cuts with General Splitting Functions (JMM)
Hypergraph Cuts with General Splitting Functions (JMM)
 
Spectral embeddings and evolving networks
Spectral embeddings and evolving networksSpectral embeddings and evolving networks
Spectral embeddings and evolving networks
 
Computational Frameworks for Higher-order Network Data Analysis
Computational Frameworks for Higher-order Network Data AnalysisComputational Frameworks for Higher-order Network Data Analysis
Computational Frameworks for Higher-order Network Data Analysis
 
Higher-order link prediction and other hypergraph modeling
Higher-order link prediction and other hypergraph modelingHigher-order link prediction and other hypergraph modeling
Higher-order link prediction and other hypergraph modeling
 
Hypergraph Cuts with General Splitting Functions
Hypergraph Cuts with General Splitting FunctionsHypergraph Cuts with General Splitting Functions
Hypergraph Cuts with General Splitting Functions
 
Hypergraph Cuts with General Splitting Functions
Hypergraph Cuts with General Splitting FunctionsHypergraph Cuts with General Splitting Functions
Hypergraph Cuts with General Splitting Functions
 
Higher-order link prediction
Higher-order link predictionHigher-order link prediction
Higher-order link prediction
 
Simplicial closure & higher-order link prediction
Simplicial closure & higher-order link predictionSimplicial closure & higher-order link prediction
Simplicial closure & higher-order link prediction
 
Three hypergraph eigenvector centralities
Three hypergraph eigenvector centralitiesThree hypergraph eigenvector centralities
Three hypergraph eigenvector centralities
 
Semi-supervised learning of edge flows
Semi-supervised learning of edge flowsSemi-supervised learning of edge flows
Semi-supervised learning of edge flows
 
Choosing to grow a graph
Choosing to grow a graphChoosing to grow a graph
Choosing to grow a graph
 
Link prediction in networks with core-fringe structure
Link prediction in networks with core-fringe structureLink prediction in networks with core-fringe structure
Link prediction in networks with core-fringe structure
 
Higher-order Link Prediction GraphEx
Higher-order Link Prediction GraphExHigher-order Link Prediction GraphEx
Higher-order Link Prediction GraphEx
 
Higher-order Link Prediction Syracuse
Higher-order Link Prediction SyracuseHigher-order Link Prediction Syracuse
Higher-order Link Prediction Syracuse
 
Random spatial network models for core-periphery structure
Random spatial network models for core-periphery structureRandom spatial network models for core-periphery structure
Random spatial network models for core-periphery structure
 
Random spatial network models for core-periphery structure.
Random spatial network models for core-periphery structure.Random spatial network models for core-periphery structure.
Random spatial network models for core-periphery structure.
 
Simplicial closure & higher-order link prediction
Simplicial closure & higher-order link predictionSimplicial closure & higher-order link prediction
Simplicial closure & higher-order link prediction
 
Simplicial closure and simplicial diffusions
Simplicial closure and simplicial diffusionsSimplicial closure and simplicial diffusions
Simplicial closure and simplicial diffusions
 
Sampling methods for counting temporal motifs
Sampling methods for counting temporal motifsSampling methods for counting temporal motifs
Sampling methods for counting temporal motifs
 
Set prediction three ways
Set prediction three waysSet prediction three ways
Set prediction three ways
 

Recently uploaded

Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...amitlee9823
 
Discover Why Less is More in B2B Research
Discover Why Less is More in B2B ResearchDiscover Why Less is More in B2B Research
Discover Why Less is More in B2B Researchmichael115558
 
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...Valters Lauzums
 
Midocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxMidocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxolyaivanovalion
 
Detecting Credit Card Fraud: A Machine Learning Approach
Detecting Credit Card Fraud: A Machine Learning ApproachDetecting Credit Card Fraud: A Machine Learning Approach
Detecting Credit Card Fraud: A Machine Learning ApproachBoston Institute of Analytics
 
➥🔝 7737669865 🔝▻ mahisagar Call-girls in Women Seeking Men 🔝mahisagar🔝 Esc...
➥🔝 7737669865 🔝▻ mahisagar Call-girls in Women Seeking Men  🔝mahisagar🔝   Esc...➥🔝 7737669865 🔝▻ mahisagar Call-girls in Women Seeking Men  🔝mahisagar🔝   Esc...
➥🔝 7737669865 🔝▻ mahisagar Call-girls in Women Seeking Men 🔝mahisagar🔝 Esc...amitlee9823
 
April 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's AnalysisApril 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's Analysismanisha194592
 
➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men 🔝Mathura🔝 Escorts...
➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men  🔝Mathura🔝   Escorts...➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men  🔝Mathura🔝   Escorts...
➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men 🔝Mathura🔝 Escorts...amitlee9823
 
Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...amitlee9823
 
Probability Grade 10 Third Quarter Lessons
Probability Grade 10 Third Quarter LessonsProbability Grade 10 Third Quarter Lessons
Probability Grade 10 Third Quarter LessonsJoseMangaJr1
 
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...amitlee9823
 
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night StandCall Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Standamitlee9823
 
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightCheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightDelhi Call girls
 
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...only4webmaster01
 
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...amitlee9823
 
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...amitlee9823
 
➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men 🔝Dindigul🔝 Escor...
➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men  🔝Dindigul🔝   Escor...➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men  🔝Dindigul🔝   Escor...
➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men 🔝Dindigul🔝 Escor...amitlee9823
 

Recently uploaded (20)

CHEAP Call Girls in Rabindra Nagar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Rabindra Nagar  (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Rabindra Nagar  (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Rabindra Nagar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
 
Discover Why Less is More in B2B Research
Discover Why Less is More in B2B ResearchDiscover Why Less is More in B2B Research
Discover Why Less is More in B2B Research
 
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
 
Midocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxMidocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFx
 
Anomaly detection and data imputation within time series
Anomaly detection and data imputation within time seriesAnomaly detection and data imputation within time series
Anomaly detection and data imputation within time series
 
Detecting Credit Card Fraud: A Machine Learning Approach
Detecting Credit Card Fraud: A Machine Learning ApproachDetecting Credit Card Fraud: A Machine Learning Approach
Detecting Credit Card Fraud: A Machine Learning Approach
 
Abortion pills in Jeddah | +966572737505 | Get Cytotec
Abortion pills in Jeddah | +966572737505 | Get CytotecAbortion pills in Jeddah | +966572737505 | Get Cytotec
Abortion pills in Jeddah | +966572737505 | Get Cytotec
 
➥🔝 7737669865 🔝▻ mahisagar Call-girls in Women Seeking Men 🔝mahisagar🔝 Esc...
➥🔝 7737669865 🔝▻ mahisagar Call-girls in Women Seeking Men  🔝mahisagar🔝   Esc...➥🔝 7737669865 🔝▻ mahisagar Call-girls in Women Seeking Men  🔝mahisagar🔝   Esc...
➥🔝 7737669865 🔝▻ mahisagar Call-girls in Women Seeking Men 🔝mahisagar🔝 Esc...
 
April 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's AnalysisApril 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's Analysis
 
➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men 🔝Mathura🔝 Escorts...
➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men  🔝Mathura🔝   Escorts...➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men  🔝Mathura🔝   Escorts...
➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men 🔝Mathura🔝 Escorts...
 
Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
 
Probability Grade 10 Third Quarter Lessons
Probability Grade 10 Third Quarter LessonsProbability Grade 10 Third Quarter Lessons
Probability Grade 10 Third Quarter Lessons
 
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
 
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night StandCall Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
 
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightCheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
 
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
 
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
 
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
 
➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men 🔝Dindigul🔝 Escor...
➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men  🔝Dindigul🔝   Escor...➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men  🔝Dindigul🔝   Escor...
➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men 🔝Dindigul🔝 Escor...
 

Higher-order graph clustering at AMS Spring Western Sectional

  • 1. Higher-order graph clustering Austin Benson Stanford University AMS Spring Western Sectional Pullman, WA April 22, 2017 Joint work with David Gleich, Purdue Jure Leskovec, Stanford
  • 2. A C B A C B 2 Unifying two themes of network analysis Co-author network 1. Real-world networks have modular organization. [Newman04, Newman06] 2. Higher-order connectivity patterns, or network motifs, drive complex systems. [Milo+02, Yaveroğlu+14]. Edge-based clustering and community detection sometimes expose this structure. Signed feed-forward loops in genetic transcription. [Mangan+03, Alon07] Triangles in social networks. [Rapoport53, Granovetter73] Bi-directed length-2 paths in brain networks. [Sporns-Kötter04, Sporns+07] A CA C
  • 3. 3 Motifs are the fundamental units of complex networks. We should look for structure (clusters) in terms of them.
  • 4. Network Motif Different motifs give different clusters. 4 Higher-order graph clustering is our technique for finding clusters based on motifs
  • 5. 5 § We will generalize spectral clustering, a classical technique to find clusters or communities in a graph, to use motifs as the fundamental unit to partition. § Based on a higher-order (motif-based) conductance metric that generalizes the traditional conductance. § Comes with theoretical guarantees. § We’ll first briefly review how spectral clustering works. § Then we’ll see how to adapt it to work with network motifs. § Then we’ll see the impact of this approach on various real-world data. Higher-order graph clustering Main points and overview
  • 6. Graph Laplacian Eigenvector 6 Background spectral clustering is a classic technique to partition graphs by looking at eigenvectors Cluster [Fiedler73]
  • 7. Conductance is one of the most important cluster quality scores [Schaeffer07] used in Markov chain theory, spectral clustering, bioinformatics, vision, etc. The conductance of a set of vertices S is the ratio of edges leaving to total edges small conductance ó good cluster (edges leaving S) (edge end points in S) 7 S cut(S) = 7 vol(S) = 85 vol( ¯S) = 151 ffi(S) = 7/85 Background spectral clustering works based on conductance S (S) = cut(S) min(vol(S), vol(S))
  • 8. Cheeger Inequality Finding the best conductance set is NP-hard. L § Cheeger realized the eigenvalues of the Laplacian provided surface area to volume bounds in manifolds. § Alon and Milman independently realized the same thing for a graph (conductance)! Laplacian 2 ⇤/2  2  2 ⇤ 0 = 1  2  ...  n  2 ⇤ = set of smallest conductance 8 Background spectral clustering has theoretical guarantees [Cheeger70, Alon-Milman85] D = diag(Ae) L = D−1/2 (A − D)D−1/2 Eigenvalues of the Laplacian L
  • 9. We can find a set S that achieves the Cheeger bound. 1. Compute the eigenvector z associated with λ2 and scale to f = D-1/2z 2. Sort the vertices by their values in f: σ1, σ2, …, σn 3. Let Sr = {σ1, …, σr} and compute the conductance of φ(Sr) of each Sr. 4. Pick the set Sm with minimum conductance. 9 Background the sweep cut algorithm realizes the guarantee [Mihail89, Chung92] 0 20 40 0 0.2 0.4 0.6 0.8 1 S i φi (Sm) 2 Sr
  • 10. 0 20 40 0 0.2 0.4 0.6 0.8 1 Si φ i 10 Background the sweep cut visualized [Mihail89, Chung92] Sr (S) = cut(S) min(vol(S), vol(S))
  • 11. We want to cluster with richer data Motifs that may be directed, signed, colored, feature-valued, etc. 11 Spectral clustering is theoretically justified for finding edge-based clusters in undirected, simple graphs. Signed feed-forward loops in genetic transcription [Mangan+03] Gene X activates transcription in gene Y. Gene X suppresses transcription in gene Z. Gene Y suppresses transcription in gene Z.X Y Z
  • 12. Benson-Gleich-Leskovec, Science, 2016. 12 Our contributions § A generalized conductance metric for motifs. § A new spectral clustering algorithm to minimize the generalized conductance. § AND an associated motif Cheeger inequality guarantee. § Naturally handles directed, signed, colored, weighted, and combinations of motifs. § Scales to networks with billions of edges. § Applications in ecology, biology, and transportation. A B wijk
  • 13. 13 How do we find clusters based on motifs?
  • 14. Motif-based conductance Need new notions of cut and volume 14 φM (S) = cutM (S) min(volM (S), volM (S)) (S) = cut(S) min(vol(S), vol(S)) vol(S) = #(edge end points in S) cut(S) = #(edges cut) cutM (S) = #(motifs cut) volM (S) = #(motif end points in S) S S S S S M = triangle motif
  • 15. Motif-based conductance Motif M M (S) = motifs cut motif volume = 1 8 15 9 10 3 5 6 7 8 1 2 4 9 10 3 5 6 7 8 1 2 4 S S
  • 16. Problem Given a motif M and a graph G, we want to find a set of nodes S that minimizes motif conductance This is NP-hard. [Wagner-Wagner93] Our solution Generalize spectral clustering for motifs 1. Form new weighted, undirected graph W(M) based on M and G 2. Compute Fiedler vector of Laplacian matrix of W(M) [Fiedler73, Alon-Milman85] 3. Use “sweep cut” procedure to output clusters [Mihail89, Chung92] Theorem (motif Cheeger inequality) resulting clusters will obtain near optimal motif conductance 16 Higher-order clustering
  • 17. Motif-based spectral clustering 1 2 3 4 5 6 7 8 9 10 1 11 1 1 1 3 1 1 1 1 1 2 1 1 17 9 10 3 5 6 7 8 1 2 4 W(M) ij = #{instances of motif M that contain nodes i and j} motif M Step 1. Given directed graph G and motif M, form a weighted graph W(M). graph G weighted graph W(M)
  • 18. Key insight Classical spectral clustering on weighted graph W(M) finds clusters of low motif conductance. M (S) = motifs cut motif volume = 1 10 1 2 3 4 5 6 7 8 9 10 1 11 1 1 1 3 1 1 1 1 1 2 1 1 18 Motif-based spectral clustering Step 1. Given directed graph G and motif M, form a weighted graph W(M). motif M W(M) ij = #{instances of motif M that contain nodes i and j} W(M)
  • 19. Motif-based spectral clustering 19 Step 2. Compute the eigenvector f(M) associated with λ2 of the normalized Laplacian matrix of W(M) 1 2 3 4 5 6 7 8 9 10 -0.25 -0.2 -0.15 -0.1 -0.05 0 0.05 0.1 0.15 Takes roughly O(# edges) time. D = diag(W(M) e) L(M) = D−1/2 (D − W(M) )D−1/2 L(M) z = –2z f (M) = D−1/2 z f(M)
  • 20. Motif-based spectral clustering 20 Step 3 (motif sweep cut) [Mihail89,Chung92] § Sort nodes by values in f(M) → σ1, σ2, …σn. § Pick set Sr = {σ1, …, σr} with smallest motif conductance. 9 10 3 5 6 7 8 1 2 4
  • 21. Motif Cheeger inequality Theorem If the motif has three nodes, then the sweep procedure on the weighted graph finds a set S of nodes for which For 4+ nodes, need slightly different notion of conductance. 21 M(G) = {instances of M in G} Key Proof Step cutM (S, G) = X {i,j,k}2M(G) Indicator[xi , xj , xk not the same] = 1 4 (x2 i + x2 j + x2 k xi xj xj xk xi xk ) = quadratic in x M (S) 2 M
  • 22. Applications 22 1. We do not know the motif of interest. food webs and new applications 2. We know the motif of interest from domain knowledge. yeast transcription regulation networks, connectome, social networks 3. We seek richer information from our data. transportation networks and new applications
  • 23. 23 Application 1 We do not know the motif of interest.
  • 24. Application 1 Food webs Florida bay food web § Nodes are species § Edges represent carbon exchange i → j if j eats i § Motifs represent energy flow patterns 24 http://marinebio.org/oceans/marine-zones/ M6M5
  • 25. Application 1 Food webs Which motif clusters the food web? Our approach § Run motif spectral clustering for all 3-node motifs as well as for just edges. § Examine the sweep profile to see which motif gives the best clusters. 25
  • 26. Application 1 Food webs 26 M6 M5 edge Our finding motif M6 organizes the food web into good clusters M6M5 edge
  • 27. B D Micronutrient sources Pelagic fishes and benthic prey Benthic macro- invertebrates Benthic Fishes Motif M6 reveals aquatic layers A 61% accuracy vs. 48% with edge- based methods 27 Application 1 Food webs
  • 28. 28 Application 2 We know the motif of interest from domain knowledge.
  • 29. Application 2 Yeast transcription regulation networks § Nodes are groups of genes § Edge i → j means i regulates transcription to j § Sign + / - denotes activation / suppression § Coherent feedforward loops encode biological function [Mangan+03, Alon07] A CC A 29
  • 30. Clustering based on coherent feedforward loops identifies functions studied individually by biologists [Mangan+03] 97% accuracy vs. 68–82% with edge-based methods A B A CA C A B 30 Application 2 Yeast transcription regulation networks
  • 31. A B 31 Structure of the found modules (all edge signs are positive) Application 2 Yeast transcription regulation networks
  • 32. 32 Application 3 We seek richer information from our data.
  • 33. Application 3 Transportation networks § North American air transport network. § Nodes are cites. § i → j if you can travel from i to j in < 8 hours. [Frey-Dueck07] 33
  • 34. Accepted pending m B A Important motifs from literature [Rosvall+2014] W (M) ij = #{bi-directional length-2 paths from i to j} Weighted adjacency matrix already reveals hub-like structure 34 Application 3 Transportation networks
  • 35. Top 8 U.S. hubs East coast non-hubs West coast non-hubs Primary spectral coordinate Atlanta, the top hub, is next to Salina, a non-hub. MOTIF SPECTRAL EMBEDDING EDGE SPECTRAL EMBEDDING 35 Application 3 Transportation networks
  • 36. 36 §Generalization of graph clustering to higher-order structures (motifs) through a new objective (motif conductance). §Generalizing old ideas from spectral graph theory admits a new algorithm and a motif Cheeger inequality. §Applications in ecology, biology, transportation. Recap Higher-order graph clustering
  • 37. # form W0 … W4 sc0 = spectral_cut(W0) sc1 = spectral_cut(W1) sc2 = spectral_cut(W2) sc3 = spectral_cut(W3) sc4 = spectral_cut(W4) plt = x -> semilogx( x.sweepcut_profile. conductance) plt(sc0) plt(sc1) plt(sc2) plt(sc3) plt(sc4) Find me at… http://stanford.edu/~arbenson @austinbenson arbenson@stanford.edu Thanks! Austin Benson Stanford University For fun reproduce some results from this talk! Jupyter notebooks at bit.ly/higher-order-julia Higher-order graph clustering Paper Benson-Gleich-Leskovec, Higher-order organization of complex networks, Science, 2016 Code + data http://snap.stanford.edu/higher-order