MINING AND MODELING
NETWORK DATA
SIAM DM'18 1
SIAM DM 2018
June 7, 2018
Denver, Colorado, USA
Act 1.
09:30–09:55 New Perspectives on Measuring Network Clustering
10:00–10:25 Hypergraph Kronecker Models for Networks
10:30–10:55 Mitigating Overexposure in Viral Marketing
11:00–11:25 Modeling and Mining Dynamic Competition Networks
Act 2.
2:45–3:10 Graph Matching Via Low Rank Factors
3:15–3:40 Tuning the Activity of Neural Networks at Criticality
3:45–4:10 Detectability of Hierarchical Community Structure in Preprocessed Multilayer Networks
4:15–4:40 Evaluating Overfit and Underfit in Models of Network Community Structure
Davis and Leinhardt. The structure of positive interpersonal
relations in small groups. Sociological Theories in Progress,
1971.
New perspectives on
measuring network
clustering
Austin R. Benson · Cornell
Joint work with
Hao Yin · Stanford
Jure Leskovec · Stanford
Johan Ugander · Stanford
slides ⟶ bit.ly/arb-DM-18
code ⟶ github.com/arbenson/HigherOrderClustering.jl
Many networks are globally sparse but locally
dense.
Coauthorship network
Brain network [Sporns and Bullmore, Nature Rev. Neuro., 2012]
Networks for real-world systems have modules, clusters, communities.
[Watts-Strogatz 98; Flake 00; Newman 04, 06; many others…]
The clustering coefficient is a fundamental
measure in network science about how much a
network clusters.
C(u) = fraction of length-2 paths centered at node u
that form a triangle.
Average clustering coefficient C = mean of C(u).
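The two definitions above can be sketched in a few lines of Python. This is a minimal illustration, not the talk's code: the adjacency-dict representation, the function names, and the toy graph are assumptions, and so is the choice to average only over nodes of degree at least 2, where C(u) is defined.

```python
from itertools import combinations

def local_clustering(adj, u):
    """C(u): fraction of length-2 paths centered at u that form a triangle."""
    nbrs = adj[u]
    if len(nbrs) < 2:
        return None  # no length-2 path is centered at u
    pairs = list(combinations(nbrs, 2))
    closed = sum(1 for v, w in pairs if w in adj[v])
    return closed / len(pairs)

def average_clustering(adj):
    """Mean of C(u) over nodes where C(u) is defined (an assumed convention)."""
    values = [c for c in (local_clustering(adj, u) for u in adj) if c is not None]
    return sum(values) / len(values)

# Toy graph (made up for illustration): triangle 0-1-2 plus node 3 attached to 0.
adj = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0}}
print(local_clustering(adj, 0), average_clustering(adj))
```

In the toy graph, node 0 has three neighbor pairs of which one is connected, while nodes 1 and 2 sit in a fully closed neighborhood.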
• Data insights. Average clustering coefficient is larger than we would expect.
[Watts-Strogatz 98] > 36k citations!
• Domain phenomenon. Triadic closure in sociology.
[Simmel 1908; Rapoport 53; Granovetter 73]
• Statistical feature. Role discovery, anomaly detection, mental health study.
[Henderson+ 12; La Fond+ 14, 16; Bearman-Moody 2004]
• Modeling tool. Key property for generative models.
[Newman 09; Seshadhri-Kolda-Pinar 12; Roble+ 16]
This talk introduces two new classes of network
clustering measures with an eye towards data
mining.
1. Higher-order clustering coefficients.
The clustering coefficient measures the closure
probability of just one simple structure—the triangle.
We will show that triangles are sufficient to explain clustering
in only some networks. We need larger cliques.
There is evidence in the literature that this should be true…
• 4-cliques reveal community structure in word association and PPI networks [Palla+ 05]
• 4-/5-cliques (+ other structure) identify network type & dimension [Yaveroğlu+ 14, Bonato+ 14]
• 4-node motifs identify community structure in neural systems [Benson-Gleich-Leskovec 16]
This talk introduces two new classes of network
clustering measures with an eye towards data
mining.
2. Closure coefficients.
Why do we measure clustering from the center node?
The well-known proverb "a friend of my friend is my friend" suggests a different way of measuring clustering.
We will show that
• large closure coefficients theoretically imply local community structure
• closure coefficients with directed edges expose hierarchy
Part I. Higher-order clustering coefficients
Yin, Benson, and Leskovec.
Higher-order clustering in networks.
Physical Review E, 2018.
github.com/arbenson/HigherOrderClustering.jl
We view clustering as a clique expansion process.
1. Find a 2-clique. 2. Attach an adjacent edge. 3. Check for a (2+1)-clique.
1. Find a 3-clique. 2. Attach an adjacent edge. 3. Check for a (3+1)-clique.
1. Find a 4-clique. 2. Attach an adjacent edge. 3. Check for a (4+1)-clique.
C2 = avg. fraction of (2-clique, adjacent edge) pairs that induce a (2+1)-clique.
C3 = avg. fraction of (3-clique, adjacent edge) pairs that induce a (3+1)-clique.
C4 = avg. fraction of (4-clique, adjacent edge) pairs that induce a (4+1)-clique.
Increase clique size by 1 to get a higher-order clustering coefficient!
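The clique expansion process can be sketched by brute-force enumeration. This is an illustrative sketch under stated assumptions, not the implementation in HigherOrderClustering.jl: the function name, the adjacency-dict input, and the toy graphs are hypothetical, and each (clique, adjacent edge) pair is taken to share the node u as its center.

```python
from itertools import combinations

def local_higher_order_cc(adj, u, order):
    """Fraction of (order-clique containing u, adjacent edge at u) pairs
    that induce an (order+1)-clique. Returns None if no such pair exists."""
    nbrs = adj[u]
    wedges = closed = 0
    # An order-clique containing u is u plus (order - 1) mutually adjacent neighbors.
    for others in combinations(sorted(nbrs), order - 1):
        if any(b not in adj[a] for a, b in combinations(others, 2)):
            continue  # the chosen neighbors do not form a clique
        # Attach an adjacent edge (u, v) with v outside the clique.
        for v in nbrs - set(others):
            wedges += 1
            # The pair closes if v is adjacent to every other clique member.
            if all(v in adj[w] for w in others):
                closed += 1
    return closed / wedges if wedges else None

# Toy graphs (made up for illustration).
k4 = {u: {v for v in range(4) if v != u} for u in range(4)}
tri_pendant = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0}}
print(local_higher_order_cc(k4, 0, 3), local_higher_order_cc(tri_pendant, 0, 3))
```

With order=2 this reduces to the classical local clustering coefficient. The enumeration is exponential in the order, so it only suits small graphs; it is meant to mirror the three steps above rather than to be fast.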
We can think of higher-order closure processes in
everyday life.
1. Start with a group of 3 friends (Alice, Bob, Charlie).
2. One person in the group befriends someone new (Dave).
3. The group might increase in size.
[Images: rollingstone.com, oprah.com]
Higher-order clustering coefficients offer
several advantages.
Theory & analysis.
• Small-world and Gn,p random graph models.
• Extremal combinatorics for general graphs.
Data insights.
• old idea ⟶ pretty much all real-world networks exhibit clustering.
• new idea ⟶ real-world networks may only cluster up to a certain order.
Background.
Local, average, and global clustering coefficients.
• Second-order (classical) local clustering coefficient at node u.
• Second-order (classical) global clustering coefficient.
• Second-order (classical) average clustering coefficient.
[Formulas appear as images on the original slide.]
Higher-order (third-order)
local, average, and global clustering coefficients.
• Third-order local clustering coefficient at node u.
• Third-order global clustering coefficient.
• Third-order average clustering coefficient.
[Formulas appear as images on the original slide.]
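Since the slide formulas did not survive extraction, here is one way to write the local, average, and global coefficients of order ℓ, derived only from the (ℓ-clique, adjacent edge) definition given earlier in the talk. The notation is an assumption, not taken from the slides: d_u is the degree of u, K_ℓ(u) is the number of ℓ-cliques containing u, K_ℓ is the total number of ℓ-cliques, and V_ℓ is the set of nodes with at least one (ℓ-clique, adjacent edge) pair. Averaging over V_ℓ only is also an assumed convention.

```latex
% Counting argument: an l-clique at u can be extended by any of the
% d_u - (l - 1) neighbors outside it; each (l+1)-clique at u closes l such pairs.
\begin{align*}
C_\ell(u) &= \frac{\ell \, K_{\ell+1}(u)}{(d_u - \ell + 1)\, K_\ell(u)}, \\
\bar{C}_\ell &= \frac{1}{|V_\ell|} \sum_{u \in V_\ell} C_\ell(u), \\
C_\ell &= \frac{\sum_u \ell \, K_{\ell+1}(u)}{\sum_u (d_u - \ell + 1)\, K_\ell(u)}
        = \frac{(\ell^2 + \ell)\, K_{\ell+1}}{\sum_u (d_u - \ell + 1)\, K_\ell(u)}.
\end{align*}
```

For ℓ = 2, K_2(u) = d_u and K_3(u) is the number of triangles at u, so C_2(u) = 2 K_3(u) / (d_u (d_u − 1)), the classical local clustering coefficient; ℓ = 3 gives the third-order case, C_3(u) = 3 K_4(u) / ((d_u − 2) K_3(u)).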
Theorem [Watts-Strogatz 98]
We can analyze higher-order clustering with
small-world models.
• Start with n nodes and edges to 2k neighbors
and then rewire each edge with probability p.
n = 16
k = 3
p = 0
[Yin-Benson-Leskovec 18]
[Watts-Strogatz 98]
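The small-world construction described above (ring lattice with 2k neighbors, then rewiring with probability p) can be sketched as follows. This is a simplified illustration: the function names, the seed, and the exact rewiring rule (move one endpoint to a uniformly random non-neighbor) are assumptions, and the clustering helper is a brute-force version of the clique expansion definition.

```python
import random
from itertools import combinations

def ring_lattice(n, k):
    """n nodes on a ring, each joined to its k nearest neighbors on both sides."""
    adj = {u: set() for u in range(n)}
    for u in range(n):
        for j in range(1, k + 1):
            v = (u + j) % n
            adj[u].add(v)
            adj[v].add(u)
    return adj

def rewire(adj, k, p, seed=0):
    """Visit each lattice edge (u, u+j) once; with probability p replace it
    by an edge from u to a uniformly random non-neighbor (assumed rule)."""
    rng = random.Random(seed)
    n = len(adj)
    adj = {u: set(vs) for u, vs in adj.items()}  # work on a copy
    for j in range(1, k + 1):
        for u in range(n):
            v = (u + j) % n
            if v in adj[u] and rng.random() < p:
                candidates = [w for w in range(n) if w != u and w not in adj[u]]
                if candidates:
                    w = rng.choice(candidates)
                    adj[u].discard(v)
                    adj[v].discard(u)
                    adj[u].add(w)
                    adj[w].add(u)
    return adj

def local_cc(adj, u, order=2):
    """Fraction of (order-clique at u, adjacent edge) pairs that close."""
    wedges = closed = 0
    for others in combinations(sorted(adj[u]), order - 1):
        if all(b in adj[a] for a, b in combinations(others, 2)):
            for v in adj[u] - set(others):
                wedges += 1
                closed += all(v in adj[w] for w in others)
    return closed / wedges if wedges else None

def average_cc(adj, order=2):
    vals = [c for c in (local_cc(adj, u, order) for u in adj) if c is not None]
    return sum(vals) / len(vals) if vals else None

lattice = ring_lattice(16, 3)          # n = 16, k = 3, p = 0 as on the slide
rewired = rewire(lattice, 3, 0.3)      # p = 0.3 is an arbitrary illustration
print(average_cc(lattice, 2), average_cc(lattice, 3))
print(average_cc(rewired, 2), average_cc(rewired, 3))
```

At p = 0 every node has degree 2k and the same neighborhood shape, so the average and local coefficients coincide; sweeping p shows how each order responds to rewiring.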
We can also analyze higher-order clustering in
Gn,p.
Theorem [Yin-Benson-Leskovec 2017]
Everything scales exponentially in the order of the clustering coefficient...
Even if a node’s neighborhood is dense, i.e., C2(u) is large,
higher-order clustering still decays exponentially in Gn,p.
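A short counting argument shows where the exponential scaling comes from. It is a sketch based on the clique expansion definition, not the statement of the theorem cited on the slide, whose formula did not survive extraction. The symbols w_1, …, w_{ℓ−1} and v are introduced here for illustration.

```latex
% Fix an l-clique {u, w_1, ..., w_{l-1}} and an adjacent edge (u, v) in G_{n,p}.
% The pair closes into an (l+1)-clique iff all l-1 edges (v, w_i) are present.
% These edges are independent of the edges already in the pair, so
\begin{equation*}
\Pr\bigl[\text{pair closes}\bigr]
  = \prod_{i=1}^{\ell-1} \Pr\bigl[(v, w_i) \in E\bigr]
  = p^{\ell-1}.
\end{equation*}
```

For ℓ = 2 this recovers the familiar closure probability p for a triangle; each increase in order costs another factor of p.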
Extremal combinatorics show relationships
between clustering coefficients of different orders.
Theorem [Yin-Benson-Leskovec 18]
Neural connections (C. elegans)
297 nodes
2.15k edges
Facebook friendships (Stanford3)
11.6k nodes
568k edges
Coauthorships (arXiv ca-AstroPh)
18.8k nodes
198k edges
http://www.wormatlas.org/hermaphrodite/neuronalsupport/mainframe.htm
Global clustering patterns vary widely across datasets.
Dataset                 C2     C3     C4     Trend
Neural connections      0.18   0.08   0.06   decreases with order
Facebook friendships    0.16   0.11   0.12   decreases and increases
Coauthorships           0.32   0.33   0.36   increases with order
Not obviously due to cliques in coauthorship!
High-degree nodes in co-authorships exhibit
clique + star structure where C3(u) > C2(u).
Average higher-order clustering also varies widely.
Network / null model                  avg. C2   avg. C3
Neural connections                    0.31      0.14
  Random configurations               0.15      0.04
  Random configurations (C2 fixed)    0.31      0.17    statistically significantly less clustering
Facebook friendships                  0.25      0.18
  Random configurations               0.03      0.00
  Random configurations (C2 fixed)    0.25      0.14    statistically significantly more clustering
Coauthorships                         0.68      0.61
  Random configurations               0.01      0.00
  Random configurations (C2 fixed)    0.68      0.60    not significantly different clustering
(using sampling tools from [Bollobás 1980; Milo+ 03; Park-Newman 04; Colomer de Simón+ 13])
Local higher-order clustering gives a more nuanced
view.
[Figure, one panel per dataset: Neural connections, Facebook friendships, Coauthorships. Legend: actual network data; random configuration with C2 fixed. Reference curves: Gn,p baseline; upper bound. Annotations: dense but nearly random regions; dense and structured regions; hitting upper bound.]
• old idea ⟶ pretty much all real-world networks exhibit
clustering.
• new idea ⟶ networks may only cluster up to a certain order.
Part II. Closure coefficients.
Yin, Benson, and Leskovec.
The Local Closure Coefficient.
Submitted, 2018.
Yin, Ugander, and Benson.
Directed Closure Coefficients.
In preparation, 2018.
We typically measure clustering from the center of
a wedge, but we could just as well measure from
the head.
Clustering coefficient.
• A common friend provides a
friendship opportunity.
• C(u) = fraction of neighbor pairs
that are connected.
Closure coefficient.
• A friend of my friend provides a
friendship opportunity.
• H(u) = fraction of friends of friends
that are also friends (length-2 paths
headed at u that close into a triangle).
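To make the center-versus-head distinction concrete, here is a minimal sketch of a closure coefficient. It reads H(u) as the fraction of length-2 paths u–v–w starting at u (u is the head) for which w is also a neighbor of u; that reading, the function name, and the toy graph are assumptions made for illustration.

```python
def closure_coefficient(adj, u):
    """H(u): fraction of length-2 paths u-v-w (w != u) whose far end w
    is also adjacent to u. Returns None if u heads no length-2 path."""
    paths = closed = 0
    for v in adj[u]:          # a friend of u
        for w in adj[v]:      # a friend of that friend
            if w != u:
                paths += 1
                closed += w in adj[u]
    return closed / paths if paths else None

# Toy graph (made up for illustration): triangle 0-1-2 plus node 3 attached to 0.
adj = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0}}
print([closure_coefficient(adj, u) for u in sorted(adj)])
```

In this toy graph node 0 has clustering coefficient 1/3 (its neighbor 3 is connected to no other neighbor) but every friend of a friend of node 0 is already its friend, which illustrates why the two measures need not agree.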
There is no universal correlation between
clustering and closure.
Closure coefficients tend to increase with degree
while clustering coefficients tend to decrease with
degree.
Theorem [Yin-Benson-Leskovec]
Large closure coefficients imply existence of
communities.
Conductance is one of the most important cluster quality scores [Schaeffer 07]:
conductance(S) = (# edges leaving S) / (# edge end points in S)
Theorem [Yin-Benson-Leskovec]
where N(u) is the 1-hop neighborhood
of u.
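The conductance ratio on this slide, edges leaving S over edge end points in S, can be computed directly. This is a minimal sketch with a hypothetical function name and toy graph; note that many references divide by the smaller of the volumes of S and its complement, whereas the code follows the one-sided ratio as written on the slide.

```python
def conductance(adj, S):
    """(# edges leaving S) / (# edge end points in S), as on the slide.
    Returns None when S has no edge end points."""
    S = set(S)
    cut = sum(1 for u in S for v in adj[u] if v not in S)
    vol = sum(len(adj[u]) for u in S)
    return cut / vol if vol else None

# Toy graph (made up for illustration): two triangles joined by the edge 2-3.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
       3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
print(conductance(adj, {0, 1, 2}))
```

A low value means few edges leave the set relative to its total degree, which is the sense in which a set is a good community.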
Directed closure coefficients offer additional
insights.
• There are 8 analogous clustering coefficients, too.
• We’ll see that closure coefficients are more useful features than
clustering coefficients for some supervised prediction problems.
Closure coefficients help detect social hierarchy.
Davis and Leinhardt. The structure of
positive interpersonal relations in small
groups. Sociological Theories in Progress,
1971.
• Corporate law advice network [Lazega 01]
• Nodes are lawyers; ~50/50 associates/partners
• Edges represent advice
i → j if lawyer i goes to lawyer j for advice
To whom did you go for basic professional advice? For instance, you
want to make sure that you are handling a case right, making a proper
decision, and you want to consult someone whose professional opinions
are in general of great value to you. By advice I do not mean simply
technical advice.
• We used closure coefficients in a supervised learning setting to
predict seniority (if a lawyer is a partner).
Closure coefficients help detect social hierarchy.
Degree. 79%
Clustering. 64%
Degree + Clustering. 79%
Closure. 87%
Degree + Closure. 87%
• Lasso regression (L1-regularized linear regression).
• Features are in/out degree, clustering coeffs., closure coeffs.
Closure coefficients are good features for
classifying fish vs. non-fish in food webs.
Bascompte, Melián, and Sala. Interaction
strength combinations and the overfishing
of a marine food web. PNAS, 2005.
• Florida Bay food web [Ulanowicz+ 98]
• Nodes represent species
• Edges represent carbon exchange
i → j if j consumes i
• Same experiments as for lawyer network,
but for predicting if a node is a fish.
Closure coefficients help detect fish in food webs.
Degree. 63%
Clustering. 70%
Degree + Clustering. 74%
Closure. 88%
Degree + Closure. 88%
• Lasso regression (L1-regularized linear regression).
• Features are in/out degree, clustering coeffs., closure coeffs.
We should keep various cluster measures in mind
when mining and modeling network data.
1. Higher-order clustering coefficients and closure coefficients offer
additional measures of network clustering.
→ We should plug these features into ML pipelines for network data.
2. Only using triangles gives a misleading notion of clustering.
Some networks do not even exhibit clustering w/r/t larger cliques!
→ Are there models that capture higher-order clustering statistics?
3. Measuring clustering from the center of a wedge is also misleading.
Measuring from the head actually connects clustering to communities!
→ Are there models that capture closure statistics?
New perspectives on measuring network clustering.
Thanks for your attention!
Austin R. Benson
http://cs.cornell.edu/~arb
@austinbenson
arb@cs.cornell.edu
Yin, Benson, and Leskovec. Higher-order clustering in networks. Physical Review E, 2018.
→ github.com/arbenson/HigherOrderClustering.jl
Yin, Benson, and Leskovec. The Local Closure Coefficient. Submitted, 2018.
Yin, Ugander, and Benson. Directed Closure Coefficients. In preparation, 2018.

꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
 
Full night 🥵 Call Girls Delhi New Friends Colony {9711199171} Sanya Reddy ✌️o...
Full night 🥵 Call Girls Delhi New Friends Colony {9711199171} Sanya Reddy ✌️o...Full night 🥵 Call Girls Delhi New Friends Colony {9711199171} Sanya Reddy ✌️o...
Full night 🥵 Call Girls Delhi New Friends Colony {9711199171} Sanya Reddy ✌️o...
 
Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdf
 
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
Customer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxCustomer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptx
 
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
 
Data Science Jobs and Salaries Analysis.pptx
Data Science Jobs and Salaries Analysis.pptxData Science Jobs and Salaries Analysis.pptx
Data Science Jobs and Salaries Analysis.pptx
 
Russian Call Girls Dwarka Sector 15 💓 Delhi 9999965857 @Sabina Modi VVIP MODE...
Russian Call Girls Dwarka Sector 15 💓 Delhi 9999965857 @Sabina Modi VVIP MODE...Russian Call Girls Dwarka Sector 15 💓 Delhi 9999965857 @Sabina Modi VVIP MODE...
Russian Call Girls Dwarka Sector 15 💓 Delhi 9999965857 @Sabina Modi VVIP MODE...
 
04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships
 
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
 

New perspectives on measuring network clustering

  • 1. MINING AND MODELING NETWORK DATA SIAM DM'18 1 SIAM DM 2018 June 7, 2018 Denver, Colorado, USA Act 1. 09:30–09:55 New Perspectives on Measuring Network Clustering 10:00–10:25 Hypergraph Kronecker Models for Networks 10:30–10:55 Mitigating Overexposure in Viral Marketing 11:00–11:25 Modeling and Mining Dynamic Competition Networks Act 2. 2:45-3:10 Graph Matching Via Low Rank Factors 3:15-3:40 Tuning the Activity of Neural Networks at Criticality 3:45-4:10 Detectability of Hierarchical Community Structure in Preprocessed Multilayer Networks 4:15-4:40 Evaluating Overfit and Underfit in Models of Network Community Structure Davis and Leinhardt. The structure of positive interpersonal relations in small groups. Sociological Theories in Progress, 1971.
  • 2. New perspectives on measuring network clustering. Austin R. Benson · Cornell. Joint work with Hao Yin · Stanford, Jure Leskovec · Stanford, Johan Ugander · Stanford. slides ⟶ bit.ly/arb-DM-18 · code ⟶ github.com/arbenson/HigherOrderClustering.jl
  • 3. Many networks are globally sparse but locally dense. [Figures: coauthorship network; brain network (Sporns and Bullmore, Nature Rev. Neuro., 2012).] Networks for real-world systems have modules, clusters, communities. [Watts-Strogatz 98; Flake 00; Newman 04, 06; many others…]
  • 4. The clustering coefficient is a fundamental measure in network science of how much a network clusters. C(u) = fraction of length-2 paths centered at node u that form a triangle. Average clustering coefficient C = mean of C(u). • Data insights. Average clustering coefficient is larger than we would expect. [Watts-Strogatz 98] > 36k citations! • Domain phenomenon. Triadic closure in sociology. [Simmel 1908; Rapoport 53; Granovetter 73] • Statistical feature. Role discovery, anomaly detection, mental health study. [Henderson+ 12; La Fond+ 14, 16; Bearman-Moody 2004] • Modeling tool. Key property for generative models. [Newman 09; Seshadhri-Kolda-Pinar 12; Roble+ 16]
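The definition on this slide can be checked by hand on a small graph. The sketch below is only an illustration: the adjacency-dict representation, the function names, the toy graph, and the convention that a node with fewer than two neighbors gets C(u) = 0 are all assumptions, not taken from the talk or its code repository.

```python
from itertools import combinations

def local_clustering(adj, u):
    """Fraction of length-2 paths centered at u whose endpoints are adjacent."""
    nbrs = adj[u]
    if len(nbrs) < 2:
        return 0.0  # assumed convention: no wedges at u means C(u) = 0
    # Each unordered pair of neighbors is one length-2 path centered at u.
    closed = sum(1 for v, w in combinations(nbrs, 2) if w in adj[v])
    wedges = len(nbrs) * (len(nbrs) - 1) // 2
    return closed / wedges

def average_clustering(adj):
    """Mean of C(u) over all nodes."""
    return sum(local_clustering(adj, u) for u in adj) / len(adj)

# Hypothetical toy graph: triangle 0-1-2 plus a pendant node 3 attached to 0.
adj = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0}}
print(local_clustering(adj, 0), average_clustering(adj))
```

Node 0 has three neighbor pairs, of which only (1, 2) is connected, so its coefficient is 1/3, while nodes 1 and 2 sit entirely inside the triangle.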
  • 5. This talk introduces two new classes of network clustering measures with an eye towards data mining. 1. Higher-order clustering coefficients. The clustering coefficient measures the closure probability of just one simple structure: the triangle. We will show that triangles are sufficient to explain clustering in only some networks. We need larger cliques. There is evidence in the literature that this should be true… • 4-cliques reveal community structure in word association and PPI networks [Palla+ 05] • 4-/5-cliques (+ other structure) identify network type & dimension [Yaveroğlu+ 14, Bonato+ 14] • 4-node motifs identify community structure in neural systems [Benson-Gleich-Leskovec 16]
  • 6. This talk introduces two new classes of network clustering measures with an eye towards data mining. 2. Closure coefficients. Why do we measure clustering from the center node? The well-known proverb "a friend of my friend is my friend" suggests a different way of measuring clustering. We will show that • large closure coefficients theoretically imply local community structure • closure coefficients with directed edges expose hierarchy
  • 7. Part I. Higher-order clustering coefficients. Yin, Benson, and Leskovec. Higher-order clustering in networks. Physical Review E, 2018. github.com/arbenson/HigherOrderClustering.jl
  • 8. We view clustering as a clique expansion process. Classical case: 1. Find a 2-clique. 2. Attach an adjacent edge. 3. Check for a (2+1)-clique. C2 = avg. fraction of (2-clique, adjacent edge) pairs that induce a (2+1)-clique. Increase clique size by 1 to get a higher-order clustering coefficient! Third order: 1. Find a 3-clique. 2. Attach an adjacent edge. 3. Check for a (3+1)-clique. C3 = avg. fraction of (3-clique, adjacent edge) pairs that induce a (3+1)-clique. Fourth order: 1. Find a 4-clique. 2. Attach an adjacent edge. 3. Check for a (4+1)-clique. C4 = avg. fraction of (4-clique, adjacent edge) pairs that induce a (4+1)-clique.
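The find / attach / check steps can be sketched as a brute-force count at a single node. This is a minimal illustration, not the implementation in HigherOrderClustering.jl: it assumes the adjacent edge is attached at node u itself (so u is the center of the structure), returns 0 when no such pair exists, and uses a hypothetical toy graph. It enumerates cliques naively and is only suitable for tiny graphs.

```python
from itertools import combinations

def is_clique(adj, nodes):
    """True if every pair of the given nodes is adjacent."""
    return all(b in adj[a] for a, b in combinations(nodes, 2))

def local_higher_order_clustering(adj, u, order):
    """Fraction of (order-clique containing u, adjacent edge at u) pairs
    that induce an (order+1)-clique. order=2 gives the classical C(u)."""
    total = closed = 0
    # Step 1: every order-clique containing u is u plus (order - 1) neighbors.
    for rest in combinations(list(adj[u]), order - 1):
        if not is_clique(adj, (u,) + rest):
            continue
        # Step 2: attach an adjacent edge (u, w) with w outside the clique.
        for w in adj[u] - set(rest):
            total += 1
            # Step 3: the pair closes if w is adjacent to the whole clique.
            if all(w in adj[x] for x in rest):
                closed += 1
    return closed / total if total else 0.0  # assumed convention for no pairs

# Hypothetical toy graphs: triangle 0-1-2 with pendant node 3, and K4.
adj = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0}}
k4 = {u: {v for v in range(4) if v != u} for u in range(4)}
print(local_higher_order_clustering(adj, 0, 2))
print(local_higher_order_clustering(adj, 0, 3))
print(local_higher_order_clustering(k4, 0, 3))
```

On the first graph the only 3-clique at node 0 is the triangle, and its single adjacent edge (0, 3) does not close into a 4-clique, so the third-order value is 0 even though the second-order value is positive.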
  • 9. We can think of higher-order closure processes in everyday life. 1. Start with a group of 3 friends (Alice, Bob, Charlie). 2. One person in the group befriends someone new (Dave). 3. The group might increase in size. [Image credits: rollingstone.com, oprah.com]
  • 10. Higher-order clustering coefficients offer several advantages. Theory & analysis. • Small-world and Gn,p random graph models. • Extremal combinatorics for general graphs. Data insights. • old idea ⟶ pretty much all real-world networks exhibit clustering. • new idea ⟶ real-world networks may only cluster up to a certain order.
  • 11. Background. Local, average, and global clustering coefficients. Second-order (classical) local clustering coefficient at node u. Second-order (classical) global clustering coefficient. Second-order (classical) average clustering coefficient. [Formulas omitted in this transcript.]
  • 12. Higher-order (third-order) local, average, and global clustering coefficients. Third-order local clustering coefficient at node u. Third-order global clustering coefficient. Third-order average clustering coefficient. [Formulas omitted in this transcript.]
  • 13. We can analyze higher-order clustering with small-world models. • Start with n nodes and edges to 2k neighbors and then rewire each edge with probability p. [Figure: example with n = 16, k = 3, p = 0.] Theorem [Watts-Strogatz 98]. Theorem [Yin-Benson-Leskovec 18]. [Theorem statements omitted in this transcript.]
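The small-world construction described on this slide can be sketched as follows. The slide only says that each edge is rewired with probability p. The specific rewiring rule here (keep one endpoint, move the other to a uniformly random node, avoiding self-loops and duplicate edges) is the usual Watts-Strogatz convention and is an assumption, as are the function name and the requirement n > 2k.

```python
import random

def small_world(n, k, p, seed=None):
    """Ring of n nodes, each joined to its k nearest neighbors on each side
    (2k neighbors in total); each edge is then rewired with probability p.
    Assumes n > 2k so the starting ring lattice is a simple graph."""
    rng = random.Random(seed)
    adj = {u: set() for u in range(n)}
    for u in range(n):
        for step in range(1, k + 1):
            v = (u + step) % n
            adj[u].add(v)
            adj[v].add(u)
    for step in range(1, k + 1):
        for u in range(n):
            v = (u + step) % n
            if v not in adj[u] or rng.random() >= p:
                continue  # keep this lattice edge
            # New endpoints exclude u itself and u's current neighbors.
            candidates = [w for w in range(n) if w != u and w not in adj[u]]
            if not candidates:
                continue
            w = rng.choice(candidates)
            adj[u].discard(v)
            adj[v].discard(u)
            adj[u].add(w)
            adj[w].add(u)
    return adj

# The configuration shown on the slide: n = 16, k = 3, p = 0 (pure ring lattice).
lattice = small_world(16, 3, 0.0)
rewired = small_world(16, 3, 0.5, seed=0)
print(sorted(len(nbrs) for nbrs in lattice.values()))
```

With p = 0 nothing is rewired and every node keeps exactly 2k = 6 neighbors. Rewiring preserves the total number of edges, so any p gives a graph with n·k edges.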
  • 14. We can also analyze higher-order clustering in Gn,p. Theorem [Yin-Benson-Leskovec 2017]. [Theorem statement omitted in this transcript.] Everything scales exponentially in the order of the clustering coefficient... Even if a node's neighborhood is dense, i.e., C2(u) is large, higher-order clustering still decays exponentially in Gn,p.
  • 15. Extremal combinatorics show relationships between clustering coefficients of different orders. Theorem [Yin-Benson-Leskovec 18]. [Theorem statement omitted in this transcript.]
  • 16. Datasets. Neural connections (C. elegans): 297 nodes, 2.15k edges. Facebook friendships (Stanford3): 11.6k nodes, 568k edges. Coauthorships (arXiv ca-AstroPh): 18.8k nodes, 198k edges. [Image credit: http://www.wormatlas.org/hermaphrodite/neuronalsupport/mainframe.htm]
  • 17. Global clustering patterns vary widely across datasets. Global coefficients in increasing order (C2, C3, C4): Neural connections: 0.18, 0.08, 0.06 (decreases with order). Facebook friendships: 0.16, 0.11, 0.12 (decreases and increases). Coauthorships: 0.32, 0.33, 0.36 (increases with order). Not obviously due to cliques in coauthorship! High-degree nodes in coauthorships exhibit clique + star structure where C3(u) > C2(u).
  • 18. Average higher-order clustering also varies widely. Values are listed as (second-order average, third-order average). Neural connections: 0.31, 0.14; random configurations: 0.15, 0.04; random configurations (C2 fixed): 0.31, 0.17. Facebook friendships: 0.25, 0.18; random configurations: 0.03, 0.00; random configurations (C2 fixed): 0.25, 0.14. Coauthorships: 0.68, 0.61; random configurations: 0.01, 0.00; random configurations (C2 fixed): 0.68, 0.60. [Color legend on the slide: statistically significantly less clustering; statistically significantly more clustering; not significantly different clustering.] (Using sampling tools from [Bollobás 1980; Milo+ 03; Park-Newman 04; Colomer de Simón+ 13].)
  • 19. Local higher-order clustering gives a more nuanced view. [Figure: panels for neural connections, Facebook friendships, and coauthorships. Markers: actual network data; random configuration with C2 fixed. Reference curves: Gn,p baseline; upper bound. Annotations: dense but nearly random regions; dense and structured regions; hitting upper bound.]
  • 20. • old idea ⟶ pretty much all real-world networks exhibit clustering. • new idea ⟶ networks may only cluster up to a certain order.
  • 21. Part II. Closure coefficients. Yin, Benson, and Leskovec. The Local Closure Coefficient. Submitted, 2018. Yin, Ugander, and Benson. Directed Closure Coefficients. In preparation, 2018.
  • 22. We typically measure clustering from the center of a wedge, but we could just as well measure from the head. Clustering coefficient. • A common friend provides a friendship opportunity. • C(u) = fraction of neighbor pairs that are connected. Closure coefficient. • A friend of my friend provides a friendship opportunity. • H(u) = fraction of length-2 paths headed at u (u is an endpoint of the path) that are closed into a triangle.
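Measuring from the head of a wedge means counting length-2 paths that start at u rather than paths that pass through u. The sketch below illustrates that count. The adjacency-dict representation, the function name, the toy graph, and the convention H(u) = 0 when no such path exists are assumptions made for the example.

```python
def local_closure(adj, u):
    """Fraction of length-2 paths u - v - w that start at u (u is the head)
    and are closed by an edge between u and the far endpoint w."""
    wedges = closed = 0
    for v in adj[u]:          # v is a friend of u
        for w in adj[v]:      # w is a friend of that friend
            if w == u:
                continue
            wedges += 1
            if w in adj[u]:   # the friend of my friend is my friend
                closed += 1
    return closed / wedges if wedges else 0.0  # assumed convention

# Hypothetical toy graph: triangle 0-1-2 plus a pendant node 3 attached to 0.
adj = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0}}
for u in sorted(adj):
    print(u, local_closure(adj, u))
```

On this graph node 0 has closure coefficient 1 (both paths starting at 0 end at a neighbor of 0) although only one of its three neighbor pairs is connected, which shows how the two viewpoints can disagree at the same node.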
  • 23. There is no universal correlation between clustering and closure.
  • 24. Closure coefficients tend to increase with degree while clustering coefficients tend to decrease with degree. [Plots: coefficients versus degree.] Theorem [Yin-Benson-Leskovec]. [Theorem statement omitted in this transcript.]
  • 25. Large closure coefficients imply existence of communities. Conductance of a node set S = (edges leaving S) / (edge end points in S). Conductance is one of the most important cluster quality scores [Schaeffer 07]. Theorem [Yin-Benson-Leskovec], where N(u) is the 1-hop neighborhood of u. [Theorem statement omitted in this transcript.]
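The conductance ratio on this slide is easy to compute directly. The sketch below follows the ratio as written on the slide (edges leaving S over edge end points in S). Many references divide by the smaller of the two volumes of S and its complement, which is not done here. The function name, the toy graph, and the value returned for an empty volume are assumptions.

```python
def conductance(adj, S):
    """(edges leaving S) / (edge end points in S), as written on the slide."""
    S = set(S)
    cut = sum(1 for u in S for v in adj[u] if v not in S)
    vol = sum(len(adj[u]) for u in S)  # every edge end point that lies in S
    return cut / vol if vol else 0.0   # assumed convention for empty volume

# Hypothetical toy graph: triangle 0-1-2 plus a pendant node 3 attached to 0.
adj = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0}}
print(conductance(adj, {0, 1, 2}))
print(conductance(adj, {3}))
```

The triangle has one edge leaving it and seven edge end points inside it, so it scores 1/7, while the pendant node alone scores 1. Lower values indicate a better-separated set.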
  • 26. Directed closure coefficients offer additional insights. • There are 8 analogous clustering coefficients, too. • We'll see that closure coefficients are more useful features than clustering coefficients for some supervised prediction problems.
  • 27. Closure coefficients help detect social hierarchy. • Corporate law advice network [Lazega 01]. • Nodes are lawyers; ~50/50 associates/partners. • Edges represent advice: i → j if lawyer i goes to lawyer j for advice. Survey question: "To whom did you go for basic professional advice? For instance, you want to make sure that you are handling a case right, making a proper decision, and you want to consult someone whose professional opinions are in general of great value to you. By advice I do not mean simply technical advice." • We used closure coefficients in a supervised learning setting to predict seniority (if a lawyer is a partner). Davis and Leinhardt. The structure of positive interpersonal relations in small groups. Sociological Theories in Progress, 1971.
  • 28. Closure coefficients help detect social hierarchy. Results by feature set: Degree: 79%. Clustering: 64%. Degree + Clustering: 79%. Closure: 87%. Degree + Closure: 87%. • Lasso regression (L1-regularized linear regression). • Features are in/out degree, clustering coeffs., closure coeffs. [Plot x-axis: regularization level.]
  • 29. Closure coefficients are good features for classifying fish vs. non-fish in food webs. • Florida Bay food web [Ulanowicz+ 98]. • Nodes represent species. • Edges represent carbon exchange: i → j if j consumes i. • Same experiments as for the lawyer network, but for predicting if a node is a fish. Bascompte, Melián, and Sala. Interaction strength combinations and the overfishing of a marine food web. PNAS, 2005.
  • 30. Closure coefficients help detect fish in food webs. Results by feature set: Degree: 63%. Clustering: 70%. Degree + Clustering: 74%. Closure: 88%. Degree + Closure: 88%. • Lasso regression (L1-regularized linear regression). • Features are in/out degree, clustering coeffs., closure coeffs. [Plot x-axis: regularization level.]
  • 31. We should keep various clustering measures in mind when mining and modeling network data. 1. Higher-order clustering coefficients and closure coefficients offer additional measures of network clustering. → We should plug these features into ML pipelines for network data. 2. Only using triangles gives a misleading notion of clustering. Some networks do not even exhibit clustering with respect to larger cliques! → Are there models that capture higher-order clustering statistics? 3. Measuring clustering from the center of a wedge is also misleading. Measuring from the head actually connects clustering to communities! → Are there models that capture closure statistics?
  • 32. New perspectives on measuring network clustering. Thanks for your attention! Austin R. Benson · http://cs.cornell.edu/~arb · @austinbenson · arb@cs.cornell.edu. Yin, Benson, and Leskovec. Higher-order clustering in networks. Physical Review E, 2018. → github.com/arbenson/HigherOrderClustering.jl. Yin, Benson, and Leskovec. The Local Closure Coefficient. Submitted, 2018. Yin, Ugander, and Benson. Directed Closure Coefficients. In preparation, 2018.