Presentation slides for the paper 'Structural Patterns and Generative Models of Real-world Hypergraphs'. Published in KDD 2020 - ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
5. • Hypergraphs: not straightforward to analyze
o complex representation
o lack of tools
Structural Patterns and Generative Models of Real-world Hypergraphs (by Manh Tuan Do)
[Figure: two drawings over nodes 1-7. Annotation: only interactions at the level of nodes]
Motivation for a New Tool
Motivation Structural Patterns Generators Decomposition Conclusion
• Projection
o information loss
o no higher-order information
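The information loss from projection can be illustrated with a small sketch. It assumes that "projection" means replacing every hyperedge by pairwise edges among its nodes; the function name `project` and the toy hyperedges are hypothetical and do not come from the slides.

```python
from itertools import combinations

def project(hyperedges):
    """Return the set of pairwise edges obtained by linking all nodes
    that appear together in at least one hyperedge (assumed projection)."""
    edges = set()
    for hyperedge in hyperedges:
        for u, v in combinations(sorted(hyperedge), 2):
            edges.add((u, v))
    return edges

# Two different toy hypergraphs (hypothetical data).
one_group = [{1, 2, 3}]                  # a single three-way interaction
three_pairs = [{1, 2}, {2, 3}, {1, 3}]   # three separate pairwise interactions

print(sorted(project(one_group)))
print(project(one_group) == project(three_pairs))  # True
```

Both hypergraphs project to the same triangle, so the projected graph alone cannot tell a group interaction from three pairwise ones.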
10. • 13 datasets from 6 domains
◦ Email: recipient addresses of an email
◦ Drug components: classes or substances within a single drug, listed in the National Drug Code Directory
◦ Drug use: drugs used by a patient, reported to the Drug Abuse Warning Network, before an emergency visit
◦ Online tags: tags in a question in Stack Exchange forums
◦ Online threads: users answering a question in Stack Exchange forums
◦ Coauthorship: coauthors of a publication
Real-world Datasets
11. Structural Patterns
P1. Degree distribution: heavy-tailed
P2. Connected component: giant
P3. Clustering coefficient: high
P4. Effective diameter: small
P5. Singular value distribution: heavy-tailed
12. P1+P5. Heavy-tailed Distributions
[Figure: two panels, Degree and Singular values. Annotations: abundant low-degree nodes; a few high-degree nodes]
Degree and singular-value distributions are heavy-tailed
J. Leskovec, J. Kleinberg, and C. Faloutsos. 2005. Graphs over Time: Densification Laws, Shrinking Diameters and Possible Explanations. In KDD.
13. Statistical tests: to confirm heavy-tailed distributions
Lilliefors test (1)
• $H_0$: the distribution is exponential
• $H_1$: the distribution is not exponential
$H_0$ is rejected at the 2.5% significance level
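A Lilliefors-style test for exponentiality can be sketched with the standard library alone. This is a minimal illustration, not the procedure used for the slides: it treats the data as continuous, estimates the mean from the sample, and approximates the null distribution of the Kolmogorov-Smirnov statistic by Monte Carlo simulation. The function names, the number of simulations, and the Pareto toy sample are assumptions.

```python
import math
import random

def ks_stat_exponential(sample):
    """KS distance between the empirical CDF and an exponential CDF
    whose mean is estimated from the same sample."""
    xs = sorted(sample)
    n = len(xs)
    mean = sum(xs) / n
    d = 0.0
    for k, x in enumerate(xs, start=1):
        cdf = 1.0 - math.exp(-x / mean)
        d = max(d, k / n - cdf, cdf - (k - 1) / n)
    return d

def lilliefors_exponential(sample, n_sim=1000, seed=0):
    """Return (statistic, Monte Carlo p-value) for H0: sample is exponential.
    The null distribution does not depend on the scale, so Exp(1) is used."""
    rng = random.Random(seed)
    n = len(sample)
    observed = ks_stat_exponential(sample)
    exceed = 0
    for _ in range(n_sim):
        null_sample = [rng.expovariate(1.0) for _ in range(n)]
        if ks_stat_exponential(null_sample) >= observed:
            exceed += 1
    return observed, (exceed + 1) / (n_sim + 1)

# Hypothetical heavy-tailed sample: Pareto with shape 1.5.
data_rng = random.Random(1)
heavy_sample = [data_rng.paretovariate(1.5) for _ in range(200)]
stat, p_value = lilliefors_exponential(heavy_sample)
print(stat, p_value, p_value < 0.025)
```

A p-value below 0.025 rejects the exponential hypothesis at the 2.5% level named on the slide.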
Log-likelihood ratio (2)
$r = \log \frac{L_1}{L_0}$
• $L_1$: likelihood of a heavy-tailed distribution (power-law, log-normal)
• $L_0$: likelihood of the exponential distribution
If $r > 0$, the distribution is more likely to be heavy-tailed.
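The ratio $r$ can be computed by hand for one heavy-tailed candidate. The sketch below compares a continuous power law with an exponential, both fitted by maximum likelihood on values at or above the sample minimum. It is a simplification under stated assumptions: the powerlaw package cited on the slide also covers discrete data, the log-normal alternative, and the choice of the lower cutoff, none of which is reproduced here. The function name and the toy samples are hypothetical.

```python
import math
import random

def log_likelihood_ratio(sample):
    """Return r = log(L1 / L0), with L1 the likelihood of a continuous power law
    and L0 that of an exponential, both defined on x >= min(sample)."""
    n = len(sample)
    xmin = min(sample)
    sum_log = sum(math.log(x / xmin) for x in sample)
    sum_excess = sum(x - xmin for x in sample)

    # Power law: p(x) = (alpha - 1) / xmin * (x / xmin) ** (-alpha)
    alpha = 1.0 + n / sum_log
    log_l1 = n * math.log((alpha - 1.0) / xmin) - alpha * sum_log

    # Exponential: p(x) = lam * exp(-lam * (x - xmin))
    lam = n / sum_excess
    log_l0 = n * math.log(lam) - lam * sum_excess

    return log_l1 - log_l0

rng = random.Random(7)
pareto_sample = [rng.paretovariate(1.5) for _ in range(2000)]
exp_sample = [1.0 + rng.expovariate(1.0) for _ in range(2000)]

r_pareto = log_likelihood_ratio(pareto_sample)
r_exp = log_likelihood_ratio(exp_sample)
print(r_pareto > 0, r_exp > 0)
```

A positive value favors the power law and a negative value favors the exponential.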
P1+P5. Heavy-tailed Distributions
(1) Hubert W. Lilliefors. 1969. On the Kolmogorov-Smirnov test for the exponential distribution with mean unknown. Journal of the American Statistical Association 64 (1969), 387–389.
(2) Jeff Alstott, Ed Bullmore, and Dietmar Plenz. 2014. powerlaw: a Python package for analysis of heavy-tailed distributions. PLoS ONE 9, 1 (2014).
14. P2. Giant Connected Component
A large proportion of nodes are connected
[Figure: plot with axes "Connected components" and "Proportion of nodes"]
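The proportion of nodes in the largest connected component can be computed directly from the hyperedges. The sketch below uses a union-find structure and assumes that two nodes are connected whenever they share a hyperedge; the function name and the toy hyperedges are hypothetical.

```python
from collections import Counter

def largest_component_fraction(hyperedges):
    """Fraction of nodes that lie in the largest connected component."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for hyperedge in hyperedges:
        nodes = list(hyperedge)
        for node in nodes:
            find(node)  # register every node, including isolated ones
        for other in nodes[1:]:
            parent[find(other)] = find(nodes[0])

    if not parent:
        return 0.0
    sizes = Counter(find(node) for node in list(parent))
    return max(sizes.values()) / len(parent)

# Toy hypergraph: components {1, 2, 3, 4}, {5, 6}, and {7}.
toy = [{1, 2, 3}, {3, 4}, {5, 6}, {7}]
fraction = largest_component_fraction(toy)
print(fraction)  # 4 of the 7 nodes are in the largest component
```

A value close to 1 indicates a giant connected component; a small value indicates that none exists.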
J. Leskovec, J. Kleinberg, and C. Faloutsos. 2005. Graphs over Time: Densification Laws, Shrinking Diameters and Possible Explanations. In KDD.
15. P3. High Clustering Coefficient
Local clustering coefficient: $C_i = \frac{2 \times \text{number of triangles at } i}{\text{number of wedges at } i}$
Clustering coefficient: $C = \frac{1}{|V|} \sum_{i \in V} C_i$
High likelihood of having links between “friends of friends”
Wedge at $i$: open triangle
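The two clustering formulas can be sketched on a small graph. One assumption is needed to make the factor of 2 concrete: the sketch counts the wedges at a node of degree $k$ as $k(k-1)$, so that $C_i$ stays between 0 and 1 and agrees with the usual local clustering coefficient. Nodes with fewer than two neighbors get $C_i = 0$ by convention. The adjacency dictionary and function names are hypothetical.

```python
from itertools import combinations

def local_clustering(adj, i):
    """C_i = 2 * triangles at i / (k * (k - 1)), where k is the degree of i."""
    neighbors = adj[i]
    k = len(neighbors)
    if k < 2:
        return 0.0  # convention for nodes without any wedge
    triangles = sum(1 for u, v in combinations(neighbors, 2) if v in adj[u])
    return 2 * triangles / (k * (k - 1))

def clustering_coefficient(adj):
    """C = average of C_i over all nodes."""
    return sum(local_clustering(adj, i) for i in adj) / len(adj)

# Toy graph: a triangle 1-2-3 with a pendant node 4 attached to node 3.
adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
print(local_clustering(adj, 3))     # one triangle among three neighbor pairs
print(clustering_coefficient(adj))  # (1 + 1 + 1/3 + 0) / 4
```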
16. P4. Small Effective Diameter
[Figure annotation: 90% of pairs are within distance d = 8]
Most pairs of connected nodes: reachable within a small distance
https://web.stanford.edu/class/cs224w/handouts/02-gnp-smallworld.pdf
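The effective diameter on this slide can be sketched as the smallest distance $d$ within which 90% of connected pairs are reachable. The version below is a simplified integer variant under stated assumptions: it runs a breadth-first search from every node, ignores unreachable pairs, and does not interpolate between integer distances. The function names and the toy graph are hypothetical.

```python
from collections import deque

def bfs_distances(adj, source):
    """Shortest-path distances from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def effective_diameter(adj, fraction=0.9):
    """Smallest d such that at least `fraction` of connected pairs are within d."""
    counts = {}
    for source in adj:
        for target, d in bfs_distances(adj, source).items():
            if target != source:
                counts[d] = counts.get(d, 0) + 1
    total = sum(counts.values())
    if total == 0:
        return 0
    reached = 0
    for d in sorted(counts):
        reached += counts[d]
        if reached >= fraction * total:
            return d

# Toy graph: a path 1-2-3-4 plus a separate edge 5-6.
adj = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3}, 5: {6}, 6: {5}}
print(effective_diameter(adj))        # 3
print(effective_diameter(adj, 0.5))   # 1
```

All-pairs breadth-first search is quadratic in the worst case, so large datasets would call for sampling or an approximation.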
17. Structural Patterns: Intuition
Real-world graphs:
P1. Degree distribution: heavy-tailed
P2. Connected component: giant
P3. Clustering coefficient: high
P4. Effective diameter: small
P5. Singular value distribution: heavy-tailed
[Diagram: Hypergraph -> Decomposition -> Decomposed graphs (all decomposition levels), shown alongside a real-world graph]
J. Leskovec, J. Kleinberg, and C. Faloutsos. 2005. Graphs over Time: Densification Laws, Shrinking Diameters and Possible Explanations. In KDD.
20. Structural Patterns
Giant connected components vary among datasets
If there is a giant connected component, the dataset also shows:
• High clustering coefficient
• Small effective diameter
Proportion of nodes in the largest connected component: small numbers indicate the absence of a giant connected component