Presentation slides for the paper 'Structural Patterns and Generative Models of Real-world Hypergraphs', published at KDD 2020 (ACM SIGKDD International Conference on Knowledge Discovery and Data Mining).

- 1. Structural Patterns and Generative Models of Real-world Hypergraphs Manh Tuan Do, Se-eun Yoon, Bryan Hooi, Kijung Shin 26th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining KDD 2020 Contact: Manh Tuan Do (manh.it97@kaist.ac.kr)
- 2. Roadmap 1. Introduction << 2. Decomposition 3. Structural Patterns 4. Generators 5. Conclusions (slide footer: Structural Patterns and Generative Models of Real-world Hypergraphs, by Manh Tuan Do)
- 3. Hypergraphs: Coauthorship, Email, Tags. Common patterns. Underlying mechanisms. (slide section bar: Motivation, Decomposition, Structural Patterns, Generators, Conclusion)
- 4. Roadmap 1. Motivation 2. Decomposition << 3. Structural Patterns 4. Generators 5. Conclusions
- 5. Motivation for a New Tool
  - Hypergraphs: not straightforward to analyze
    - complex representation
    - lack of tools
  - Projection: only interactions at the level of nodes
    - information loss
    - no higher-order information
  - [Figure: an example hypergraph on nodes 1 to 7 and its projection]
- 6. Our Tool: Decomposition
  - Multi-level decomposition:
    - representation by pairwise graphs
    - leveraging existing tools and measurements
    - no information loss: the original hypergraph is reconstructible
  - [Figure: an example hypergraph on nodes 1 to 7 and its decomposed graphs]
- 7. Our Tool: Decomposition
  - Example hypergraph with hyperedges {1, 2, 3, 4}, {3, 4, 5}, {5, 6}, {1, 7}
  - Node level: 1, 2, 3, 4, 5, 6, 7
  - Edge level: {1, 2}, {1, 3}, {1, 4}, {2, 3}, {2, 4}, {3, 4}, {3, 5}, {4, 5}, {5, 6}, {1, 7}
  - Triangle level: {1, 2, 3}, {1, 2, 4}, {1, 3, 4}, {2, 3, 4}, {3, 4, 5}
  - 4-clique level: {1, 2, 3, 4}
- 8. Our Tool: Decomposition
  - Hypergraph with hyperedges {1, 2, 3, 4}, {3, 4, 5}, {5, 6}, {1, 7}; maximum hyperedge size: $N$
  - Decomposition: hypergraph to decomposed graphs (node level, edge level, ..., ($N$-1)-clique level)
  - Reconstruction: decomposed graphs back to the hypergraph
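The decomposition in the example above can be sketched in a few lines of Python. This is a minimal sketch, not the authors' implementation: it assumes that a k-level node is any k-subset of nodes contained in some hyperedge (which matches the node sets listed for the example) and that two k-level nodes are linked when their union fits inside a single hyperedge. The function name `decompose` is hypothetical.

```python
from itertools import combinations


def decompose(hyperedges, k):
    """Return (nodes, edges) of the k-level decomposed graph (assumed definition)."""
    # For each hyperedge, list the k-subsets it contains.
    per_edge = [
        [frozenset(c) for c in combinations(sorted(e), k)] for e in hyperedges
    ]
    # k-level nodes: every k-subset that appears in at least one hyperedge.
    nodes = {s for subsets in per_edge for s in subsets}
    # Two k-subsets of the same hyperedge have their union inside it,
    # so (under the stated assumption) they are linked.
    edges = set()
    for subsets in per_edge:
        for u, v in combinations(subsets, 2):
            edges.add(frozenset((u, v)))
    return nodes, edges


# The example hypergraph from the slides.
example = [{1, 2, 3, 4}, {3, 4, 5}, {5, 6}, {1, 7}]
for level in range(1, 5):
    level_nodes, level_edges = decompose(example, level)
    print(level, len(level_nodes), len(level_edges))
```

At level 1 this reduces to the ordinary projection of the hypergraph; the higher levels keep the group-level information that the projection loses.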
- 9. Roadmap 1. Motivation 2. Decomposition 3. Structural Patterns << 4. Generators 5. Conclusions
- 10. Real-world Datasets
  - 13 datasets from 6 domains
    - Email: recipient addresses of an email
    - Drug components: classes or substances within a single drug, listed in the National Drug Code Directory
    - Drug use: drugs used by a patient, reported to the Drug Abuse Warning Network, before an emergency visit
    - Online tags: tags in a question in Stack Exchange forums
    - Online threads: users answering a question in Stack Exchange forums
    - Coauthorship: coauthors of a publication
- 11. Structural Patterns
  - P1. Degree distribution: heavy-tailed
  - P2. Connected component: giant
  - P3. Clustering coefficient: high
  - P4. Effective diameter: small
  - P5. Singular value distribution: heavy-tailed
- 12. P1+P5. Heavy-tailed Distributions
  - Degree and singular-value distributions are heavy-tailed
  - Abundant low-degree nodes, a few high-degree nodes
  - [Figure: degree distribution and singular-value distribution]
  - J. Leskovec, J. Kleinberg, and C. Faloutsos. 2005. Graphs over Time: Densification Laws, Shrinking Diameters and Possible Explanations. In KDD.
- 13. P1+P5. Heavy-tailed Distributions
  - Statistical tests: to confirm heavy-tailed distributions
  - Lilliefors test (1)
    - $H_0$: the distribution is exponential
    - $H_1$: the distribution is not exponential
    - $H_0$ rejected at the 2.5% significance level
  - Log-likelihood ratio (2): $r = \log \frac{L_1}{L_0}$
    - $L_1$: likelihood of a heavy-tailed distribution (power-law, log-normal)
    - $L_0$: likelihood of the exponential distribution
    - If $r > 0$: the distribution is more likely to be heavy-tailed
  - (1) Hubert W. Lilliefors. 1969. On the Kolmogorov-Smirnov test for the exponential distribution with mean unknown. Journal of the American Statistical Association 64 (1969), 387–389.
  - (2) Jeff Alstott, Ed Bullmore, and Dietmar Plenz. 2014. powerlaw: a Python package for analysis of heavy-tailed distributions. PLoS ONE 9, 1 (2014).
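To make the log-likelihood ratio concrete, here is a minimal standard-library sketch. It is not the `powerlaw` package cited on the slide: it assumes a continuous power law and a shifted exponential, both fitted by maximum likelihood to the samples at or above a cutoff `xmin` chosen by the caller. The function name and the `xmin` parameter are assumptions made for illustration.

```python
import math


def log_likelihood_ratio(samples, xmin):
    """Return r = log(L1 / L0): continuous power law (L1) vs. shifted exponential (L0)."""
    tail = [x for x in samples if x >= xmin]
    n = len(tail)

    # Power law density: (alpha - 1) / xmin * (x / xmin) ** (-alpha), MLE for alpha.
    log_sum = sum(math.log(x / xmin) for x in tail)
    alpha = 1.0 + n / log_sum
    ll_power = n * math.log((alpha - 1.0) / xmin) - alpha * log_sum

    # Shifted exponential density: lam * exp(-lam * (x - xmin)), MLE for lam.
    shift_sum = sum(x - xmin for x in tail)
    lam = n / shift_sum
    ll_exp = n * math.log(lam) - lam * shift_sum

    return ll_power - ll_exp


# Deterministic quantile samples from a power law (alpha = 2.5) and an exponential.
grid = [(i + 0.5) / 1000 for i in range(1000)]
power_law_samples = [(1.0 - u) ** (-1.0 / 1.5) for u in grid]
exponential_samples = [1.0 - math.log(1.0 - u) for u in grid]
print(log_likelihood_ratio(power_law_samples, 1.0) > 0)    # heavy-tailed fit wins
print(log_likelihood_ratio(exponential_samples, 1.0) > 0)  # exponential fit wins
```

A positive `r` on its own does not say whether the difference is significant, so in practice the ratio is read together with a test such as the Lilliefors test above.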
- 14. P2. Giant Connected Component
  - A large proportion of nodes are connected
  - [Figure: proportion of nodes in each connected component]
  - J. Leskovec, J. Kleinberg, and C. Faloutsos. 2005. Graphs over Time: Densification Laws, Shrinking Diameters and Possible Explanations. In KDD.
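One simple way to check for a giant connected component is to measure the share of nodes in the largest component. The sketch below does this with a breadth-first search over an undirected edge list; the function name and the input format are assumptions for illustration, not taken from the slides.

```python
from collections import deque


def largest_component_fraction(nodes, edges):
    """Fraction of nodes that belong to the largest connected component."""
    adjacency = {v: set() for v in nodes}
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)

    seen, best = set(), 0
    for start in adjacency:
        if start in seen:
            continue
        # Breadth-first search from an unvisited node counts one component.
        seen.add(start)
        queue, size = deque([start]), 0
        while queue:
            node = queue.popleft()
            size += 1
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        best = max(best, size)
    return best / len(adjacency) if adjacency else 0.0


# Two of four nodes are linked, so the largest component holds half the nodes.
print(largest_component_fraction([1, 2, 3, 4], [(1, 2)]))
```

A value close to 1 indicates a giant connected component; a small value indicates that the graph is fragmented.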
- 15. P3. High Clustering Coefficient
  - Local clustering coefficient: $C_i = \frac{2 \cdot (\text{number of triangles at } i)}{\text{number of wedges at } i}$
  - Clustering coefficient: $C = \frac{1}{|V|} \sum_{i \in V} C_i$
  - Wedge at $i$: open triangle at $i$
  - High likelihood of having links between "friends of friends"
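The two formulas can be computed directly from an adjacency structure. The sketch below assumes that the wedge count at node $i$ is $d_i (d_i - 1)$, that is, ordered pairs of neighbors of $i$, which makes the slide's formula coincide with the standard local clustering coefficient; nodes with fewer than two neighbors are given $C_i = 0$. Both choices are assumptions, as is the function name.

```python
from itertools import combinations


def clustering_coefficient(adjacency):
    """Average local clustering coefficient.

    adjacency: dict mapping each node to the set of its neighbors
    (undirected graph, no self-loops).
    """
    total = 0.0
    for node, neighbors in adjacency.items():
        degree = len(neighbors)
        if degree < 2:
            continue  # no wedge at this node, so C_i is taken to be 0
        # A triangle at the node is a pair of its neighbors that are linked.
        triangles = sum(
            1 for u, v in combinations(neighbors, 2) if v in adjacency[u]
        )
        total += 2 * triangles / (degree * (degree - 1))
    return total / len(adjacency) if adjacency else 0.0


# A triangle 1-2-3 with a pendant node 4 attached to node 3.
graph = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
print(clustering_coefficient(graph))
```

In this example nodes 1 and 2 have $C_i = 1$, node 3 has $C_i = 1/3$, and node 4 has $C_i = 0$, so the average is $7/12$.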
- 16. P4. Small Effective Diameter
  - Most pairs of connected nodes: reachable within a small distance
  - [Figure: example in which 90% of pairs are reachable within d = 8]
  - https://web.stanford.edu/class/cs224w/handouts/02-gnp-smallworld.pdf
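A minimal sketch of the effective diameter follows, assuming the common definition as the smallest integer distance d within which 90% of connected pairs are reachable (the 90% threshold is the one annotated in the figure). Some definitions interpolate between integer distances; this sketch does not. The function name and the `fraction` parameter are assumptions.

```python
from collections import deque


def effective_diameter(adjacency, fraction=0.9):
    """Smallest d such that at least `fraction` of connected pairs are within d hops."""
    counts = {}  # distance -> number of ordered connected pairs at that distance
    for source in adjacency:
        # Breadth-first search gives shortest-path distances from the source.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbor in adjacency[node]:
                if neighbor not in dist:
                    dist[neighbor] = dist[node] + 1
                    queue.append(neighbor)
        for d in dist.values():
            if d > 0:
                counts[d] = counts.get(d, 0) + 1

    total = sum(counts.values())
    covered = 0
    for d in sorted(counts):
        covered += counts[d]
        if covered >= fraction * total:
            return d
    return 0  # no connected pairs at all


# A path on four nodes: the single pair at distance 3 is needed to reach 90%.
path = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
print(effective_diameter(path))
```

The all-pairs breadth-first search is quadratic in the number of nodes, so for large graphs the distances are usually estimated from a sample of source nodes.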
- 17. Structural Patterns: Intuition
  - Real-world graphs:
    - P1. Degree distribution: heavy-tailed
    - P2. Connected component: giant
    - P3. Clustering coefficient: high
    - P4. Effective diameter: small
    - P5. Singular value distribution: heavy-tailed
  - Intuition: the decomposed graphs of a hypergraph, at all decomposition levels, are expected to behave like real-world graphs
  - J. Leskovec, J. Kleinberg, and C. Faloutsos. 2005. Graphs over Time: Densification Laws, Shrinking Diameters and Possible Explanations. In KDD.
- 18. Structural Patterns
  - Degree distributions: heavy-tailed
  - [Figure panels: node level, edge level, triangle level, 4-clique level]
- 19. Structural Patterns
  - Singular-value distributions: heavy-tailed
  - [Figure panels: node level, edge level, triangle level, 4-clique level]
- 20. Structural Patterns
  - Giant connected components vary among datasets
  - If there is a giant connected component:
    - High clustering coefficient
    - Small effective diameter
  - [Table: proportion of nodes in the largest connected component; small numbers indicate the absence of a giant connected component]
- 21. Roadmap 1. Motivation 2. Decomposition 3. Structural Patterns 4. Generators << 5. Conclusions
- 22. Our Model: HyperPA
  - Main idea: "Subsets get rich together"
  - Existing nodes 1, 2, 3 with hyperedges {1,2} and {1,2,3}
  - Subset degrees: 1: 2, 2: 2, 3: 1, (1,2): 2, (1,3): 1, (2,3): 1, (1,2,3): 1
  - A new node: 4, with 2 hyperedges
    - 1st hyperedge: size 2, {2,4}
    - 2nd hyperedge: size 3, {1,3,4}
- 23. Our Model: HyperPA
  - Main idea: "Subsets get rich together"
  - Steps:
    - Introduce a new node
    - Add the newly formed hyperedges
    - Update subset degrees
    - Repeat the process
  - Hyperedges after adding node 4: {1,2}, {1,2,3}, {2,4}, {1,3,4}
  - Updated subset degrees: 1: 3, 2: 3, 3: 2, 4: 2, (1,2): 2, (1,3): 2, (2,3): 1, (1,4): 1, (2,4): 1, (3,4): 1, (1,2,3): 1, (1,3,4): 1
  - A new node: 5
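The steps above can be sketched as follows. This is a minimal illustration, not the authors' generator: the slides state only that groups of nodes are chosen according to their degrees, so the sampling rule used here (pick a group of size s - 1 with probability proportional to its subset degree, falling back to uniform random nodes when no such group exists) is an assumption. The helper names `add_hyperedge` and `hyperpa_step` are hypothetical, and the hyperedge sizes for each new node are passed in rather than sampled.

```python
import random
from collections import Counter
from itertools import combinations


def add_hyperedge(hyperedges, subset_degree, edge):
    """Record a hyperedge and increment the degree of each non-empty subset of it."""
    edge = tuple(sorted(edge))
    hyperedges.append(edge)
    for k in range(1, len(edge) + 1):
        for subset in combinations(edge, k):
            subset_degree[subset] += 1


def hyperpa_step(hyperedges, subset_degree, old_nodes, new_node, sizes, rng):
    """Attach new_node with one new hyperedge per entry of sizes (assumed rule)."""
    for size in sizes:
        # Groups of size - 1 existing nodes that already appear together.
        groups = [
            s for s in subset_degree
            if len(s) == size - 1 and new_node not in s
        ]
        if size == 1:
            group = ()
        elif groups:
            # "Subsets get rich together": weight each group by its degree.
            weights = [subset_degree[s] for s in groups]
            group = rng.choices(groups, weights=weights, k=1)[0]
        else:
            # Assumed fallback: no such group yet, so pick nodes uniformly.
            group = tuple(rng.sample(old_nodes, min(size - 1, len(old_nodes))))
        add_hyperedge(hyperedges, subset_degree, group + (new_node,))


# Reproduce the slide example, then let node 5 arrive with sizes 2 and 3.
hyperedges, subset_degree = [], Counter()
for edge in [(1, 2), (1, 2, 3), (2, 4), (1, 3, 4)]:
    add_hyperedge(hyperedges, subset_degree, edge)
print(subset_degree[(1,)], subset_degree[(1, 3)], subset_degree[(3, 4)])

hyperpa_step(hyperedges, subset_degree, [1, 2, 3, 4], 5, [2, 3], random.Random(0))
print(hyperedges[-2:])
```

Keeping one counter for subsets of every size means a single update per hyperedge maintains the degrees needed at all decomposition levels, at the cost of storing up to 2^s - 1 entries for a hyperedge of size s.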
- 24. Our Model: HyperPA
  - HyperPA: considers degrees of groups of nodes
  - [Figure panels: HyperPA vs. real data at the node level, edge level, triangle level, 4-clique level]
- 25. Baseline: NaivePA
  - NaivePA: considers node degrees individually
  - [Figure panels: NaivePA vs. real data at the node level, edge level, triangle level, 4-clique level]
- 26. Roadmap 1. Motivation 2. Decomposition 3. Structural Patterns 4. Generators 5. Conclusions <<
- 27. Conclusion
  - Our contributions in this work:
    - Decomposition Tool: convenient analysis of hypergraphs
    - Structural Patterns: 5 patterns across decomposition levels
    - HyperPA: generator reproducing the 5 structural patterns
  - P1. Degree distribution: heavy-tailed
  - P2. Connected component: giant
  - P3. Clustering coefficient: high
  - P4. Effective diameter: small
  - P5. Singular-value distribution: heavy-tailed
- 28. Structural Patterns and Generative Models of Real-world Hypergraphs Manh Tuan Do, Se-eun Yoon, Bryan Hooi, Kijung Shin 26th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining KDD 2020 Contact: Manh Tuan Do (manh.it97@kaist.ac.kr)