
Kdd'20 presentation 223

Presentation slides for the paper 'Structural Patterns and Generative Models of Real-world Hypergraphs', published at KDD 2020, the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.


  1. 1. Structural Patterns and Generative Models of Real-world Hypergraphs Manh Tuan Do, Se-eun Yoon, Bryan Hooi, Kijung Shin 26th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining KDD 2020 Contact: Manh Tuan Do (manh.it97@kaist.ac.kr)
  2. 2. Roadmap 1. Introduction << 2. Decomposition 3. Structural Patterns 4. Generators 5. Conclusions 2/28 Structural Patterns and Generative Models of Real-world Hypergraphs (by Manh Tuan Do)
  3. 3. Hypergraphs: Coauthorship, Email, Tags. Common patterns. Underlying mechanisms. Sections: Motivation, Decomposition, Structural Patterns, Generators, Conclusion
  4. 4. Roadmap 1. Motivation 2. Decomposition << 3. Structural Patterns 4. Generators 5. Conclusions
  5. 5. Motivation for a New Tool • Hypergraphs: not straightforward to analyze ◦ complex representation ◦ lack of tools • Projection ◦ information loss ◦ no high-order information ◦ only interactions at the level of nodes
  6. 6. Our Tool: Decomposition • Multi-level decomposition: ◦ representation by pairwise graphs ◦ leveraging existing tools & measurements ◦ no information loss: the original hypergraph is reconstructible
  7. 7. Our Tool: Decomposition. Example hypergraph on nodes 1 to 7 with hyperedges {1, 2, 3, 4}, {3, 4, 5}, {5, 6}, {1, 7}. [Figure: decomposed graphs. Node level: nodes 1, 2, 3, 4, 5, 6, 7. Edge level: nodes {1, 2}, {1, 3}, {1, 4}, {2, 3}, {2, 4}, {3, 4}, {3, 5}, {4, 5}, {5, 6}, {1, 7}. Triangle level: nodes {1, 2, 3}, {1, 2, 4}, {1, 3, 4}, {2, 3, 4}, {3, 4, 5}. 4clique level: node {1, 2, 3, 4}.]
  8. 8. Our Tool: Decomposition. Maximum hyperedge size: $N$. Decomposition maps the hypergraph to its decomposed graphs (Node-level, Edge-level, ..., ($N$-1)-clique level); Reconstruction maps the decomposed graphs back to the hypergraph. [Figure: the example hypergraph {1, 2, 3, 4}, {3, 4, 5}, {5, 6}, {1, 7} and its decomposed graphs.]
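The decomposition on slides 6 to 8 can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code. It assumes that the nodes of the k-level decomposed graph are the k-element subsets of hyperedges, and that two such subsets are adjacent when a single hyperedge contains both; the slides do not spell out the adjacency rule, so treat it as an assumption. The function name `decompose` is hypothetical.

```python
from itertools import combinations


def decompose(hyperedges, k):
    """Return (nodes, edges) of the k-level decomposed graph.

    Nodes are frozensets of k original nodes that appear together in
    some hyperedge. Assumed rule: two nodes are adjacent when one
    hyperedge contains both of them.
    """
    nodes = set()
    edges = set()
    for hyperedge in hyperedges:
        subsets = [frozenset(s) for s in combinations(sorted(hyperedge), k)]
        nodes.update(subsets)
        for u, v in combinations(subsets, 2):
            edges.add(frozenset((u, v)))
    return nodes, edges


# Example hypergraph from slide 7.
H = [{1, 2, 3, 4}, {3, 4, 5}, {5, 6}, {1, 7}]
for level in range(1, 5):
    level_nodes, level_edges = decompose(H, level)
    print(level, len(level_nodes), len(level_edges))
```

On the slide 7 example, the node counts per level (7, 10, 5, 1) agree with the nodes drawn in the figure.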
  9. 9. Roadmap 1. Motivation 2. Decomposition 3. Structural Patterns << 4. Generators 5. Conclusions
  10. 10. Real-world Datasets • 13 datasets from 6 domains ◦ Email: recipient addresses of an email ◦ Drug components: classes or substances within a single drug, listed in the National Drug Code Directory ◦ Drug use: drugs used by a patient, reported to the Drug Abuse Warning Network, before an emergency visit ◦ Online tags: tags in a question in Stack Exchange forums ◦ Online threads: users answering a question in Stack Exchange forums ◦ Coauthorship: coauthors of a publication
  11. 11. Structural Patterns P1. Degree distribution: heavy-tailed P2. Connected component: giant P3. Clustering coefficient: high P4. Effective diameter: small P5. Singular value distribution: heavy-tailed
  12. 12. P1+P5. Heavy-tailed Distributions. Degree and singular-value distributions are heavy-tailed: abundant low-degree nodes, a few high-degree nodes. [Figure panels: Degree, Singular values.] Reference: J. Leskovec, J. Kleinberg, C. Faloutsos. 2005. Graphs over Time: Densification Laws, Shrinking Diameters and Possible Explanations. In KDD.
  13. 13. P1+P5. Heavy-tailed Distributions. Statistical tests to confirm heavy-tailed distributions. Lilliefors test (1) • $H_0$: distribution is exponential • $H_1$: distribution is not exponential. $H_0$ rejected at 2.5% significance level. Log likelihood ratio (2): $r = \log \frac{L_1}{L_0}$ • $L_1$: likelihood of a heavy-tailed distribution (power-law, log normal) • $L_0$: likelihood of the exponential distribution. If $r > 0$, the distribution is more likely to be heavy-tailed. (1) Hubert W. Lilliefors. 1969. On the Kolmogorov-Smirnov test for the exponential distribution with mean unknown. Journal of the American Statistical Association 64 (1969), 387–389. (2) Jeff Alstott, Ed Bullmore, and Dietmar Plenz. 2014. powerlaw: a Python package for analysis of heavy-tailed distributions. PloS one 9, 1 (2014).
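To make the log likelihood ratio on slide 13 concrete, here is a rough pure-Python sketch. It is not the procedure of the cited powerlaw package or of the authors. It assumes continuous distributions, a fixed lower cutoff `xmin`, a power law as the heavy-tailed candidate, and closed-form maximum-likelihood fits; the function name is hypothetical.

```python
import math


def loglik_ratio_powerlaw_vs_exponential(xs, xmin=1.0):
    """Return r = log(L1 / L0) on the tail x >= xmin.

    L1: likelihood of a continuous power law fitted by maximum likelihood.
    L0: likelihood of an exponential shifted to start at xmin, also
    fitted by maximum likelihood. Both are simplifying assumptions.
    """
    tail = [x for x in xs if x >= xmin]
    n = len(tail)
    sum_log = sum(math.log(x / xmin) for x in tail)
    sum_excess = sum(x - xmin for x in tail)
    if n == 0 or sum_log == 0.0:
        raise ValueError("need tail values strictly above xmin")
    # Power law: p(x) = ((alpha - 1) / xmin) * (x / xmin) ** (-alpha)
    alpha = 1.0 + n / sum_log
    loglik_power = n * math.log((alpha - 1.0) / xmin) - alpha * sum_log
    # Exponential: p(x) = lam * exp(-lam * (x - xmin))
    lam = n / sum_excess
    loglik_exp = n * math.log(lam) - lam * sum_excess
    return loglik_power - loglik_exp


# A few small values and one very large value: r comes out positive.
degrees = [1, 1, 1, 1, 2, 2, 3, 5, 10, 100]
print(loglik_ratio_powerlaw_vs_exponential(degrees) > 0)
```

A positive `r` favors the heavy-tailed candidate, matching the reading on the slide. Degrees are integers, so the continuous fit here is only an approximation.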
  14. 14. P2. Giant Connected Component. A large proportion of nodes are connected. [Figure with axes "Connected components" and "Proportion of nodes".] Reference: J. Leskovec, J. Kleinberg, C. Faloutsos. 2005. Graphs over Time: Densification Laws, Shrinking Diameters and Possible Explanations. In KDD.
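The quantity behind P2, the proportion of nodes in the largest connected component, can be computed with a small union-find. This sketch works directly on a list of hyperedges (the node level); the function name is hypothetical and the code is an illustration, not the authors' implementation.

```python
from collections import Counter


def largest_component_fraction(hyperedges):
    """Fraction of nodes that lie in the largest connected component."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for hyperedge in hyperedges:
        members = list(hyperedge)
        if not members:
            continue
        root = find(members[0])
        for v in members[1:]:
            parent[find(v)] = root  # merge v's component into root's
    if not parent:
        return 0.0
    sizes = Counter(find(v) for v in parent)
    return max(sizes.values()) / len(parent)


# Slide 7 example: all seven nodes end up in one component.
print(largest_component_fraction([{1, 2, 3, 4}, {3, 4, 5}, {5, 6}, {1, 7}]))
```

Since an ordinary edge is a hyperedge of size two, the same function also accepts the edge list of any decomposed graph.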
  15. 15. P3. High Clustering Coefficient. High likelihood of having links between “friends of friends”. Local clustering coefficient: $C_i = \frac{2 \times \text{number of triangles at } i}{\text{number of wedges at } i}$. Clustering coefficient: $C = \frac{1}{|V|} \sum_{i \in V} C_i$. Wedge at $i$: open triangle at $i$.
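A minimal sketch of the clustering coefficient from slide 15, for an undirected simple graph given as an edge list. It assumes the wedge count in the denominator is $k_i (k_i - 1)$ for a node of degree $k_i$, which makes $C_i$ the standard local clustering coefficient; nodes of degree below 2 are assigned $C_i = 0$. Both choices are assumptions, and the function name is hypothetical.

```python
from itertools import combinations


def average_clustering(edges):
    """Average of the local clustering coefficients over all nodes."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    if not adj:
        return 0.0
    total = 0.0
    for neighbors in adj.values():
        k = len(neighbors)
        if k < 2:
            continue  # no wedge at this node, so C_i = 0
        # Triangles at i: pairs of neighbors that are themselves linked.
        triangles = sum(1 for a, b in combinations(neighbors, 2) if b in adj[a])
        total += 2.0 * triangles / (k * (k - 1))
    return total / len(adj)


# A triangle 1-2-3 with a pendant node 4 attached to node 3.
print(average_clustering([(1, 2), (2, 3), (1, 3), (3, 4)]))
```

In this example nodes 1 and 2 have $C_i = 1$, node 3 has $C_i = 1/3$, and node 4 has $C_i = 0$, so the average is 7/12.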
  16. 16. P4. Small Effective Diameter. Most pairs of connected nodes are reachable within a small distance. [Figure: d = 8, 90% of pairs.] Source: https://web.stanford.edu/class/cs224w/handouts/02-gnp-smallworld.pdf
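The effective diameter on slide 16 can be illustrated with breadth-first search. This sketch returns the smallest integer distance d such that at least 90% of connected node pairs are within d hops. Some definitions interpolate between integer distances; the integer version here is a simplifying assumption, and the function name is hypothetical.

```python
from collections import Counter, deque


def effective_diameter(edges, percent=90):
    """Smallest d such that at least `percent`% of connected pairs are within d hops."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    histogram = Counter()  # distance -> number of ordered reachable pairs
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            x = queue.popleft()
            for y in adj[x]:
                if y not in dist:
                    dist[y] = dist[x] + 1
                    histogram[dist[y]] += 1
                    queue.append(y)
    total = sum(histogram.values())
    covered = 0
    for d in sorted(histogram):
        covered += histogram[d]
        if covered * 100 >= percent * total:
            return d
    return 0  # no connected pairs


# Path graph 0-1-2-3-4: 9 of the 10 node pairs are within 3 hops.
print(effective_diameter([(0, 1), (1, 2), (2, 3), (3, 4)]))
```

Running one BFS per node costs O(|V| (|V| + |E|)), which is fine for small graphs; large graphs typically need sampling or approximation.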
  17. 17. Structural Patterns: Intuition. Real-world graphs: P1. Degree distribution: heavy-tailed P2. Connected component: giant P3. Clustering coefficient: high P4. Effective diameter: small P5. Singular value distribution: heavy-tailed. [Diagram: Hypergraph, Decomposition, Decomposed graphs (all decomposition levels), Real-world graph.] Reference: J. Leskovec, J. Kleinberg, C. Faloutsos. 2005. Graphs over Time: Densification Laws, Shrinking Diameters and Possible Explanations. In KDD.
  18. 18. Structural Patterns. Degree distributions: heavy-tailed. [Figure panels: Node level, Edge level, Triangle level, 4clique level.]
  19. 19. Structural Patterns. Singular-value distributions: heavy-tailed. [Figure panels: Node level, Edge level, Triangle level, 4clique level.]
  20. 20. Structural Patterns. Giant connected components vary among datasets. If there is a giant connected component: • High Clustering Coefficient • Small Effective Diameter. [Table: Proportion of nodes in the largest connected component. Small numbers indicate the absence of a giant connected component.]
  21. 21. Roadmap 1. Motivation 2. Decomposition 3. Structural Patterns 4. Generators << 5. Conclusions
  22. 22. Our Model: HyperPA. Main idea: “Subsets get rich together”. Current hypergraph on nodes 1, 2, 3 with hyperedges {1,2}, {1,2,3}. Subset degrees: 1: 2, 2: 2, 3: 1, (1,2): 2, (1,3): 1, (2,3): 1, (1,2,3): 1. A new node: 4, with 2 hyperedges. 1st hyperedge: size 2, {2,4}. 2nd hyperedge: size 3, {1,3,4}.
  23. 23. Our Model: HyperPA. Main idea: “Subsets get rich together”. Steps: Introduce a new node; Add the newly formed hyperedges; Update subset degrees; Repeat the process. Hyperedges: {1,2}, {1,2,3}, {2,4}, {1,3,4}. Subset degrees after adding node 4: 1: 3, 2: 3, 3: 2, 4: 2, (1,2): 2, (1,3): 2, (2,3): 1, (1,4): 1, (2,4): 1, (3,4): 1, (1,2,3): 1, (1,3,4): 1. A new node: 5.
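The HyperPA steps on slides 22 and 23 can be sketched as follows. This is a simplified reading of the slides, not the authors' implementation. It assumes that the sizes of the new hyperedges are supplied by the caller, that a hyperedge of size s is formed by picking an existing group of s-1 nodes with probability proportional to its degree, and that a uniformly random group is used as a fallback when no group of that size exists. The names `add_hyperedge` and `hyperpa_step` are hypothetical.

```python
import random
from collections import Counter
from itertools import combinations


def add_hyperedge(degrees, hyperedge):
    """Increase the degree of every non-empty subset of the hyperedge by one."""
    members = sorted(hyperedge)
    for size in range(1, len(members) + 1):
        for subset in combinations(members, size):
            degrees[subset] += 1


def hyperpa_step(degrees, new_node, sizes, rng):
    """Attach new_node through one hyperedge per requested size."""
    new_edges = []
    for s in sizes:
        candidates = [g for g in degrees if len(g) == s - 1]
        if s == 1:
            group = ()
        elif candidates:
            # "Subsets get rich together": choose proportionally to degree.
            weights = [degrees[g] for g in candidates]
            group = rng.choices(candidates, weights=weights, k=1)[0]
        else:
            # Assumed fallback: a uniformly random group of existing nodes.
            existing = sorted(g[0] for g in degrees if len(g) == 1)
            group = tuple(rng.sample(existing, min(s - 1, len(existing))))
        new_edges.append(frozenset(group) | {new_node})
    # Degrees are updated only after all new hyperedges are chosen.
    for edge in new_edges:
        add_hyperedge(degrees, edge)
    return new_edges


# Slide 22: hyperedges {1,2} and {1,2,3}, then node 4 arrives.
degrees = Counter()
add_hyperedge(degrees, {1, 2})
add_hyperedge(degrees, {1, 2, 3})
print(hyperpa_step(degrees, 4, sizes=[2, 3], rng=random.Random(0)))
```

Tracking every subset costs 2^s counters per hyperedge of size s, so this direct bookkeeping is only practical when hyperedges are small.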
  24. 24. Our Model: HyperPA. HyperPA: considers degrees of groups of nodes. [Figure: HyperPA vs. Real at the Node level, Edge level, Triangle level, 4clique level.]
  25. 25. Baseline: NaivePA. NaivePA: considers node degrees individually. [Figure: NaivePA vs. Real at the Node level, Edge level, Triangle level, 4clique level.]
  26. 26. Roadmap 1. Motivation 2. Decomposition 3. Structural Patterns 4. Generators 5. Conclusions <<
  27. 27. Conclusion • Our contributions in this work: ◦ Decomposition Tool: convenient analysis of hypergraphs ◦ Structural Patterns: 5 patterns across decomposition levels ◦ HyperPA: generator reproducing the 5 structural patterns. P1. Degree distribution: heavy-tailed P2. Connected component: giant P3. Clustering coefficient: high P4. Effective diameter: small P5. Singular-value distribution: heavy-tailed
  28. 28. Structural Patterns and Generative Models of Real-world Hypergraphs Manh Tuan Do, Se-eun Yoon, Bryan Hooi, Kijung Shin 26th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining KDD 2020 Contact: Manh Tuan Do (manh.it97@kaist.ac.kr)
