Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

"SSumM: Sparse Summarization of Massive Graphs", KDD 2020

A presentation slides of Kyuhan Lee, Hyeonsoo Jo, Jihoon Ko, Sungsu Lim, Kijung Shin, "SSumM: Sparse Summarization of Massive Graphs", KDD 2020.

Given a graph G and the desired size k in bits, how can we summarize G within k bits, while minimizing the information loss?
Large-scale graphs have become omnipresent, posing considerable computational challenges. Analyzing such large graphs can be fast and easy if they are compressed sufficiently to fit in main memory or even cache. Graph summarization, which yields a coarse-grained summary graph with merged nodes, stands out with several advantages among graph compression techniques. Thus, a number of algorithms have been developed for obtaining a concise summary graph with little information loss or equivalently small reconstruction error. However, the existing methods focus solely on reducing the number of nodes, and they often yield dense summary graphs, failing to achieve better compression rates. Moreover, due to their limited scalability, they can be applied only to moderate-size graphs.
In this work, we propose SSumM, a scalable and effective graph-summarization algorithm that yields a sparse summary graph. SSumM not only merges nodes together but also sparsifies the summary graph, and the two strategies are carefully balanced based on the minimum description length principle. Compared with state-of-the-art competitors, SSumM is (a) Concise: yields up to 11.2X smaller summary graphs with similar reconstruction error, (b) Accurate: achieves up to 4.2X smaller reconstruction error with similarly concise outputs, and (c) Scalable: summarizes 26X larger graphs while exhibiting linear scalability. We validate these advantages through extensive experiments on 10 real-world graphs.

  • Be the first to comment

  • Be the first to like this

"SSumM: Sparse Summarization of Massive Graphs", KDD 2020

  1. 1. SSumM : Sparse Summarization of Massive Graphs Kyuhan Lee* Hyeonsoo Jo* Jihoon Ko Sungsu Lim Kijung Shin
  2. 2. Graphs are Everywhere Citation networksSubway networks Internet topologies Introduction Algorithms Experiments ConclusionProblem Designed by Freepik from Flaticon
  3. 3. Massive Graphs Appeared Social networks 2.49 Billion active users Purchase histories 0.5 Billion products World Wide Web 5.49 Billion web pages Introduction Algorithms Experiments ConclusionProblem
  4. 4. Difficulties in Analyzing Massive graphs Computational cost (number of nodes & edges) Introduction Algorithms Experiments ConclusionProblem
  5. 5. Difficulties in Analyzing Massive graphs > Cannot fit Introduction Algorithms Experiments ConclusionProblem Input Graph
  6. 6. Solution: Graph Summarization > Introduction Algorithms Experiments ConclusionProblem Can fit Summary Graph
  7. 7. Advantages of Graph Summarization Introduction Algorithms Experiments ConclusionProblem โ€ข Many graph compression techniques are available โ€ข TheWebGraph Framework [BV04] โ€ข BFS encoding [AD09] โ€ข SlashBurn [KF11] โ€ข VoG [KKVF14] โ€ข Graph summarization stands out because โ€ข Elastic: reduce size of outputs as much as we want โ€ข Analyzable: existing graph analysis and tools can be applied โ€ข Combinable for Additional Compression: can be further compressed
  8. 8. Example of Graph Summarization 1 2 3 4 5 6 7 8 9 1 2 3 4 5 6 7 8 9 1 0 1 0 0 0 1 0 0 1 2 1 0 1 0 0 1 1 0 0 3 0 1 0 1 1 1 0 0 1 4 0 0 1 0 1 0 0 0 0 5 0 0 1 1 0 1 1 0 0 6 1 1 1 0 1 0 0 0 0 7 0 1 0 0 1 0 0 1 1 8 0 0 0 0 0 0 1 0 1 9 1 0 1 0 0 0 1 1 0 Adjacency Matrix 1 2 3 4 5 6 7 8 9 1 0 1 0 0 0 1 0 0 1 2 1 0 1 0 0 1 1 0 0 3 0 1 0 1 1 1 0 0 1 4 0 0 1 0 1 0 0 0 0 5 0 0 1 1 0 1 1 0 0 6 1 1 1 0 1 0 0 0 0 7 0 1 0 0 1 0 0 1 1 8 0 0 0 0 0 0 1 0 1 9 1 0 1 0 0 0 1 1 0 1 2 3 4 5 6 7 8 9 1 0 1 0 0 0 1 0 0 1 2 1 0 1 0 0 1 1 0 0 3 0 1 0 1 1 1 0 0 1 4 0 0 1 0 1 0 0 0 0 5 0 0 1 1 0 1 1 0 0 6 1 1 1 0 1 0 0 0 0 7 0 1 0 0 1 0 0 1 1 8 0 0 0 0 0 0 1 0 1 9 1 0 1 0 0 0 1 1 0 Introduction Algorithms Experiments ConclusionProblem Input Graph
  9. 9. Example of Graph Summarization 1 2 3 4 5 6 7 8 9 1 2 3 4 5 6 7 8 9 1 0 1 0 0 0 1 0 0 1 2 1 0 1 0 0 1 1 0 0 3 0 1 0 1 1 1 0 0 1 4 0 0 1 0 1 0 0 0 0 5 0 0 1 1 0 1 1 0 0 6 1 1 1 0 1 0 0 0 0 7 0 1 0 0 1 0 0 1 1 8 0 0 0 0 0 0 1 0 1 9 1 0 1 0 0 0 1 1 0 Adjacency Matrix Input Graph 1 2 3 4 5 6 7 8 9 1 0 1 0 0 0 1 0 0 1 2 1 0 1 0 0 1 1 0 0 3 0 1 0 1 1 1 0 0 1 4 0 0 1 0 1 0 0 0 0 5 0 0 1 1 0 1 1 0 0 6 1 1 1 0 1 0 0 0 0 7 0 1 0 0 1 0 0 1 1 8 0 0 0 0 0 0 1 0 1 9 1 0 1 0 0 0 1 1 0 1 2 3 4 5 6 7 8 9 1 0 1 0 0 0 1 0 0 1 2 1 0 1 0 0 1 1 0 0 3 0 1 0 1 1 1 0 0 1 4 0 0 1 0 1 0 0 0 0 5 0 0 1 1 0 1 1 0 0 6 1 1 1 0 1 0 0 0 0 7 0 1 0 0 1 0 0 1 1 8 0 0 0 0 0 0 1 0 1 9 1 0 1 0 0 0 1 1 0 ๐‘ฝ๐‘ฝ๐Ÿ๐Ÿ={1,2} ๐‘ฝ๐‘ฝ๐Ÿ•๐Ÿ•={7,8,9} ๐‘ฝ๐‘ฝ๐Ÿ‘๐Ÿ‘={3,4,5,6} Summary Graph Introduction Algorithms Experiments ConclusionProblem Subnode Subedge Supernode
  10. 10. Example of Graph Summarization 1 2 3 4 5 6 7 8 9 1 2 3 4 5 6 7 8 9 1 0 1 0 0 0 1 0 0 1 2 1 0 1 0 0 1 1 0 0 3 0 1 0 1 1 1 0 0 1 4 0 0 1 0 1 0 0 0 0 5 0 0 1 1 0 1 1 0 0 6 1 1 1 0 1 0 0 0 0 7 0 1 0 0 1 0 0 1 1 8 0 0 0 0 0 0 1 0 1 9 1 0 1 0 0 0 1 1 0 Adjacency Matrix Input Graph 1 2 3 4 5 6 7 8 9 1 0 1 0 0 0 1 0 0 1 2 1 0 1 0 0 1 1 0 0 3 0 1 0 1 1 1 0 0 1 4 0 0 1 0 1 0 0 0 0 5 0 0 1 1 0 1 1 0 0 6 1 1 1 0 1 0 0 0 0 7 0 1 0 0 1 0 0 1 1 8 0 0 0 0 0 0 1 0 1 9 1 0 1 0 0 0 1 1 0 1 2 3 4 5 6 7 8 9 1 0 1 0 0 0 1 0 0 1 2 1 0 1 0 0 1 1 0 0 3 0 1 0 1 1 1 0 0 1 4 0 0 1 0 1 0 0 0 0 5 0 0 1 1 0 1 1 0 0 6 1 1 1 0 1 0 0 0 0 7 0 1 0 0 1 0 0 1 1 8 0 0 0 0 0 0 1 0 1 9 1 0 1 0 0 0 1 1 0 ๐‘ฝ๐‘ฝ๐Ÿ๐Ÿ={1,2} ๐‘ฝ๐‘ฝ๐Ÿ•๐Ÿ•={7,8,9} ๐‘ฝ๐‘ฝ๐Ÿ‘๐Ÿ‘={3,4,5,6} Summary Graph ๐Ž๐Ž {๐‘ฝ๐‘ฝ๐Ÿ๐Ÿ, ๐‘ฝ๐‘ฝ๐Ÿ‘๐Ÿ‘} =3 Introduction Algorithms Experiments ConclusionProblem Subnode Subedge Superedge Supernode
  11. 11. Example of Graph Summarization 1 2 3 4 5 6 7 8 9 1 2 3 4 5 6 7 8 9 1 0 1 0 0 0 1 0 0 1 2 1 0 1 0 0 1 1 0 0 3 0 1 0 1 1 1 0 0 1 4 0 0 1 0 1 0 0 0 0 5 0 0 1 1 0 1 1 0 0 6 1 1 1 0 1 0 0 0 0 7 0 1 0 0 1 0 0 1 1 8 0 0 0 0 0 0 1 0 1 9 1 0 1 0 0 0 1 1 0 Adjacency Matrix Input Graph 1 2 3 4 5 6 7 8 9 1 0 1 0 0 0 1 0 0 1 2 1 0 1 0 0 1 1 0 0 3 0 1 0 1 1 1 0 0 1 4 0 0 1 0 1 0 0 0 0 5 0 0 1 1 0 1 1 0 0 6 1 1 1 0 1 0 0 0 0 7 0 1 0 0 1 0 0 1 1 8 0 0 0 0 0 0 1 0 1 9 1 0 1 0 0 0 1 1 0 1 2 3 4 5 6 7 8 9 1 0 1 0 0 0 1 0 0 1 2 1 0 1 0 0 1 1 0 0 3 0 1 0 1 1 1 0 0 1 4 0 0 1 0 1 0 0 0 0 5 0 0 1 1 0 1 1 0 0 6 1 1 1 0 1 0 0 0 0 7 0 1 0 0 1 0 0 1 1 8 0 0 0 0 0 0 1 0 1 9 1 0 1 0 0 0 1 1 0 ๐‘ฝ๐‘ฝ๐Ÿ๐Ÿ={1,2} ๐‘ฝ๐‘ฝ๐Ÿ•๐Ÿ•={7,8,9} ๐‘ฝ๐‘ฝ๐Ÿ‘๐Ÿ‘={3,4,5,6} Summary Graph ๐Ž๐Ž {๐‘ฝ๐‘ฝ๐Ÿ๐Ÿ, ๐‘ฝ๐‘ฝ๐Ÿ•๐Ÿ•} =2 ๐Ž๐Ž {๐‘ฝ๐‘ฝ๐Ÿ๐Ÿ, ๐‘ฝ๐‘ฝ๐Ÿ‘๐Ÿ‘} =3 ๐Ž๐Ž {๐‘ฝ๐‘ฝ๐Ÿ‘๐Ÿ‘, ๐‘ฝ๐‘ฝ๐Ÿ•๐Ÿ•} =2 ๐Ž๐Ž {๐‘ฝ๐‘ฝ๐Ÿ๐Ÿ, ๐‘ฝ๐‘ฝ๐Ÿ๐Ÿ} =1 ๐Ž๐Ž {๐‘ฝ๐‘ฝ๐Ÿ•๐Ÿ•, ๐‘ฝ๐‘ฝ๐Ÿ•๐Ÿ•} =3 ๐Ž๐Ž {๐‘ฝ๐‘ฝ๐Ÿ‘๐Ÿ‘, ๐‘ฝ๐‘ฝ๐Ÿ‘๐Ÿ‘} =5 Introduction Algorithms Experiments ConclusionProblem
  12. 12. Example of Graph Summarization 1 2 3 4 5 6 7 8 9 1 2 3 4 5 6 7 8 9 1 0 1 0 0 0 1 0 0 1 2 1 0 1 0 0 1 1 0 0 3 0 1 0 1 1 1 0 0 1 4 0 0 1 0 1 0 0 0 0 5 0 0 1 1 0 1 1 0 0 6 1 1 1 0 1 0 0 0 0 7 0 1 0 0 1 0 0 1 1 8 0 0 0 0 0 0 1 0 1 9 1 0 1 0 0 0 1 1 0 Adjacency Matrix Input Graph 1 2 3 4 5 6 7 8 9 1 0 1 0 0 0 1 0 0 1 2 1 0 1 0 0 1 1 0 0 3 0 1 0 1 1 1 0 0 1 4 0 0 1 0 1 0 0 0 0 5 0 0 1 1 0 1 1 0 0 6 1 1 1 0 1 0 0 0 0 7 0 1 0 0 1 0 0 1 1 8 0 0 0 0 0 0 1 0 1 9 1 0 1 0 0 0 1 1 0 1 2 3 4 5 6 7 8 9 1 0 1 0 0 0 1 0 0 1 2 1 0 1 0 0 1 1 0 0 3 0 1 0 1 1 1 0 0 1 4 0 0 1 0 1 0 0 0 0 5 0 0 1 1 0 1 1 0 0 6 1 1 1 0 1 0 0 0 0 7 0 1 0 0 1 0 0 1 1 8 0 0 0 0 0 0 1 0 1 9 1 0 1 0 0 0 1 1 0 ๐‘ฝ๐‘ฝ๐Ÿ๐Ÿ={1,2} ๐‘ฝ๐‘ฝ๐Ÿ•๐Ÿ•={7,8,9} ๐‘ฝ๐‘ฝ๐Ÿ‘๐Ÿ‘={3,4,5,6} Summary Graph ๐Ž๐Ž {๐‘ฝ๐‘ฝ๐Ÿ๐Ÿ, ๐‘ฝ๐‘ฝ๐Ÿ•๐Ÿ•} =2 ๐Ž๐Ž {๐‘ฝ๐‘ฝ๐Ÿ๐Ÿ, ๐‘ฝ๐‘ฝ๐Ÿ‘๐Ÿ‘} =3๐Ž๐Ž {๐‘ฝ๐‘ฝ๐Ÿ๐Ÿ, ๐‘ฝ๐‘ฝ๐Ÿ๐Ÿ} =1 ๐Ž๐Ž {๐‘ฝ๐‘ฝ๐Ÿ•๐Ÿ•, ๐‘ฝ๐‘ฝ๐Ÿ•๐Ÿ•} =3 ๐Ž๐Ž {๐‘ฝ๐‘ฝ๐Ÿ‘๐Ÿ‘, ๐‘ฝ๐‘ฝ๐Ÿ‘๐Ÿ‘} =5 Introduction Algorithms Experiments ConclusionProblem
  13. 13. Example of Graph Summarization Introduction Algorithms Experiments ConclusionProblem 1 2 3 4 5 6 7 8 9 1 0 1 3/8 3/8 3/8 3/8 1/3 1/3 1/3 2 1 0 3/8 3/8 3/8 3/8 1/3 1/3 1/3 3 3/8 3/8 0 5/6 5/6 5/6 0 0 0 4 3/8 3/8 5/6 0 5/6 5/6 0 0 0 5 3/8 3/8 5/6 5/6 0 5/6 0 0 0 6 3/8 3/8 5/6 5/6 5/6 0 0 0 0 7 1/3 1/3 0 0 0 0 0 1 1 8 1/3 1/3 0 0 0 0 1 0 1 9 1/3 1/3 0 0 0 0 1 1 0 Reconstructed Adjacency Matrix ๐‘ฝ๐‘ฝ๐Ÿ๐Ÿ={1,2} ๐‘ฝ๐‘ฝ๐Ÿ•๐Ÿ•={7,8,9} ๐‘ฝ๐‘ฝ๐Ÿ‘๐Ÿ‘={3,4,5,6} Summary Graph ๐Ž๐Ž {๐‘ฝ๐‘ฝ๐Ÿ๐Ÿ, ๐‘ฝ๐‘ฝ๐Ÿ•๐Ÿ•} =2 ๐Ž๐Ž {๐‘ฝ๐‘ฝ๐Ÿ๐Ÿ, ๐‘ฝ๐‘ฝ๐Ÿ‘๐Ÿ‘} =3๐Ž๐Ž {๐‘ฝ๐‘ฝ๐Ÿ๐Ÿ, ๐‘ฝ๐‘ฝ๐Ÿ๐Ÿ} =1 ๐Ž๐Ž {๐‘ฝ๐‘ฝ๐Ÿ•๐Ÿ•, ๐‘ฝ๐‘ฝ๐Ÿ•๐Ÿ•} =3 ๐Ž๐Ž {๐‘ฝ๐‘ฝ๐Ÿ‘๐Ÿ‘, ๐‘ฝ๐‘ฝ๐Ÿ‘๐Ÿ‘} =5
  14. 14. Example of Graph Summarization ๐‘ฝ๐‘ฝ๐Ÿ๐Ÿ={1,2} ๐‘ฝ๐‘ฝ๐Ÿ•๐Ÿ•={7,8,9} ๐‘ฝ๐‘ฝ๐Ÿ‘๐Ÿ‘={3,4,5,6} Summary Graph ๐Ž๐Ž {๐‘ฝ๐‘ฝ๐Ÿ๐Ÿ, ๐‘ฝ๐‘ฝ๐Ÿ•๐Ÿ•} =2 ๐Ž๐Ž {๐‘ฝ๐‘ฝ๐Ÿ๐Ÿ, ๐‘ฝ๐‘ฝ๐Ÿ‘๐Ÿ‘} =3๐Ž๐Ž {๐‘ฝ๐‘ฝ๐Ÿ๐Ÿ, ๐‘ฝ๐‘ฝ๐Ÿ๐Ÿ} =1 ๐Ž๐Ž {๐‘ฝ๐‘ฝ๐Ÿ•๐Ÿ•, ๐‘ฝ๐‘ฝ๐Ÿ•๐Ÿ•} =3 ๐Ž๐Ž {๐‘ฝ๐‘ฝ๐Ÿ‘๐Ÿ‘, ๐‘ฝ๐‘ฝ๐Ÿ‘๐Ÿ‘} =5 Introduction Algorithms Experiments ConclusionProblem 1 2 3 4 5 6 7 8 9 1 0 1 3/8 3/8 3/8 3/8 1/3 1/3 1/3 2 1 0 3/8 3/8 3/8 3/8 1/3 1/3 1/3 3 3/8 3/8 0 5/6 5/6 5/6 0 0 0 4 3/8 3/8 5/6 0 5/6 5/6 0 0 0 5 3/8 3/8 5/6 5/6 0 5/6 0 0 0 6 3/8 3/8 5/6 5/6 5/6 0 0 0 0 7 1/3 1/3 0 0 0 0 0 1 1 8 1/3 1/3 0 0 0 0 1 0 1 9 1/3 1/3 0 0 0 0 1 1 0 Reconstructed Adjacency Matrix ๐Ž๐Ž {๐‘ฝ๐‘ฝ๐Ÿ๐Ÿ, ๐‘ฝ๐‘ฝ๐Ÿ‘๐Ÿ‘} = 3 ๐‘€๐‘€๐‘€๐‘€๐‘€๐‘€๐‘€๐‘€ ๐‘€๐‘€๐‘€๐‘€๐‘€๐‘€ ๐‘›๐‘›๐‘›๐‘›๐‘›๐‘›๐‘›๐‘›๐‘›๐‘›๐‘›๐‘› ๐‘œ๐‘œ๐‘œ๐‘œ ๐‘ ๐‘ ๐‘ ๐‘ ๐‘ ๐‘ ๐‘ ๐‘ ๐‘ ๐‘ ๐‘ ๐‘ ๐‘ ๐‘ ๐‘ ๐‘  = 8
  15. 15. Road Map โ€ข Introduction โ€ข Problem << โ€ข Proposed Algorithm: SSumM โ€ข Experimental Results โ€ข Conclusions
  16. 16. Problem Definition: Graph Summarization ๐‘ฎ๐‘ฎ๐‘ฎ๐‘ฎ๐‘ฎ๐‘ฎ๐‘ฎ๐‘ฎ๐‘ฎ๐‘ฎ: a graph ๐‘ฎ๐‘ฎ and the target number of node ๐‘ฒ๐‘ฒ ๐‘ญ๐‘ญ๐‘ญ๐‘ญ๐‘ญ๐‘ญ๐‘ญ๐‘ญ: a summary graph ๏ฟฝ๐‘ฎ๐‘ฎ ๐‘ป๐‘ป๐‘ป๐‘ป ๐‘ด๐‘ด๐‘ด๐‘ด๐‘ด๐‘ด๐‘ด๐‘ด๐‘ด๐‘ด๐‘ด๐‘ด๐‘ด๐‘ด๐‘ด๐‘ด: the difference between graph ๐‘ฎ๐‘ฎand the restored graph ๏ฟฝ๐‘ฎ๐‘ฎ ๐‘บ๐‘บ๐‘บ๐‘บ๐‘บ๐‘บ๐‘บ๐‘บ๐‘บ๐‘บ๐‘บ๐‘บ๐‘บ๐‘บ ๐’•๐’•๐’•๐’•: the number of supernodes in ๏ฟฝ๐‘ฎ๐‘ฎ โ‰ค ๐‘ฒ๐‘ฒ Introduction Algorithms Experiments ConclusionProblem
  17. 17. Problem Definition: Graph Summarization ๐‘ฎ๐‘ฎ๐‘ฎ๐‘ฎ๐‘ฎ๐‘ฎ๐‘ฎ๐‘ฎ๐‘ฎ๐‘ฎ: a graph ๐‘ฎ๐‘ฎ and the target number of node ๐‘ฒ๐‘ฒ ๐‘ญ๐‘ญ๐‘ญ๐‘ญ๐‘ญ๐‘ญ๐‘ญ๐‘ญ: a summary graph ๏ฟฝ๐‘ฎ๐‘ฎ ๐‘ป๐‘ป๐‘ป๐‘ป ๐‘ด๐‘ด๐‘ด๐‘ด๐‘ด๐‘ด๐‘ด๐‘ด๐‘ด๐‘ด๐‘ด๐‘ด๐‘ด๐‘ด๐‘ด๐‘ด: the difference between graph ๐‘ฎ๐‘ฎand the restored graph ๏ฟฝ๐‘ฎ๐‘ฎ ๐‘บ๐‘บ๐‘บ๐‘บ๐‘บ๐‘บ๐‘บ๐‘บ๐‘บ๐‘บ๐‘บ๐‘บ๐‘บ๐‘บ ๐’•๐’•๐’•๐’•: the number of supernodes in ๏ฟฝ๐‘ฎ๐‘ฎ โ‰ค ๐‘ฒ๐‘ฒ Shouldnโ€™t we consider sizes? Introduction Algorithms Experiments ConclusionProblem
  18. 18. Problem Definition: Graph Summarization ๐‘ฎ๐‘ฎ๐‘ฎ๐‘ฎ๐‘ฎ๐‘ฎ๐‘ฎ๐‘ฎ๐‘ฎ๐‘ฎ: a graph ๐‘ฎ๐‘ฎ and the desired size ๐‘ฒ๐‘ฒ (in bits) ๐‘ญ๐‘ญ๐‘ญ๐‘ญ๐‘ญ๐‘ญ๐‘ญ๐‘ญ: a summary graph ๏ฟฝ๐‘ฎ๐‘ฎ ๐‘ป๐‘ป๐‘ป๐‘ป ๐‘ด๐‘ด๐‘ด๐‘ด๐‘ด๐‘ด๐‘ด๐‘ด๐‘ด๐‘ด๐‘ด๐‘ด๐‘ด๐‘ด๐‘ด๐‘ด: the difference with graph graph ๐‘ฎ๐‘ฎand the restored graph ๏ฟฝ๐‘ฎ๐‘ฎ ๐‘บ๐‘บ๐‘บ๐‘บ๐‘บ๐‘บ๐‘บ๐‘บ๐‘บ๐‘บ๐‘บ๐‘บ๐‘บ๐‘บ ๐’•๐’•๐’•๐’•: size of ๏ฟฝ๐‘ฎ๐‘ฎ in bits โ‰ค ๐‘ฒ๐‘ฒ Introduction Algorithms Experiments ConclusionProblem
  19. 19. Details: Size in Bits of a Graph ๐ธ๐ธ : set of edges ๐‘‰๐‘‰ : set of nodes Encoded using log2|๐‘‰๐‘‰| bits Introduction Algorithms Experiments ConclusionProblem ๐‘บ๐‘บ๐‘บ๐‘บ๐‘บ๐‘บ๐‘บ๐‘บ ๐’๐’๐’๐’ ๐’ˆ๐’ˆ๐’ˆ๐’ˆ๐’ˆ๐’ˆ๐’ˆ๐’ˆ๐’ˆ๐’ˆ: 2 ๐ธ๐ธ log2 ๐‘‰๐‘‰ Input graph ๐‘ฎ๐‘ฎ
  20. 20. Details: Size in Bits of a Summary Graph 4 5 1 4 1 5 Introduction Algorithms Experiments ConclusionProblem ๐‘บ๐‘บ๐‘บ๐‘บ๐‘บ๐‘บ๐‘บ๐‘บ ๐’๐’๐’๐’ ๐’”๐’”๐’”๐’”๐’”๐’”๐’”๐’”๐’”๐’”๐’”๐’”๐’”๐’” ๐’ˆ๐’ˆ๐’ˆ๐’ˆ๐’ˆ๐’ˆ๐’ˆ๐’ˆ๐’ˆ๐’ˆ: ๐‘ƒ๐‘ƒ 2 log2 ๐‘†๐‘† + log2 ๐œ”๐œ”๐‘š๐‘š๐‘š๐‘š๐‘š๐‘š + ๐‘‰๐‘‰ log2 ๐‘†๐‘† ๐‘†๐‘† : set of supernodes ๐‘ƒ๐‘ƒ : set of superedges ๐‘ค๐‘ค๐‘š๐‘š๐‘š๐‘š๐‘š๐‘š : maximum superedge weight Summary graph ๏ฟฝ๐‘ฎ๐‘ฎ
  21. 21. Details: Size in Bits of a Summary Graph 4 5 1 4 1 5 Introduction Algorithms Experiments ConclusionProblem ๐‘บ๐‘บ๐‘บ๐‘บ๐‘บ๐‘บ๐‘บ๐‘บ ๐’๐’๐’๐’ ๐’”๐’”๐’”๐’”๐’”๐’”๐’”๐’”๐’”๐’”๐’”๐’”๐’”๐’” ๐’ˆ๐’ˆ๐’ˆ๐’ˆ๐’ˆ๐’ˆ๐’ˆ๐’ˆ๐’ˆ๐’ˆ: ๐‘ƒ๐‘ƒ 2 log2 ๐‘†๐‘† + log2 ๐œ”๐œ”๐‘š๐‘š๐‘š๐‘š๐‘š๐‘š + ๐‘‰๐‘‰ log2 ๐‘†๐‘† ๐‘†๐‘† : set of supernodes ๐‘ƒ๐‘ƒ : set of superedges ๐‘ค๐‘ค๐‘š๐‘š๐‘š๐‘š๐‘š๐‘š : maximum superedge weight Summary graph ๏ฟฝ๐‘ฎ๐‘ฎ
  22. 22. Details: Size in Bits of a Summary Graph 4 5 1 4 1 5 Introduction Algorithms Experiments ConclusionProblem ๐‘บ๐‘บ๐‘บ๐‘บ๐‘บ๐‘บ๐‘บ๐‘บ ๐’๐’๐’๐’ ๐’”๐’”๐’”๐’”๐’”๐’”๐’”๐’”๐’”๐’”๐’”๐’”๐’”๐’” ๐’ˆ๐’ˆ๐’ˆ๐’ˆ๐’ˆ๐’ˆ๐’ˆ๐’ˆ๐’ˆ๐’ˆ: ๐‘ƒ๐‘ƒ 2 log2 ๐‘†๐‘† + log2 ๐œ”๐œ”๐‘š๐‘š๐‘š๐‘š๐‘š๐‘š + ๐‘‰๐‘‰ log2 ๐‘†๐‘† ๐‘†๐‘† : set of supernodes ๐‘ƒ๐‘ƒ : set of superedges ๐‘ค๐‘ค๐‘š๐‘š๐‘š๐‘š๐‘š๐‘š : maximum superedge weight Summary graph ๏ฟฝ๐‘ฎ๐‘ฎ
  23. 23. Details: Size in Bits of a Summary Graph 4 5 1 4 1 5 Introduction Algorithms Experiments ConclusionProblem ๐‘บ๐‘บ๐‘บ๐‘บ๐‘บ๐‘บ๐‘บ๐‘บ ๐’๐’๐’๐’ ๐’”๐’”๐’”๐’”๐’”๐’”๐’”๐’”๐’”๐’”๐’”๐’”๐’”๐’” ๐’ˆ๐’ˆ๐’ˆ๐’ˆ๐’ˆ๐’ˆ๐’ˆ๐’ˆ๐’ˆ๐’ˆ: ๐‘ƒ๐‘ƒ 2 log2 ๐‘†๐‘† + log2 ๐œ”๐œ”๐‘š๐‘š๐‘š๐‘š๐‘š๐‘š + ๐‘‰๐‘‰ log2 ๐‘†๐‘† ๐‘†๐‘† : set of supernodes ๐‘ƒ๐‘ƒ : set of superedges ๐‘ค๐‘ค๐‘š๐‘š๐‘š๐‘š๐‘š๐‘š : maximum superedge weight Summary graph ๏ฟฝ๐‘ฎ๐‘ฎ
  24. 24. Details: Error Measurement Introduction Algorithms Experiments ConclusionProblem 1 2 3 4 5 6 7 8 9 1 0 1 3/8 3/8 3/8 3/8 1/3 1/3 1/3 2 1 0 3/8 3/8 3/8 3/8 1/3 1/3 1/3 3 3/8 3/8 0 5/6 5/6 5/6 0 0 0 4 3/8 3/8 5/6 0 5/6 5/6 0 0 0 5 3/8 3/8 5/6 5/6 0 5/6 0 0 0 6 3/8 3/8 5/6 5/6 5/6 0 0 0 0 7 1/3 1/3 0 0 0 0 0 1 1 8 1/3 1/3 0 0 0 0 1 0 1 9 1/3 1/3 0 0 0 0 1 1 0 1 2 3 4 5 6 7 8 9 1 0 1 0 0 0 1 0 0 1 2 1 0 1 0 0 1 1 0 0 3 0 1 0 1 1 1 0 0 1 4 0 0 1 0 1 0 0 0 0 5 0 0 1 1 0 1 1 0 0 6 1 1 1 0 1 0 0 0 0 7 0 1 0 0 1 0 0 1 1 8 0 0 0 0 0 0 1 0 1 9 1 0 1 0 0 0 1 1 0 Reconstructed Adjacency Matrix ๏ฟฝ๐‘จ๐‘จReconstructed Adjacency Matrix ๐‘จ๐‘จ ๐‘…๐‘…๐ธ๐ธ๐‘๐‘(๐‘จ๐‘จ, ๏ฟฝ๐‘จ๐‘จ) = ๏ฟฝ ๐‘–๐‘–=1 ๐‘‰๐‘‰ ๏ฟฝ ๐‘—๐‘—=1 ๐‘‰๐‘‰ ๐ด๐ด ๐‘–๐‘–, ๐‘—๐‘— โˆ’ ฬ‚๐ด๐ด ๐‘–๐‘–, ๐‘—๐‘— ๐‘๐‘ ๐Ÿ๐Ÿ ๐’‘๐’‘
  25. 25. Road Map โ€ข Introduction โ€ข Problem โ€ข Proposed Algorithm: SSumM << โ€ข Experimental Results โ€ข Conclusions
  26. 26. Main ideas of SSumM Introduction Algorithms Experiments ConclusionProblem Combines node grouping and edge sparsification Prunes search space Balances error and size of the summary graph using MDL principle โ€ข Practical graph summarization problem โ—ฆ Given: a graph ๐‘ฎ๐‘ฎ โ—ฆ Find: a summary graph ๏ฟฝ๐‘ฎ๐‘ฎ โ—ฆ To minimize: the difference between ๐‘ฎ๐‘ฎand the restored graph ๏ฟฝ๐‘ฎ๐‘ฎ โ—ฆ Subject to: ๐‘†๐‘†๐‘†๐‘†๐‘†๐‘†๐‘†๐‘† ๐‘œ๐‘œ๐‘œ๐‘œ ๏ฟฝ๐‘ฎ๐‘ฎ in bits โ‰ค ๐‘ฒ๐‘ฒ
  27. 27. Main Idea: Combining Two Strategies Introduction Algorithms Experiments ConclusionProblem Node Grouping Sparsification
  28. 28. Main Idea: Combining Two Strategies Introduction Algorithms Experiments ConclusionProblem Node Grouping Sparsification
  29. 29. Main Idea: Combining Two Strategies Introduction Algorithms Experiments ConclusionProblem Node Grouping Sparsification
  30. 30. Main Idea: Combining Two Strategies Introduction Algorithms Experiments ConclusionProblem Node Grouping Sparsification
  31. 31. Main Idea: MDL Principle Introduction Algorithms Experiments ConclusionProblem 1 2 3 4 5 6 Merge (5, 6) 1 2 3 4 Merge (1, 2) 3 4 5 6 5 6 1 2 {1,2} {5,6}Merge (1, {5,6}) Merge (1, 2) Merge (1, 3) Merge (1, 3) How to choose a next action?
  32. 32. Main Idea: MDL Principle Introduction Algorithms Experiments ConclusionProblem 1 2 3 4 5 6 Merge (5, 6) 1 2 3 4 Merge (1, 2) 3 4 5 6 5 6 1 2 {1,2} {5,6}Merge (1, {5,6}) Merge (1, 2) Merge (1, 3) Merge (1, 3) Graph Summarization is A Search Problem How to choose a next action?
  33. 33. Main Idea: MDL Principle Introduction Algorithms Experiments ConclusionProblem 1 2 3 4 5 6 Merge (5, 6) 1 2 3 4 Merge (1, 2) 3 4 5 6 5 6 1 2 {1,2} {5,6} Summary graph size + Information loss Merge (1, {5,6}) Merge (1, 2) Merge (1, 3) Merge (1, 3) Graph Summarization is A Search Problem How to choose a next action?
  34. 34. Main Idea: MDL Principle Introduction Algorithms Experiments ConclusionProblem 1 2 3 4 5 6 Merge (5, 6) 1 2 3 4 Merge (1, 2) 3 4 5 6 5 6 1 2 {1,2} {5,6} Summary graph size + Information loss MDL Principle Merge (1, {5,6}) Merge (1, 2) Merge (1, 3) Merge (1, 3) Graph Summarization is A Search Problem How to choose a next action? arg min ๐ถ๐ถ๐ถ๐ถ๐ถ๐ถ๐ถ๐ถ ๏ฟฝ๐‘ฎ๐‘ฎ + ๐ถ๐ถ๐ถ๐ถ๐ถ๐ถ๐ถ๐ถ(๐‘ฎ๐‘ฎ|๏ฟฝ๐‘ฎ๐‘ฎ) ๏ฟฝ๐‘ฎ๐‘ฎ # ๐‘๐‘๐‘๐‘๐‘๐‘๐‘๐‘ ๐‘“๐‘“๐‘“๐‘“๐‘“๐‘“ ๏ฟฝ๐‘ฎ๐‘ฎ # ๐‘๐‘๐‘๐‘๐‘๐‘๐‘๐‘ ๐‘“๐‘“๐‘“๐‘“๐‘“๐‘“ ๐‘Ÿ๐‘Ÿ๐‘Ÿ๐‘Ÿ๐‘Ÿ๐‘Ÿ๐‘Ÿ๐‘Ÿ๐‘Ÿ๐‘Ÿ๐‘Ÿ๐‘Ÿ๐‘Ÿ๐‘Ÿ๐‘Ÿ๐‘Ÿ๐‘Ÿ๐‘Ÿ๐‘Ÿ๐‘Ÿ๐‘Ÿ๐‘Ÿ๐‘Ÿ๐‘Ÿ ๐‘ฎ๐‘ฎ ๐‘ข๐‘ข๐‘ข๐‘ข๐‘ข๐‘ข๐‘ข๐‘ข๐‘ข๐‘ข ๏ฟฝ๐‘ฎ๐‘ฎ
  35. 35. Overview: SSumM ๏‚ง Initialization phase ๏‚ง ๐‘ก๐‘ก = 1 ๏‚ง While ๐‘ก๐‘ก++ โ‰ค ๐‘ป๐‘ป and ๐‘ฒ๐‘ฒ < size of ๏ฟฝ๐‘ฎ๐‘ฎ in bits ๏‚ง Candidate generation phase ๏‚ง Merge and sparsification phase ๏‚ง Further sparsification phase Introduction Algorithms Experiments ConclusionProblem Procedure โ€ข Given: โ—ฆ (1) An input graph ๐‘ฎ๐‘ฎ, (2) the desired size ๐‘ฒ๐‘ฒ, (3) the number ๐‘ป๐‘ป of iterations โ€ข Outputs: โ—ฆ Summary graph ๏ฟฝ๐‘ฎ๐‘ฎ
  36. 36. Input graph ๐‘ฎ๐‘ฎ Summary graph ๏ฟฝ๐‘ฎ๐‘ฎ Introduction Algorithms Experiments ConclusionProblem Procedure ๏‚ง Initialization phase << ๏‚ง ๐‘ก๐‘ก = 1 ๏‚ง While ๐‘ก๐‘ก++ โ‰ค ๐‘ป๐‘ป and ๐‘ฒ๐‘ฒ < size of ๏ฟฝ๐‘ฎ๐‘ฎ in bits ๏‚ง Candidate generation phase ๏‚ง Merge and sparsification phase ๏‚ง Further sparsification phase ๐‘Ž๐‘Ž ๐‘๐‘๐‘๐‘ ๐‘‘๐‘‘ ๐‘’๐‘’ ๐‘“๐‘“ ๐‘”๐‘” โ„Ž ๐‘–๐‘– ๐ด๐ด = {๐‘Ž๐‘Ž} ๐ต๐ต = {๐‘๐‘}C = {๐‘๐‘} ๐ท๐ท = {๐‘‘๐‘‘} ๐ธ๐ธ = {๐‘’๐‘’} ๐น๐น = {๐‘“๐‘“} ๐บ๐บ = {๐‘”๐‘”} ๐ป๐ป = {โ„Ž} ๐ผ๐ผ = {๐‘–๐‘–} Initialization Phase
  37. 37. Candidate Generation Phase ๐ต๐ต = {๐‘๐‘} ๐ถ๐ถ = {๐‘๐‘} D = {๐‘‘๐‘‘} ๐ธ๐ธ = {๐‘’๐‘’} ๐ด๐ด = {๐‘Ž๐‘Ž} ๐น๐น = {๐‘“๐‘“} ๐บ๐บ = {๐‘”๐‘”} ๐ป๐ป = {โ„Ž} ๐ผ๐ผ = {๐‘–๐‘–} Input graph ๐‘ฎ๐‘ฎ Introduction Algorithms Experiments ConclusionProblem Procedure ๏‚ง Initialization phase ๏‚ง ๐‘ก๐‘ก = 1 ๏‚ง While ๐‘ก๐‘ก++ โ‰ค ๐‘ป๐‘ป and ๐‘ฒ๐‘ฒ < size of ๏ฟฝ๐‘ฎ๐‘ฎ in bits ๏‚ง Candidate generation phase << ๏‚ง Merge and sparsification phase ๏‚ง Further sparsification phase ๐‘Ž๐‘Ž ๐‘๐‘๐‘๐‘ ๐‘‘๐‘‘ ๐‘’๐‘’ ๐‘“๐‘“ ๐‘”๐‘” โ„Ž ๐‘–๐‘–
  38. 38. Merging and Sparsification Phase For each candidate set ๐‘ช๐‘ช Among possible candidate pairs Introduction Algorithms Experiments ConclusionProblem (A, B) (B, C) (C, D) (D, E) (A, C) (B, D) (C, E) (A, D) (B, E) (A, E) ๐ต๐ต = {๐‘๐‘} ๐ถ๐ถ = {๐‘๐‘} D = {๐‘‘๐‘‘} ๐ด๐ด = {๐‘Ž๐‘Ž} Procedure ๏‚ง Initialization phase ๏‚ง ๐‘ก๐‘ก = 1 ๏‚ง While ๐‘ก๐‘ก++ โ‰ค ๐‘ป๐‘ป and ๐‘ฒ๐‘ฒ < size of ๏ฟฝ๐‘ฎ๐‘ฎ in bits ๏‚ง Candidate generation phase ๏‚ง Merge and sparsification phase << ๏‚ง Further sparsification phase ๐ธ๐ธ = {๐‘’๐‘’}
  39. 39. Merging and Sparsification Phase For each candidate set ๐‘ช๐‘ช Among possible candidate pairs Introduction Algorithms Experiments ConclusionProblem (A, B) (B, C) (C, D) (D, E) (A, C) (B, D) (C, E) (A, D) (B, E) (A, E) ๐ต๐ต = {๐‘๐‘} ๐ถ๐ถ = {๐‘๐‘} D = {๐‘‘๐‘‘} ๐ด๐ด = {๐‘Ž๐‘Ž} Procedure ๏‚ง Initialization phase ๏‚ง ๐‘ก๐‘ก = 1 ๏‚ง While ๐‘ก๐‘ก++ โ‰ค ๐‘ป๐‘ป and ๐‘ฒ๐‘ฒ < size of ๏ฟฝ๐‘ฎ๐‘ฎ in bits ๏‚ง Candidate generation phase ๏‚ง Merge and sparsification phase << ๏‚ง Further sparsification phase Sample log2 |๐‘ช๐‘ช| pairs๐ธ๐ธ = {๐‘’๐‘’}
  40. 40. Merging and Sparsification Phase Select the pair with the greatest (relative) reduction in the cost function ๐’Š๐’Š๐’Š๐’Š reduction(C, D) > ๐œฝ๐œฝ: merge(C, D) ๐’†๐’†๐’†๐’†๐’†๐’†๐’†๐’† sample log2 |๐‘ช๐‘ช| pairs again Introduction Algorithms Experiments ConclusionProblem Procedure ๏‚ง Initialization phase ๏‚ง ๐‘ก๐‘ก = 1 ๏‚ง While ๐‘ก๐‘ก++ โ‰ค ๐‘ป๐‘ป and ๐‘ฒ๐‘ฒ < size of ๏ฟฝ๐‘ฎ๐‘ฎ in bits ๏‚ง Candidate generation phase ๏‚ง Merge and sparsification phase << ๏‚ง Further sparsification phase (A, B) (A, D) (C, D)
  41. 41. Merging and Sparsification Phase Select the pair with the greatest (relative) reduction in the cost function ๐’Š๐’Š๐’Š๐’Š reduction(C, D) > ๐œฝ๐œฝ: merge(C, D) ๐’†๐’†๐’†๐’†๐’†๐’†๐’†๐’† sample log2 |๐‘ช๐‘ช| pairs again Introduction Algorithms Experiments ConclusionProblem Procedure ๏‚ง Initialization phase ๏‚ง ๐‘ก๐‘ก = 1 ๏‚ง While ๐‘ก๐‘ก++ โ‰ค ๐‘ป๐‘ป and ๐‘ฒ๐‘ฒ < size of ๏ฟฝ๐‘ฎ๐‘ฎ in bits ๏‚ง Candidate generation phase ๏‚ง Merge and sparsification phase << ๏‚ง Further sparsification phase (A, B) (A, D) (C, D)
  42. 42. Merging and Sparsification Phase (cont.) Summary graph ๏ฟฝ๐‘ฎ๐‘ฎ Introduction Algorithms Experiments ConclusionProblem Procedure ๏‚ง Initialization phase ๏‚ง ๐‘ก๐‘ก = 1 ๏‚ง While ๐‘ก๐‘ก++ โ‰ค ๐‘ป๐‘ป and ๐‘ฒ๐‘ฒ < size of ๏ฟฝ๐‘ฎ๐‘ฎ in bits ๏‚ง Candidate generation phase ๏‚ง Merge and sparsification phase << ๏‚ง Further sparsification phase {๐‘Ž๐‘Ž} {๐‘๐‘}๐ถ๐ถ = {๐‘๐‘} ๐ท๐ท = {๐‘‘๐‘‘} {๐‘’๐‘’} {๐‘“๐‘“} {๐‘”๐‘”} {โ„Ž} {๐‘–๐‘–}
  43. 43. Merging and Sparsification Phase (cont.) Summary graph ๏ฟฝ๐‘ฎ๐‘ฎ Introduction Algorithms Experiments ConclusionProblem Procedure ๏‚ง Initialization phase ๏‚ง ๐‘ก๐‘ก = 1 ๏‚ง While ๐‘ก๐‘ก++ โ‰ค ๐‘ป๐‘ป and ๐‘ฒ๐‘ฒ < size of ๏ฟฝ๐‘ฎ๐‘ฎ in bits ๏‚ง Candidate generation phase ๏‚ง Merge and sparsification phase << ๏‚ง Further sparsification phase {๐‘Ž๐‘Ž} {๐‘๐‘} {๐‘’๐‘’} {๐‘“๐‘“} {๐‘”๐‘”} {โ„Ž} {๐‘–๐‘–}
  44. 44. Merging and Sparsification Phase (cont.) Summary graph ๏ฟฝ๐‘ฎ๐‘ฎ Introduction Algorithms Experiments ConclusionProblem Procedure ๏‚ง Initialization phase ๏‚ง ๐‘ก๐‘ก = 1 ๏‚ง While ๐‘ก๐‘ก++ โ‰ค ๐‘ป๐‘ป and ๐‘ฒ๐‘ฒ < size of ๏ฟฝ๐‘ฎ๐‘ฎ in bits ๏‚ง Candidate generation phase ๏‚ง Merge and sparsification phase << ๏‚ง Further sparsification phase {๐‘Ž๐‘Ž} {๐‘๐‘} {๐‘’๐‘’} {๐‘“๐‘“} {๐‘”๐‘”} {โ„Ž} {๐‘–๐‘–} C = {๐‘๐‘, ๐‘‘๐‘‘}
  45. 45. Merging and Sparsification Phase (cont.) Summary graph ๏ฟฝ๐‘ฎ๐‘ฎ Introduction Algorithms Experiments ConclusionProblem Procedure ๏‚ง Initialization phase ๏‚ง ๐‘ก๐‘ก = 1 ๏‚ง While ๐‘ก๐‘ก++ โ‰ค ๐‘ป๐‘ป and ๐‘ฒ๐‘ฒ < size of ๏ฟฝ๐‘ฎ๐‘ฎ in bits ๏‚ง Candidate generation phase ๏‚ง Merge and sparsification phase << ๏‚ง Further sparsification phase {๐‘Ž๐‘Ž} {๐‘๐‘} {๐‘’๐‘’} {๐‘“๐‘“} {๐‘”๐‘”} {โ„Ž} {๐‘–๐‘–} C = {๐‘๐‘, ๐‘‘๐‘‘}
  46. 46. Merging and Sparsification Phase (cont.) Summary graph ๏ฟฝ๐‘ฎ๐‘ฎ Introduction Algorithms Experiments ConclusionProblem Procedure ๏‚ง Initialization phase ๏‚ง ๐‘ก๐‘ก = 1 ๏‚ง While ๐‘ก๐‘ก++ โ‰ค ๐‘ป๐‘ป and ๐‘ฒ๐‘ฒ < size of ๏ฟฝ๐‘ฎ๐‘ฎ in bits ๏‚ง Candidate generation phase ๏‚ง Merge and sparsification phase << ๏‚ง Further sparsification phase Sparsify or not according to total description cost {๐‘Ž๐‘Ž} {๐‘๐‘} {๐‘’๐‘’} {๐‘“๐‘“} {๐‘”๐‘”} {โ„Ž} {๐‘–๐‘–} C = {๐‘๐‘, ๐‘‘๐‘‘}
  47. 47. Merging and Sparsification Phase (cont.) Summary graph ๏ฟฝ๐‘ฎ๐‘ฎ Introduction Algorithms Experiments ConclusionProblem Procedure ๏‚ง Initialization phase ๏‚ง ๐‘ก๐‘ก = 1 ๏‚ง While ๐‘ก๐‘ก++ โ‰ค ๐‘ป๐‘ป and ๐‘ฒ๐‘ฒ < size of ๏ฟฝ๐‘ฎ๐‘ฎ in bits ๏‚ง Candidate generation phase ๏‚ง Merge and sparsification phase << ๏‚ง Further sparsification phase Sparsify or not according to total description cost {๐‘Ž๐‘Ž} {๐‘๐‘} {๐‘’๐‘’} {๐‘“๐‘“} {๐‘”๐‘”} {โ„Ž} {๐‘–๐‘–} C = {๐‘๐‘, ๐‘‘๐‘‘}
  48. 48. Merging and Sparsification Phase (cont.) Summary graph ๏ฟฝ๐‘ฎ๐‘ฎ Introduction Algorithms Experiments ConclusionProblem Procedure ๏‚ง Initialization phase ๏‚ง ๐‘ก๐‘ก = 1 ๏‚ง While ๐‘ก๐‘ก++ โ‰ค ๐‘ป๐‘ป and ๐‘ฒ๐‘ฒ < size of ๏ฟฝ๐‘ฎ๐‘ฎ in bits ๏‚ง Candidate generation phase ๏‚ง Merge and sparsification phase << ๏‚ง Further sparsification phase Sparsify or not according to total description cost {๐‘Ž๐‘Ž} {๐‘๐‘} {๐‘’๐‘’} {๐‘“๐‘“} {๐‘”๐‘”} {โ„Ž} {๐‘–๐‘–} C = {๐‘๐‘, ๐‘‘๐‘‘}
  49. 49. Repetition Summary graph ๏ฟฝ๐‘ฎ๐‘ฎ Introduction Algorithms Experiments ConclusionProblem Procedure ๏‚ง Initialization phase ๏‚ง ๐‘ก๐‘ก = 1 ๏‚ง While ๐’•๐’•++ โ‰ค ๐‘ป๐‘ป and ๐‘ฒ๐‘ฒ < size of ๏ฟฝ๐‘ฎ๐‘ฎ in bits ๏‚ง Candidate generation phase ๏‚ง Merge and sparsification phase ๏‚ง Further sparsification phase {๐‘Ž๐‘Ž} {๐‘๐‘} {๐‘๐‘, ๐‘‘๐‘‘} {๐‘’๐‘’} {๐‘“๐‘“} {๐‘”๐‘”} {โ„Ž} {๐‘–๐‘–}
  50. 50. Repetition Summary graph ๏ฟฝ๐‘ฎ๐‘ฎ Introduction Algorithms Experiments ConclusionProblem Different candidate sets and decreasing threshold ๐œƒ๐œƒ over iteration Procedure ๏‚ง Initialization phase ๏‚ง ๐‘ก๐‘ก = 1 ๏‚ง While ๐’•๐’•++ โ‰ค ๐‘ป๐‘ป and ๐‘ฒ๐‘ฒ < size of ๏ฟฝ๐‘ฎ๐‘ฎ in bits ๏‚ง Candidate generation phase ๏‚ง Merge and sparsification phase ๏‚ง Further sparsification phase {๐‘Ž๐‘Ž} {๐‘๐‘} {๐‘๐‘, ๐‘‘๐‘‘} {๐‘’๐‘’} {๐‘“๐‘“} {๐‘”๐‘”} {โ„Ž} {๐‘–๐‘–}
  51. 51. Repetition Summary graph ๏ฟฝ๐‘ฎ๐‘ฎ Introduction Algorithms Experiments ConclusionProblem Summary graph ๏ฟฝ๐‘ฎ๐‘ฎ Different candidate sets and decreasing threshold ๐œƒ๐œƒ over iteration Procedure ๏‚ง Initialization phase ๏‚ง ๐‘ก๐‘ก = 1 ๏‚ง While ๐’•๐’•++ โ‰ค ๐‘ป๐‘ป and ๐‘ฒ๐‘ฒ < size of ๏ฟฝ๐‘ฎ๐‘ฎ in bits ๏‚ง Candidate generation phase ๏‚ง Merge and sparsification phase ๏‚ง Further sparsification phase {๐‘Ž๐‘Ž} {๐‘๐‘} {๐‘๐‘, ๐‘‘๐‘‘} {๐‘’๐‘’} {๐‘“๐‘“} {๐‘”๐‘”} {โ„Ž} {๐‘–๐‘–} {๐‘Ž๐‘Ž} {๐‘๐‘} {๐‘๐‘, ๐‘‘๐‘‘} {๐‘’๐‘’} {๐‘“๐‘“} {๐‘”๐‘”} {โ„Ž} {๐‘–๐‘–}
  52. 52. Repetition Summary graph ๏ฟฝ๐‘ฎ๐‘ฎ Introduction Algorithms Experiments ConclusionProblem Summary graph ๏ฟฝ๐‘ฎ๐‘ฎ Different candidate sets and decreasing threshold ๐œƒ๐œƒ over iteration Procedure ๏‚ง Initialization phase ๏‚ง ๐‘ก๐‘ก = 1 ๏‚ง While ๐’•๐’•++ โ‰ค ๐‘ป๐‘ป and ๐‘ฒ๐‘ฒ < size of ๏ฟฝ๐‘ฎ๐‘ฎ in bits ๏‚ง Candidate generation phase ๏‚ง Merge and sparsification phase ๏‚ง Further sparsification phase {๐‘Ž๐‘Ž} {๐‘๐‘} {๐‘๐‘, ๐‘‘๐‘‘} {๐‘’๐‘’} {๐‘“๐‘“} {โ„Ž} {๐‘”๐‘”, ๐‘–๐‘–} {๐‘Ž๐‘Ž} {๐‘๐‘} {๐‘๐‘, ๐‘‘๐‘‘} {๐‘’๐‘’} {๐‘“๐‘“} {๐‘”๐‘”} {โ„Ž} {๐‘–๐‘–}
  53. 53. Repetition Summary graph ๏ฟฝ๐‘ฎ๐‘ฎ Introduction Algorithms Experiments ConclusionProblem Summary graph ๏ฟฝ๐‘ฎ๐‘ฎ Different candidate sets and decreasing threshold ๐œƒ๐œƒ over iteration Procedure ๏‚ง Initialization phase ๏‚ง ๐‘ก๐‘ก = 1 ๏‚ง While ๐’•๐’•++ โ‰ค ๐‘ป๐‘ป and ๐‘ฒ๐‘ฒ < size of ๏ฟฝ๐‘ฎ๐‘ฎ in bits ๏‚ง Candidate generation phase ๏‚ง Merge and sparsification phase ๏‚ง Further sparsification phase {๐‘Ž๐‘Ž} {๐‘๐‘} {๐‘๐‘, ๐‘‘๐‘‘} {๐‘’๐‘’} {๐‘“๐‘“} {โ„Ž} {๐‘”๐‘”, ๐‘–๐‘–} {๐‘Ž๐‘Ž} {๐‘๐‘} {๐‘๐‘, ๐‘‘๐‘‘} {๐‘’๐‘’} {๐‘“๐‘“} {๐‘”๐‘”} {โ„Ž} {๐‘–๐‘–}
  54. 54. Repetition Summary graph ๏ฟฝ๐‘ฎ๐‘ฎ Introduction Algorithms Experiments ConclusionProblem Summary graph ๏ฟฝ๐‘ฎ๐‘ฎ Different candidate sets and decreasing threshold ๐œƒ๐œƒ over iteration Procedure ๏‚ง Initialization phase ๏‚ง ๐‘ก๐‘ก = 1 ๏‚ง While ๐’•๐’•++ โ‰ค ๐‘ป๐‘ป and ๐‘ฒ๐‘ฒ < size of ๏ฟฝ๐‘ฎ๐‘ฎ in bits ๏‚ง Candidate generation phase ๏‚ง Merge and sparsification phase ๏‚ง Further sparsification phase {๐‘Ž๐‘Ž} {๐‘๐‘} {๐‘๐‘, ๐‘‘๐‘‘} {๐‘’๐‘’} {๐‘“๐‘“} {โ„Ž} {๐‘”๐‘”, ๐‘–๐‘–} {๐‘Ž๐‘Ž} {๐‘๐‘} {๐‘๐‘, ๐‘‘๐‘‘} {๐‘’๐‘’} {๐‘“๐‘“} {๐‘”๐‘”} {โ„Ž} {๐‘–๐‘–}
  55. 55. Repetition Summary graph ๏ฟฝ๐‘ฎ๐‘ฎ Introduction Algorithms Experiments ConclusionProblem Summary graph ๏ฟฝ๐‘ฎ๐‘ฎ Different candidate sets and decreasing threshold ๐œƒ๐œƒ over iteration Procedure ๏‚ง Initialization phase ๏‚ง ๐‘ก๐‘ก = 1 ๏‚ง While ๐’•๐’•++ โ‰ค ๐‘ป๐‘ป and ๐‘ฒ๐‘ฒ < size of ๏ฟฝ๐‘ฎ๐‘ฎ in bits ๏‚ง Candidate generation phase ๏‚ง Merge and sparsification phase ๏‚ง Further sparsification phase {๐‘Ž๐‘Ž} {๐‘๐‘} {๐‘๐‘, ๐‘‘๐‘‘} {๐‘’๐‘’} {๐‘“๐‘“} {๐‘”๐‘”, โ„Ž, ๐‘–๐‘–} {๐‘Ž๐‘Ž} {๐‘๐‘} {๐‘๐‘, ๐‘‘๐‘‘} {๐‘’๐‘’} {๐‘“๐‘“} {๐‘”๐‘”} {โ„Ž} {๐‘–๐‘–}
  56. 56. Repetition Summary graph ๏ฟฝ๐‘ฎ๐‘ฎ Introduction Algorithms Experiments ConclusionProblem Summary graph ๏ฟฝ๐‘ฎ๐‘ฎ Different candidate sets and decreasing threshold ๐œƒ๐œƒ over iteration Procedure ๏‚ง Initialization phase ๏‚ง ๐‘ก๐‘ก = 1 ๏‚ง While ๐’•๐’•++ โ‰ค ๐‘ป๐‘ป and ๐‘ฒ๐‘ฒ < size of ๏ฟฝ๐‘ฎ๐‘ฎ in bits ๏‚ง Candidate generation phase ๏‚ง Merge and sparsification phase ๏‚ง Further sparsification phase {๐‘Ž๐‘Ž} {๐‘๐‘} {๐‘๐‘, ๐‘‘๐‘‘} {๐‘’๐‘’} {๐‘“๐‘“} {๐‘”๐‘”, โ„Ž, ๐‘–๐‘–} {๐‘Ž๐‘Ž} {๐‘๐‘} {๐‘๐‘, ๐‘‘๐‘‘} {๐‘’๐‘’} {๐‘“๐‘“} {๐‘”๐‘”} {โ„Ž} {๐‘–๐‘–}
  57. 57. Repetition Summary graph ๏ฟฝ๐‘ฎ๐‘ฎ Introduction Algorithms Experiments ConclusionProblem Summary graph ๏ฟฝ๐‘ฎ๐‘ฎ Different candidate sets and decreasing threshold ๐œƒ๐œƒ over iteration Procedure ๏‚ง Initialization phase ๏‚ง ๐‘ก๐‘ก = 1 ๏‚ง While ๐’•๐’•++ โ‰ค ๐‘ป๐‘ป and ๐‘ฒ๐‘ฒ < size of ๏ฟฝ๐‘ฎ๐‘ฎ in bits ๏‚ง Candidate generation phase ๏‚ง Merge and sparsification phase ๏‚ง Further sparsification phase {๐‘Ž๐‘Ž} {๐‘๐‘} {๐‘๐‘, ๐‘‘๐‘‘} {๐‘’๐‘’} {๐‘“๐‘“} {๐‘”๐‘”, โ„Ž, ๐‘–๐‘–} Summary graph ๏ฟฝ๐‘ฎ๐‘ฎ {๐‘Ž๐‘Ž} {๐‘๐‘} {๐‘๐‘, ๐‘‘๐‘‘} {๐‘’๐‘’} {๐‘“๐‘“} {๐‘”๐‘”, โ„Ž, ๐‘–๐‘–} {๐‘Ž๐‘Ž} {๐‘๐‘} {๐‘๐‘, ๐‘‘๐‘‘} {๐‘’๐‘’} {๐‘“๐‘“} {๐‘”๐‘”} {โ„Ž} {๐‘–๐‘–}
  58. 58. Repetition Summary graph ๏ฟฝ๐‘ฎ๐‘ฎ Introduction Algorithms Experiments ConclusionProblem Summary graph ๏ฟฝ๐‘ฎ๐‘ฎ Different candidate sets and decreasing threshold ๐œƒ๐œƒ over iteration Procedure ๏‚ง Initialization phase ๏‚ง ๐‘ก๐‘ก = 1 ๏‚ง While ๐’•๐’•++ โ‰ค ๐‘ป๐‘ป and ๐‘ฒ๐‘ฒ < size of ๏ฟฝ๐‘ฎ๐‘ฎ in bits ๏‚ง Candidate generation phase ๏‚ง Merge and sparsification phase ๏‚ง Further sparsification phase {๐‘๐‘} {๐‘๐‘, ๐‘‘๐‘‘} {๐‘“๐‘“} {๐‘”๐‘”, โ„Ž, ๐‘–๐‘–} Summary graph ๏ฟฝ๐‘ฎ๐‘ฎ {๐‘Ž๐‘Ž} {๐‘๐‘} {๐‘๐‘, ๐‘‘๐‘‘} {๐‘’๐‘’} {๐‘“๐‘“} {๐‘”๐‘”, โ„Ž, ๐‘–๐‘–} {๐‘Ž๐‘Ž} {๐‘๐‘} {๐‘๐‘, ๐‘‘๐‘‘} {๐‘’๐‘’} {๐‘“๐‘“} {๐‘”๐‘”} {โ„Ž} {๐‘–๐‘–} {๐‘Ž๐‘Ž, ๐‘’๐‘’}
  59. 59. Repetition Summary graph ๏ฟฝ๐‘ฎ๐‘ฎ Introduction Algorithms Experiments ConclusionProblem Summary graph ๏ฟฝ๐‘ฎ๐‘ฎ Different candidate sets and decreasing threshold ๐œƒ๐œƒ over iteration Procedure ๏‚ง Initialization phase ๏‚ง ๐‘ก๐‘ก = 1 ๏‚ง While ๐’•๐’•++ โ‰ค ๐‘ป๐‘ป and ๐‘ฒ๐‘ฒ < size of ๏ฟฝ๐‘ฎ๐‘ฎ in bits ๏‚ง Candidate generation phase ๏‚ง Merge and sparsification phase ๏‚ง Further sparsification phase {๐‘๐‘} {๐‘๐‘, ๐‘‘๐‘‘} {๐‘“๐‘“} {๐‘”๐‘”, โ„Ž, ๐‘–๐‘–} Summary graph ๏ฟฝ๐‘ฎ๐‘ฎ {๐‘Ž๐‘Ž} {๐‘๐‘} {๐‘๐‘, ๐‘‘๐‘‘} {๐‘’๐‘’} {๐‘“๐‘“} {๐‘”๐‘”, โ„Ž, ๐‘–๐‘–} {๐‘Ž๐‘Ž} {๐‘๐‘} {๐‘๐‘, ๐‘‘๐‘‘} {๐‘’๐‘’} {๐‘“๐‘“} {๐‘”๐‘”} {โ„Ž} {๐‘–๐‘–} {๐‘Ž๐‘Ž, ๐‘’๐‘’}
  60. 60. Introduction Algorithms Experiments ConclusionProblem Further Sparsification Phase Summary graph ๏ฟฝ๐‘ฎ๐‘ฎ ๐ด๐ด ๐ด๐ด ๐ต๐ต ๐ด๐ด ๐น๐น ๐ด๐ด ๐บ๐บ ๐บ๐บ Superedges sorted by โˆ†๐‘…๐‘…๐ธ๐ธ๐‘๐‘ ๐ถ๐ถ ๐ถ๐ถ Size of ๏ฟฝ๐‘ฎ๐‘ฎ in bits โ‰ค ๐‘ฒ๐‘ฒ Procedure ๏‚ง Initialization phase ๏‚ง ๐‘ก๐‘ก = 1 ๏‚ง While ๐‘ก๐‘ก++ โ‰ค ๐‘ป๐‘ป and ๐‘ฒ๐‘ฒ < size of ๏ฟฝ๐‘ฎ๐‘ฎ in bits ๏‚ง Candidate generation phase ๏‚ง Merge and sparsification phase ๏‚ง Further sparsification phase << C = {๐‘๐‘, ๐‘‘๐‘‘} ๐ด๐ด = {๐‘Ž๐‘Ž, ๐‘’๐‘’} ๐ต๐ต = {๐‘๐‘} ๐น๐น = {๐‘“๐‘“} ๐บ๐บ = {๐‘”๐‘”, โ„Ž, ๐‘–๐‘–}
  61. 61. Road Map โ€ข Introduction โ€ข Problem โ€ข Proposed Algorithm: SSumM โ€ข Experimental Results << โ€ข Conclusions
  62. 62. โ€ข 10 datasets from 6 domains (up to 0.8B edges) โ€ข Three competitors for graph summarization โ—ฆ k-Gs [LT10] โ—ฆ S2L [RSB17] โ—ฆ SAA-Gs [BAZK18] Introduction Algorithms Experiments ConclusionProblem Social Internet Email Co-purchase Collaboration Hyperlinks Experiments Settings
  63. 63. Email-Enron Caida Ego-Facebook Web-UK-05 Web-UK-02 LiveJournal DBLP Amazon-0302Skitter k-GsSSumM S2LSAA-Gs SAA-Gs (linear sample) Introduction Algorithms Experiments ConclusionProblem o.o.t. >12hours o.o.m. >64GB SSumM Gives Concise and Accurate Summary
  64. 64. Email-Enron Caida Ego-Facebook Web-UK-05 Web-UK-02 LiveJournal DBLP Amazon-0302Skitter k-GsSSumM S2LSAA-Gs SAA-Gs (linear sample) Introduction Algorithms Experiments ConclusionProblem o.o.t. >12hours o.o.m. >64GB SSumM Gives Concise and Accurate Summary
  65. 65. Email-Enron Caida Ego-Facebook Web-UK-05 Web-UK-02 LiveJournal Amazon-0601 DBLPSkitter k-GsSSumM S2LSAA-Gs SAA-Gs (linear sample) Introduction Algorithms Experiments ConclusionProblem SSumM is Fast
  66. 66. Email-Enron Caida Ego-Facebook Web-UK-05 Web-UK-02 LiveJournal Amazon-0601 DBLPSkitter k-GsSSumM S2LSAA-Gs SAA-Gs (linear sample) Introduction Algorithms Experiments ConclusionProblem SSumM is Fast
  67. 67. Introduction Algorithms Experiments ConclusionProblem SSumM is Scalable
  68. 68. Introduction Algorithms Experiments ConclusionProblem SSumM Converges Fast
  69. 69. Road Map โ€ข Introduction โ€ข Problem โ€ข Proposed Algorithm: SSumM โ€ข Experimental Results โ€ข Conclusions <<
  70. 70. Code available at https://github.com/KyuhanLee/SSumM Concise & Accurate Fast Scalable Introduction Algorithms Experiments ConclusionProblem Practical Problem Formulation Extensive Experiments on 10 real world graphs Scalable and Effective Algorithm Design Conclusions
  71. 71. SSumM : Sparse Summarization of Massive Graphs Kyuhan Lee* Hyeonsoo Jo* Jihoon Ko Sungsu Lim Kijung Shin

ร—