# Fractality of Massive Graphs: Scalable Analysis with Sketch-Based Box-Covering Algorithm

Presentation for ICDM 2016

Published in: Science

1. Fractality of Massive Graphs: Scalable Analysis with Sketch-Based Box-Covering Algorithm
   Takuya Akiba (Preferred Networks, Inc.), Kenko Nakamura (Recruit Communications Co., Ltd.), Taro Takaguchi (National Institute of Information and Communications Technology)
   *Work done while all authors were at the National Institute of Informatics
2. Fractality of networks
   Some real-world networks are fractal. [Song+, Nature’05]
3. Definition of Graph Fractality
   ▶ box := set of vertices within a radius of ℓ
   ▶ b(ℓ) := number of boxes needed to cover the whole graph
   ▶ A graph is said to be fractal ⇔ b(ℓ) ∝ ℓ^{-d}, where the exponent d is the fractal dimension
   (Figure: fractal network model)
4. Box-Covering Problem: determining the fractality
   ▶ b(ℓ) := number of boxes needed to cover the whole graph
   ▶ Goal: minimize b(ℓ)
   ▶ The box-covering problem is NP-hard
   ▶ Approximation algorithms are used
5. Box-Covering Problem
   ▶ Previous algorithms: computation time is too long; infeasible for networks with millions of vertices
   ▶ This work: near-linear time complexity; works with tens of millions of vertices
6. Compared with Previous Method
   Previous naive method [Song+’05]
   ▶ Step 1: Instantiate all boxes (BFS from each vertex)
   ▶ Step 2: Solve the set cover problem (greedy algorithm with approximation ratio 1 + ln n)
   Proposed method
   ▶ Step 1: Instantiate the Min-Hash of all boxes (similar to algorithms for All-Distances Sketches)
   ▶ Step 2: Solve the set cover problem in the sketch space (near-linear time complexity by using a BST and a heap)
7. Experimental Results
   (Plots: computation time and memory usage on the flower model and the BA model)
   ▶ Environment: Intel Xeon 2.67 GHz, 96 GB
   ▶ 10 times faster than the previous algorithms
8. Real Large Network
   ▶ Web graph (in-2004) with 1M vertices and 17M edges: 11.7 hours in total
   ▶ First fractality analysis of a million-scale network
9. Summary
   Background: fractality of real-world networks
   ▶ Some real-world networks are fractal.
   ▶ An efficient algorithm has been lacking.
   Proposed method: box-covering on Min-Hash
   ▶ Avoids explicit representation of boxes
   ▶ Efficient Min-Hash computation: similar to ADS (All-Distances Sketches)
   ▶ Efficient greedy selection using a binary search tree and a heap
   ▶ Fractality analysis of a network with 17M edges
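
For readers who want to see the baseline concretely, the naive two-step method on slide 6 (BFS from each vertex, then greedy set cover) and the power-law fit implied by slide 3 can be sketched in a few lines of Python. This is a minimal illustration under assumptions, not the authors' code and not the sketch-based method: the graph is assumed to be a dict mapping each vertex to its neighbor list, and the names `ball`, `greedy_box_cover`, and `fit_dimension` are hypothetical. It materializes every box explicitly, which is exactly the cost the proposed Min-Hash approach avoids.

```python
from collections import deque
import math


def ball(adj, center, radius):
    """Return the set of vertices within `radius` hops of `center` (plain BFS)."""
    dist = {center: 0}
    queue = deque([center])
    while queue:
        u = queue.popleft()
        if dist[u] == radius:
            continue  # do not expand beyond the box radius
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return set(dist)


def greedy_box_cover(adj, radius):
    """Naive baseline: build every box, then run greedy set cover.

    Returns the list of chosen box centers; its length approximates b(l).
    """
    boxes = {v: ball(adj, v, radius) for v in adj}  # Step 1: all boxes
    uncovered = set(adj)
    centers = []
    while uncovered:  # Step 2: repeatedly take the box covering the most
        best = max(boxes, key=lambda v: len(boxes[v] & uncovered))
        centers.append(best)
        uncovered -= boxes[best]
    return centers


def fit_dimension(radii, counts):
    """Least-squares slope of log b(l) against log l; returns d in b(l) ~ l^(-d)."""
    xs = [math.log(r) for r in radii]
    ys = [math.log(c) for c in counts]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return -slope


if __name__ == "__main__":
    # Toy input (an assumption for illustration): a path graph on 7 vertices.
    path = {i: [j for j in (i - 1, i + 1) if 0 <= j < 7] for i in range(7)}
    for radius in (1, 2, 3):
        print(radius, len(greedy_box_cover(path, radius)))
```

Each round of the greedy loop rescans all boxes, so this version is only suitable for small graphs. To estimate d, one would pass a range of radii and the corresponding box counts to `fit_dimension` and check whether the log-log plot is close to a straight line.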