Fast community structure identification of small world networks

Wakita Laboratory
池光龍
2016/10/2

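Social networks
• A social network is
– a structure formed by individuals, organizations, and their activities
– Examples: friendships, business relationships, paper citation relationships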

Properties of social networks
• Small-world property [1]
– Community structure
– Short average distance
• Scale-free property [2]
– Power-law degree distribution
– Existence of "hubs"
• [1] D.J. Watts. Six Degrees: The Science of a Connected Age. W.W. Norton & Company, 2003.
• [2] A.L. Barabási. Linked: The New Science of Networks. Perseus Pub., 2002.

Community detection methods

Various methods
• Cluster analysis
– Ward method (Ward, 1963)
– k-NN method (Franco-Lopez+, 2001)
• Graph partitioning
– Belief Propagation (Onsjö, Watanabe, 2006)
– k-means method (Hartigan, Wong, 1979)
• Matrix data
– Stochastic Block Model (Snijders, Nowicki, 1997)
– Spectral Clustering (Shi, Malik, 2000)
• Metric optimization
– WCC (Prat-Pérez, Dominguez-Sal, Larriba-Pey, 2014)
– Edge Betweenness (Girvan, Newman, 2002)

Modularity
• What is modularity?
– A metric that evaluates how good a community detection result is.
– Takes values between -1 and 1
• M.E.J. Newman, M. Girvan, Finding and evaluating community structure in networks. Phys. Rev. E 69, 026113 (2004)
[Figure: two example partitions, Modularity = 0.650 and Modularity = 0.873]
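As a minimal sketch of how this metric can be computed, the function below evaluates the Newman-Girvan modularity of a given partition for an undirected, unweighted graph. The function name `modularity` and the edge-list input format are assumptions made for illustration; they do not come from the slides.

```python
from collections import defaultdict


def modularity(edges, community):
    """Newman-Girvan modularity of an undirected, unweighted graph.

    edges: iterable of (u, v) pairs, each undirected edge listed once.
    community: dict mapping every node to its community label.
    """
    m = 0                           # number of edges
    degree_sum = defaultdict(int)   # total degree of each community
    internal = defaultdict(int)     # edges with both ends in the community
    for u, v in edges:
        m += 1
        degree_sum[community[u]] += 1
        degree_sum[community[v]] += 1
        if community[u] == community[v]:
            internal[community[u]] += 1
    if m == 0:
        return 0.0
    # Q = sum over communities of (internal edge fraction)
    #     minus (expected fraction under random rewiring).
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)
```

For two triangles joined by a single edge, splitting the graph into the two triangles gives a modularity of 5/14, while putting all six nodes into one community gives 0.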

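Modularity optimization methods
• Methods that detect communities so that modularity is maximized
Method            Characteristic                  Processable scale
Newman+ (2004)    Modularity concept              10,000 nodes
Clauset+ (2004)   Efficient data structure        500,000 nodes
Wakita+ (2007)    Consolidation ratio             Several million nodes
Blondel+ (2008)   Local modularity optimization   100 million nodes
Shiokawa+ (2013)  Incremental aggregation         100 million nodes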

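Focus of this research
• Focus on the small-world property of social networks
• Approach the problem from the standpoint of detection efficiency, rather than by creating a new metric or improving implementation techniques
• An evaluation metric for the analysis is needed as the means of evaluation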

Louvain method: All Neighbor Selection
• Blondel, Vincent D., et al. "Fast unfolding of communities in large networks." Journal of Statistical Mechanics: Theory and Experiment 2008.10 (2008): P10008.
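The node-moving phase of the Louvain method can be sketched as follows: for each node, every neighboring community is evaluated and the node moves to the one with the largest modularity gain. This is a simplified illustration for unweighted graphs that omits the community aggregation phase; the function name and the integer-scaled gain (the true gain multiplied by 2m²) are choices made here, not details taken from the slides.

```python
from collections import defaultdict


def louvain_local_moves(edges, max_sweeps=100):
    """Node-moving phase of Louvain on an undirected, unweighted graph.

    edges: list of (u, v) pairs, each undirected edge listed once,
    without self-loops or duplicates. Returns a node -> community dict.
    """
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    m = len(edges)
    community = {v: v for v in adj}         # every node starts alone
    tot = {v: len(adj[v]) for v in adj}     # total degree per community
    for _ in range(max_sweeps):
        moved = False
        for v in adj:
            k_v, old = len(adj[v]), community[v]
            links = defaultdict(int)        # edges from v to each community
            for w in adj[v]:
                links[community[w]] += 1
            tot[old] -= k_v                 # take v out of its community
            # Gain of inserting v into c, scaled by 2*m*m to stay integral.
            best, best_gain = old, 2 * m * links[old] - tot[old] * k_v
            for c, k_in in links.items():   # all neighboring communities
                gain = 2 * m * k_in - tot[c] * k_v
                if gain > best_gain:
                    best, best_gain = c, gain
            tot[best] += k_v
            community[v] = best
            moved = moved or best != old
        if not moved:
            break
    return community
```

On two triangles joined by a single edge, this procedure places each triangle in its own community.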

Basic proposal: Neighbor Random Sampling
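The slides do not spell out the sampling rule, so the following is only a sketch under an assumption: instead of evaluating every neighboring community, the node considers the communities of at most `k` randomly sampled neighbors, plus its current community. The function name, the parameter `k`, and the data layout (`adj`, `community`, `tot`) are hypothetical.

```python
import random
from collections import defaultdict


def sampled_best_community(v, adj, community, tot, m, k=3, rng=random):
    """Choose a community for v among those of at most k sampled neighbors.

    adj: node -> collection of neighbors; community: node -> label;
    tot: label -> total degree, with v's own degree already removed
    from its current community; m: number of edges in the graph.
    Assumption: sampling is uniform over v's neighbors.
    """
    neighbours = list(adj[v])
    if len(neighbours) > k:
        neighbours = rng.sample(neighbours, k)
    candidates = {community[w] for w in neighbours} | {community[v]}
    links = defaultdict(int)                # edges from v into candidates
    for w in adj[v]:
        if community[w] in candidates:
            links[community[w]] += 1
    k_v = len(adj[v])
    # Same integer-scaled modularity gain as in the Louvain move step.
    return max(candidates, key=lambda c: 2 * m * links[c] - tot[c] * k_v)
```

The saving in this sketch is in the number of candidate communities whose gain is evaluated; counting the links still scans all neighbors of `v`.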

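Computational efficiency of modularity optimization methods
[Figure: Modularity versus number of iterations (in units of 10,000) for the basic proposal (Neighbor Random Sampling) and Louvain (All Neighbor Selection); annotated with factors of 1.5x and 2.6x]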

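Max Neighbor Selection
• Focus on the selection of neighboring communities
• What if a small number of selections could match Louvain's selection?
• Change in modularity
• Observe the latter half of the computation
– Ratio |MaxNeighbor| / |SelectedNeighbor|
– 3 neighbors: about 30% match Louvain's selection
• Refer to only the 3 max neighbors
2014/08/05
$$\Delta\mathrm{Modularity} = |E| \cdot |v \to c_i| - |v \to V| \cdot |c_i \to V|$$
[Figure: node v and its neighboring communities c1 to c6]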
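One way to read this slide is sketched below, with two assumptions that are not confirmed by the source: the "max neighbors" are the `k` neighboring communities that receive the most edges from v, and |E| in the ΔModularity expression is taken as the sum of all degrees (twice the number of edges), which makes the expression proportional to the usual Louvain gain. All function and parameter names are hypothetical.

```python
from collections import Counter


def delta_modularity(two_m, links_v_c, degree_v, degree_c):
    """Slide formula: |E| * |v -> c_i| - |v -> V| * |c_i -> V|.

    Assumption: |E| is passed as two_m, the sum of all node degrees.
    """
    return two_m * links_v_c - degree_v * degree_c


def max_neighbor_choice(v, adj, community, tot, two_m, k=3):
    """Evaluate only the k communities with the most edges from v.

    adj: node -> sized iterable of neighbors; community: node -> label;
    tot: label -> total degree (v's own degree excluded from its
    current community).
    """
    links = Counter(community[w] for w in adj[v])
    if not links:
        return community[v]                 # isolated node stays put
    degree_v = len(adj[v])
    top = links.most_common(k)              # k most-connected communities
    best_label, _ = max(
        top,
        key=lambda item: delta_modularity(two_m, item[1], degree_v,
                                          tot[item[0]]),
    )
    return best_label
```

Because only `k` candidates are examined, the result can differ from the all-neighbor choice when the best community is not among the most-connected ones, which is consistent with the partial match rate reported on the slide.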

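Changed Neighbor Selection
• Focus on the selection of nodes
• Many wasted computations exist
– Nodes that do not move
• Ideally
– Identify only the nodes that actually move
• Select only the nodes whose neighboring communities have changed
– Identify the movable nodes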
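The idea of evaluating only nodes whose neighborhood has changed can be sketched with a work queue: a node is evaluated once at the start and is re-queued only when one of its neighbors moves. This is an illustrative reading of the slide, not the author's implementation; the function name and the returned evaluation counter are additions made here.

```python
from collections import defaultdict, deque


def changed_neighbor_local_moves(edges):
    """Local moving that re-evaluates a node only after a neighbor moved.

    edges: list of (u, v) pairs, each undirected edge listed once,
    without self-loops or duplicates.
    Returns (node -> community dict, number of node evaluations).
    """
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    m = len(edges)
    community = {v: v for v in adj}
    tot = {v: len(adj[v]) for v in adj}     # total degree per community
    queue = deque(adj)                      # every node is checked once
    queued = set(adj)
    evaluations = 0
    while queue:
        v = queue.popleft()
        queued.discard(v)
        evaluations += 1
        k_v, old = len(adj[v]), community[v]
        links = defaultdict(int)
        for w in adj[v]:
            links[community[w]] += 1
        tot[old] -= k_v
        best, best_gain = old, 2 * m * links[old] - tot[old] * k_v
        for c, k_in in links.items():
            gain = 2 * m * k_in - tot[c] * k_v
            if gain > best_gain:
                best, best_gain = c, gain
        tot[best] += k_v
        community[v] = best
        if best != old:
            # Only neighbors of a moved node see a changed neighborhood.
            for w in adj[v]:
                if w not in queued:
                    queued.add(w)
                    queue.append(w)
    return community, evaluations
```

Each move strictly increases modularity, so the queue eventually empties. Nodes in parts of the graph that have already settled are never revisited.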

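Hybrid Heuristic
• Threshold for comparison: 3 neighbors
• 3 or fewer neighbors
– Louvain method
• More than 3 neighbors
– First half of the computation: Neighbor Random Sampling
– Second half of the computation: Max Neighbor Selection
– Throughout the computation: Changed Neighbor Selection
• Switching between Neighbor Random Sampling and Max Neighbor Selection
– Slope of the computational efficiency curve
[Figure: Modularity versus number of iterations (in units of 10,000) for the basic proposal (Neighbor Random Sampling) and Louvain (All Neighbor Selection)]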
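The switching rule could look roughly like the sketch below. The slide says only that the slope of the efficiency curve decides the switch, so the window size, the slope threshold, and the strategy names are hypothetical values chosen for illustration. Changed Neighbor Selection, which the slide applies throughout, would decide which nodes are evaluated at all and is not part of this function.

```python
def choose_strategy(degree, history, window=3, slope_threshold=1e-3):
    """Pick a neighbor-selection strategy for one node.

    degree: number of neighbors of the node.
    history: list of (iteration_count, modularity) samples, oldest first,
    with strictly increasing iteration counts.
    window and slope_threshold are illustrative assumptions.
    """
    if degree <= 3:
        return "all_neighbor_selection"      # plain Louvain for small degree
    if len(history) <= window:
        return "neighbor_random_sampling"    # too early to estimate a slope
    (x0, q0), (x1, q1) = history[-1 - window], history[-1]
    if x1 <= x0:
        raise ValueError("iteration counts must be strictly increasing")
    slope = (q1 - q0) / (x1 - x0)            # modularity gained per iteration
    if slope >= slope_threshold:
        return "neighbor_random_sampling"    # curve still rising quickly
    return "max_neighbor_selection"          # curve has flattened out
```

Random sampling is kept while modularity is still rising quickly, and the more selective rule takes over once the curve flattens.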

Experiments
• Experiment contents
– Modularity is almost the same
– Computational efficiency
– Time
• Execution environment:
– Tsubame interactive node
• 6GB RAM, Intel Xeon CPU X5670 2.93GHz

Datasets
• Datasets
              Web-Google   DBLP        Youtube     Pokec
|V|           875,713      317,080     1,134,890   1,632,803
|E|           5,105,039    1,049,866   2,987,624   30,622,564
(2*|E|)/|V|   11           7           5           37.5
– Synthetic data
              Small World [1]   Scale Free [2]   Scale Free [3]
|V|           1,000,000         10,000           1,000,000
|E|           40,000,000        99,970           1,999,998
(2*|E|)/|V|   80                19               39
• [1] Duncan J. Watts and Steven H. Strogatz. Collective dynamics of 'small-world' networks. Nature, 393(6684):440-442, 1998.
• [2] Albert-László Barabási and Réka Albert. Emergence of scaling in random networks. Science, 286(5439):509-512, 1999.

Computational efficiency - Pokec

Computational efficiency - DBLP

Computational efficiency - Web-Google

Computational efficiency - Youtube

Computational efficiency - Small world

Computational efficiency - Scale free

Computational efficiency - Scale free

Time comparison
                                     Web-Google   DBLP    Youtube   Pokec    Smallworld
Louvain method                       25.7         7.49    13.98     80.27    22.49
Neighbor Random Sampling Heuristic   31.63        19.45   48.26     222.87   90.93
Max Neighbor Heuristic               17.35        10.99   19.76     202.05   43.39
Changed Neighbor Heuristic           23.31        13.44   51.31     223.98   86.64
Hybrid Heuristic                     16.28        7.51    24.5      217.09   115.82

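Summary
• Proposed the possibility of exploiting the small-world property
• Demonstrated that possibility through a comparison of computational efficiency
• Future work
– Implement a fast program based on the ideas of this research
– Examine how to identify nodes in the convergence phase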

References
• Watts, Duncan J. Six Degrees: The Science of a Connected Age. W. W. Norton & Company, 2003.
• Newman, Mark EJ, and Michelle Girvan. "Finding and evaluating community structure in networks." Physical review E 69.2 (2004): 026113.
• Newman, Mark EJ. "Fast algorithm for detecting community structure in networks." Physical review E 69.6 (2004): 066133.
• Clauset, Aaron, Mark EJ Newman, and Cristopher Moore. "Finding community structure in very large networks." Physical review E 70.6 (2004): 066111.
• Wakita, Ken, and Toshiyuki Tsurumi. "Finding community structure in mega-scale social networks: [extended abstract]." Proceedings of the 16th international conference on World Wide Web. ACM, 2007.
• Blondel, Vincent D., et al. "Fast unfolding of communities in large networks." Journal of Statistical Mechanics: Theory and Experiment 2008.10 (2008): P10008.
• Shiokawa, Hiroaki, Yasuhiro Fujiwara, and Makoto Onizuka. "Fast Algorithm for Modularity-based Graph Clustering." Twenty-Seventh AAAI Conference on Artificial Intelligence. 2013.
• Bhowmick, Sanjukta, and Sriram Srinivasan. "A Template for Parallelizing the Louvain Method for Modularity Maximization." Dynamics On and Of Complex Networks, Volume 2. Springer New York, 2013. 111-124.
• Staudt, Christian L., and Henning Meyerhenke. "Engineering High-Performance Community Detection Heuristics for Massive Graphs." Parallel Processing (ICPP), 2013 42nd International Conference on. IEEE, 2013.
• Prat-Pérez, Arnau, David Dominguez-Sal, and Josep-Lluis Larriba-Pey. "High quality, scalable and parallel community detection for large real graphs." Proceedings of the 23rd international conference on World wide web. International World Wide Web Conferences Steering Committee, 2014.
• Onsjö, Mikael, and Osamu Watanabe. "A simple message passing algorithm for graph partitioning problems." Algorithms and Computation. Springer Berlin Heidelberg, 2006. 507-516.

Example of community detection (1)
• Data: call records of a Belgian mobile company
• Node: customer; Edge: whether or not a call took place
• Community structure + scale-free property
• Result: French + Dutch
• From a sociological standpoint
– Linguistic, ethnic, and religious cohesion and fragility become visible
• Blondel, Vincent D., et al. "Fast unfolding of communities in large networks." Journal of Statistical Mechanics: Theory and Experiment 2008.10 (2008): P10008.

Example of community detection (2)
• Du, Nan, et al. "Community detection in large-scale social networks." Proceedings of the 9th WebKDD and 1st SNA-KDD 2007 workshop on Web mining and social network analysis. ACM, 2007.

Compute or not?
Which communities are referenced?
[Diagram labels: Louvain, All Neighbor Community, Not Changed, Changed, k-Random Selection, k-Max Selection, k-(Random, Max) Selection, max ΔModularity]

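Identifying nodes
• Ideally,
– the community a node should belong to would be known at once
– the nodes that move in each iteration would be known
– even if not for the whole computation, knowing which nodes should move in the convergence phase alone would suffice
– at least the movable nodes would be known
• However, relying on movable nodes has a problem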

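Problem with Changed Neighbor Selection
• When the surroundings have changed only once
– The node stays inside some community
– However, the community it moved to from the initial state is not necessarily the community it should finally belong to
$$\Delta\mathrm{Modularity} = |E| \cdot |v \to c_i| - |v \to V| \cdot |c_i \to V|$$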

New Algorithm?
[Figure: two diagrams of node V and non-V nodes (!V), one with communities C1 to C3 and one with communities C1 to C4]

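New Algorithm?
• If only the nodes that move in each iteration were known
• Like a chain reaction, could we start from a single node and then move, one after another, the nodes that become movable?
$$\mathit{NeighborChanged} = \mathrm{true} \Rightarrow \exists\, \Delta\mathrm{Modularity} \ge 0$$
$$\mathit{NeighborChanged} = \mathrm{false} \Rightarrow \neg(\Delta\mathrm{Modularity} \ge 0)$$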

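Vision after entering the doctoral program
• Discovering new phenomena in specific social networks
• Large-scale network analysis using community detection methods
– Spread of malicious comments and rumors, event detection, etc.
• Extending the application areas of community detection methods
– Recommendation systems, prediction systems, etc.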

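Motivation
• I am strongly interested not only in social network analysis from an engineering standpoint, but also in how relationships between people form and change through networks, and in people's behavior and ways of thinking (for example, the relation between corporate business relationships and growth strategy, or malicious comments).
• In short, in a modern society where networks and human society are closely intertwined, I hope to do research that improves the affinity between networks and people.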

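Motivation
• I like research. During my master's program I worked on various research topics, but without notable results, and I think I still have many shortcomings; for me the emphasis seems to have been on accumulating knowledge. By entering the doctoral program I want to broaden my perspective, pursue the research I carried out in my master's program in more depth, and find an environment where I can study and do research widely, beyond my current field.
• I want to become a researcher. Of course, I also dream of someday founding a company based on my own research results. With my current abilities I think I am still quite far from this goal. Through the training of the doctoral program, I want to move closer to my goal in research ability, perspective, way of thinking, and other respects.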
