Proposal of an Efficient Multi-Keyword Search Method Using Skip Graph and Bloom Filter in P2P Networks

Graduate School for Creative Cities, Osaka City University
岩本大記, 石橋勇人, 安倍広多, 松浦敏雄

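Background
Peer-to-Peer networks: fault tolerance and scalability.
Distributed hash tables (DHT): when a large number of documents are managed in a distributed manner, an AND search cannot be run over the full text of the documents.
Proposal: use a Skip Graph and Bloom Filters to run AND searches efficiently over the full text of the documents held by the nodes.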

Skip Graph [Structure]
(Figure: a Skip Graph over the keys 2, 3, 5, 6, 7, 8 and 9, drawn at Level 0, Level 1 and Level 2. Each node carries a key and a membership vector.)

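Skip Graph [Effect of the membership vector radix]
(Figure: two Skip Graphs, one with binary membership vectors and one with base-5 membership vectors, drawn over Level 0 to Level 4.)
The height is log_w N on average (w = radix of the membership vector, N = number of nodes).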
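As a rough illustration of how the levels arise, the sketch below groups nodes by membership-vector prefix: at level i, the nodes that share the first i digits of their membership vector form one list. The three-digit vectors assigned to the keys are hypothetical values chosen for this example, not the ones in the figure, and `build_levels` and `expected_height` are illustrative names.

```python
import math
from collections import defaultdict


def build_levels(vectors):
    """Group keys by membership-vector prefix, one dict per level.

    vectors maps key -> membership vector (tuple of digits).
    Level i maps each i-digit prefix to the sorted keys sharing it.
    """
    levels = []
    max_len = max(len(mv) for mv in vectors.values())
    for i in range(max_len + 1):
        groups = defaultdict(list)
        for key, mv in vectors.items():
            groups[mv[:i]].append(key)
        levels.append({prefix: sorted(keys) for prefix, keys in groups.items()})
        # Stop once every node is alone in its list.
        if all(len(keys) == 1 for keys in groups.values()):
            break
    return levels


def expected_height(num_nodes, radix):
    """Average height log_w N stated on the slide."""
    return math.log(num_nodes, radix)


# Hypothetical binary membership vectors for the keys used in the slides.
vectors = {
    2: (0, 0, 0), 3: (1, 1, 0), 5: (0, 1, 0), 6: (1, 0, 1),
    7: (0, 1, 1), 8: (1, 0, 0), 9: (0, 0, 1),
}
levels = build_levels(vectors)
for i, level in enumerate(levels):
    print(f"Level {i}: {sorted(level.values())}")

for w in (2, 5):
    print(f"w={w}: log_w(1000) = {expected_height(1000, w):.2f}")
```

With 1000 nodes the formula gives roughly 10 levels for w = 2 and roughly 4.3 for w = 5, which is the trade-off the two figures contrast.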

Bloom Filter
(Figure: an 8-bit Bloom Filter with bit positions 0 to 7. Three hash functions, hash1(), hash2() and hash3(), map each of the words apple, banana, grape and melon to bit positions. The example filter holds apple and banana, and the figure illustrates a false positive.)
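A minimal Bloom Filter sketch, assuming an 8-bit filter and three hash functions as in the figure. The slides do not say which hash functions are used; deriving them from salted SHA-256 digests is an assumption made here so the example is self-contained, and the class and method names are illustrative.

```python
import hashlib


class BloomFilter:
    """Bit-array membership filter: false positives possible, no false negatives."""

    def __init__(self, num_bits=8, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as the bit array

    def _positions(self, word):
        # Assumption: the i-th hash function is SHA-256 salted with i.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{word}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, word):
        for pos in self._positions(word):
            self.bits |= 1 << pos

    def might_contain(self, word):
        # True means "possibly present"; False means "definitely absent".
        return all((self.bits >> pos) & 1 for pos in self._positions(word))


bf = BloomFilter()
for word in ("apple", "banana"):
    bf.add(word)

print(format(bf.bits, "08b"))
for word in ("apple", "banana", "grape", "melon"):
    print(word, bf.might_contain(word))
```

With only 8 bits, a word that was never added can easily report `True`; that is the false positive the figure points out.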

Bloom Filter aggregation
(Figure: the Fruits Bloom Filter (apple, strawberry, grape, banana) and the Vegetables Bloom Filter (cabbage, pumpkin, cucumber, carrot) are combined by bitwise OR into a Fruits + Vegetables Bloom Filter covering all eight words. Bit rows in the figure: 01001101 OR 00100011 = 01101111.)
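The aggregation step can be sketched with plain integers as bit arrays. The two input rows are the ones printed in the figure; treating the first as the Fruits filter and the second as the Vegetables filter is an assumption based on their order, and `aggregate` is an illustrative name.

```python
def aggregate(*filters):
    """Combine Bloom Filters of equal length with a bitwise OR."""
    combined = 0
    for f in filters:
        combined |= f
    return combined


fruits = 0b01001101      # assumed to be the Fruits row of the figure
vegetables = 0b00100011  # assumed to be the Vegetables row of the figure

combined = aggregate(fruits, vegetables)
print(format(combined, "08b"))  # prints 01101111
```

Any word found in either input filter is also found in the combined filter, because every bit set in an input stays set after the OR. The price is a denser filter and therefore more false positives.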

Proposed method [Structure]
(Figure: a Skip Graph over the keys 2, 3, 5, 6, 7, 8 and 9 at Level 0 and Level 1. Labels in the figure: pointer, document, Bloom Filter.)

Proposed method [Structure]
(Figure: a Skip Graph over the keys 2, 3, 5, 6, 7, 8 and 9, drawn from Level 0 to Level 3.)

AND search with Bloom Filters
Searching for apple AND banana:
(Figure: the apple Bloom Filter (01000001) and the banana Bloom Filter (10100000) are combined by bitwise OR into 11100001, which is shown next to the Fruits Bloom Filter (11101011).)
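One way to read this figure: OR the keyword filters into a single query filter, then test whether every bit of the query is also set in the target filter. The sketch below uses the bit rows from the figure; the assignment of rows to apple, banana and Fruits follows their order on the slide and is an assumption, the Vegetables row is a made-up counterexample, and the function names are illustrative.

```python
def query_filter(*keyword_filters):
    """OR the per-keyword filters into one filter for the AND query."""
    query = 0
    for f in keyword_filters:
        query |= f
    return query


def may_match(query, target):
    """True if every bit set in the query is also set in the target filter."""
    return query & target == query


apple = 0b01000001
banana = 0b10100000
fruits = 0b11101011
vegetables = 0b00100011  # hypothetical filter, not taken from this slide

query = query_filter(apple, banana)
print(format(query, "08b"))          # prints 11100001
print(may_match(query, fruits))      # prints True
print(may_match(query, vegetables))  # prints False
```

A `True` result only means the target may contain both keywords; as with any Bloom Filter, the documents still have to be checked to rule out a false positive.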

Proposed method [Search]
Search for apple AND banana: O(log_w N)
w = radix of the membership vector, N = number of nodes
(Figure: the search path through a Skip Graph over the keys 2, 3, 5, 6, 7, 8 and 9, Level 0 to Level 3.)

Proposed method [Bloom Filter update]
O(w log_w N)
w = radix of the membership vector, N = number of nodes
(Figure: the propagation of a Bloom Filter update through a Skip Graph over the keys 2, 3, 5, 6, 7, 8 and 9, Level 0 to Level 3.)

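Evaluation by simulation
For multi-keyword search with the proposed method, we measured:
1. the search hop count
2. the number of update messages sent when a Bloom Filter is updated
Measurement conditions
Number of nodes: 100 to 1000
Radix of the membership vector: 2 to 5
Documents: 100 text files, each made of 300 words drawn at random from 5,800 basic English words
Each node holds 1 to 5 documents, chosen at random
Bloom Filter length: 10240 bits
Hash functions: 4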
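To get a feel for these parameters, the sketch below applies the standard approximation for the false-positive rate of a Bloom Filter, (1 - e^(-kn/m))^k, with m = 10240 bits and k = 4 hash functions. This estimate is not part of the slides. It treats all 300 words of every document as distinct, so it is an upper estimate for a filter that covers a given number of documents; `false_positive_rate` is an illustrative name.

```python
import math


def false_positive_rate(num_bits, num_hashes, num_items):
    """Standard approximation (1 - e^(-k*n/m))^k for a Bloom Filter."""
    return (1.0 - math.exp(-num_hashes * num_items / num_bits)) ** num_hashes


NUM_BITS = 10240      # Bloom Filter length from the measurement conditions
NUM_HASHES = 4        # number of hash functions from the measurement conditions
WORDS_PER_DOC = 300   # words per document, all assumed distinct

for docs in range(1, 6):
    rate = false_positive_rate(NUM_BITS, NUM_HASHES, WORDS_PER_DOC * docs)
    print(f"{docs} document(s): about {rate:.5f}")
```

Under these assumptions a single document gives a rate of roughly 0.00015, and the rate grows as more documents are folded into one filter.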

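Evaluation [Search hop count versus number of nodes]
(Plot: search hop count against the number of nodes N, one curve each for membership vectors of radix 2, 3, 4 and 5.)
Condition: exactly one matching document exists in the whole network.
O(log_w N)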

Proposed method [Search hop count]
(Figure: the search path through a Skip Graph over the keys 2, 3, 5, 6, 7, 8 and 9, Level 0 to Level 3.)

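Evaluation [Search hop count as the number of matching documents changes]
(Plot: search hop count against the percentage of documents that match (%), one curve each for membership vectors of radix 2, 3, 4 and 5.)
Condition: 500 nodes, with the percentage of matching documents varied.
The hop count hardly increases.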

Evaluation [Hop count on update]
(Plot: message hop count on update against the number of nodes, one curve each for membership vectors of radix 2, 3, 4 and 5.)
O(w log_w N)

Proposed method [Bloom Filter update hop count]
(Figure: the propagation of a Bloom Filter update through a Skip Graph over the keys 2, 3, 5, 6, 7, 8 and 9, Level 0 to Level 3.)

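Evaluation [Hop count on update]
(Plot: message hop count on update against the number of nodes, one curve each for membership vectors of radix 2, 3, 4 and 5.)
The larger the radix, the higher the update cost.
O(w log_w N)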

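Summary
The proposed method achieves efficient multi-keyword search.
An AND search runs in O(log_w N) time (w = radix of the membership vector, N = number of nodes).
The search time does not depend on the number of search keywords or on the number of documents.
Future work
Extend the method to support queries that combine AND and OR.
Set an upper limit on the number of results when a large number of documents match.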
