The document proposes using an artificial bee colony (ABC) algorithm for community detection in networks. It begins with an introduction to the community detection problem and how it can be formulated as an optimization problem. It then describes the ABC algorithm and how solutions are represented. The ABC phases of initialization, employed bees, onlooker bees, and scout bees are explained. Various quality measures that can be optimized are provided. Experimental results on real social networks such as the Zachary karate club and bottlenose dolphin networks show that the ABC approach achieves high normalized mutual information and modularity scores. A case study on Facebook data demonstrates the ability to detect communities without prior knowledge of their number. The conclusions state that ABC is effective for community detection and that future work could focus on accuracy and scalability.
COMMUNITY DETECTION USING ARTIFICIAL BEE COLONY SWARM OPTIMIZATION
1. NETWORKS COMMUNITY DETECTION USING ARTIFICIAL BEE COLONY SWARM OPTIMIZATION
Ahmed Ibrahem Hafez, Hossam M. Zawbaa,
Aboul Ella Hassanien, Aly A. Fahmy and
Vaclav Snasel
http://www.egyptscience.net
The 5th International Conference on Innovations in Bio-Inspired Computing and Applications, June 23-25, 2014
4. Community Detection Problem
• A community is a group of individuals who interact with each other more frequently than with those outside the group.
• Also known as a group, cluster, or module.
• Applications:
• Network compression and visualization of huge networks.
• Facilitating other social network analysis (SNA) tasks.
• Social studies: understanding the interactions between people.
5. Community Detection Problem
• A social network can be modeled as a graph G = (V, E) consisting of:
• n nodes (actors).
• m edges (interactions).
• Community detection divides the network into k communities.
• Community detection can be treated as an optimization problem:
• Given a quality measure of communities $F(S)$,
• find a community structure $S$ that maximizes or minimizes $F(S)$.
7. Artificial Bee colony optimization ABC
• Artificial bee colony optimization:
• A swarm-based meta-heuristic that simulates the foraging behavior of honey bees.
• Contains three types of bees, reflecting the division of labor:
• Employed, onlooker, and scout bees.
• Employed bees try to find food sources and advertise them.
• Onlooker bees follow the employed bees whose food sources interest them.
• Scout bees fly spontaneously to explore new food sources.
3. Proposed (ABC) algorithm.
8. Solution Representation
• Solution representation: locus-based adjacency representation.
• Each food source $x = \{x_1, x_2, \ldots, x_n\}$ consists of $n$ elements.
• A value j assigned to the i-th element means that nodes i and j are in the same community.
• Decoding into communities takes linear time.
• No prior knowledge of the number of communities is needed.
• Example:
• $x = (1, 4, 1, 4, 3, 6, 6, 6)$ is decoded into 3 groups.
[Figure: example network with its nodes partitioned into Group 1, Group 2, and Group 3]
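The decoding step can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the example vector on this slide reads x = (1, 4, 1, 4, 3, 6, 6, 6) with 1-indexed node ids, and it uses a union-find structure (a choice made here, not stated in the slides) to merge nodes i and x_i in near-linear time.

```python
def decode(x):
    """Decode a locus-based adjacency vector into communities.

    x[i] = j (1-indexed value) means node i+1 and node j share a community.
    """
    n = len(x)
    parent = list(range(n))  # union-find forest over 0-indexed nodes

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path halving
            a = parent[a]
        return a

    # Merge every node with the node its element points to.
    for i, j in enumerate(x):
        ri, rj = find(i), find(j - 1)
        if ri != rj:
            parent[rj] = ri

    # Collect nodes by root and report them with 1-indexed ids.
    groups = {}
    for node in range(n):
        groups.setdefault(find(node), []).append(node + 1)
    return sorted(groups.values())


x = [1, 4, 1, 4, 3, 6, 6, 6]   # example vector as read from the slide
communities = decode(x)
print(communities)  # [[1, 3, 5], [2, 4], [6, 7, 8]]
```

The number of groups falls out of the decoding itself, which is why the representation needs no prior knowledge of the number of communities.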
9. ABC Phases Details
• Initialization phase:
• All vectors $x^r$ in the population of food sources are initialized by scout bees using $x_i^r = RC(i)$.
• $RC(i)$ is a random function that selects a node j at random from the neighbors of node i.
• Employed bees phase:
• Employed bees search for new food sources $v^r$ that have more nectar.
• Set $v^r = x^r$.
• $v_i^r = x_i^k$, where $x^k$ is a randomly selected food source and i is a randomly chosen parameter index.
• Greedy selection is applied between $v^r$ and $x^r$.
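The initialization and employed bees phases can be sketched as below. This is an illustration under stated assumptions rather than the authors' code: the graph is given as 0-indexed adjacency lists, the partner source k is drawn from the other sources (k != r), fitness is maximized, and `toy_fitness` is a hypothetical stand-in for a real quality measure.

```python
import random


def init_food_source(adj, rng):
    """Initialization rule: x[i] = RC(i), a random neighbor of node i."""
    return [rng.choice(adj[i]) for i in range(len(adj))]


def employed_bee_step(population, r, fitness, rng):
    """Try to improve food source r; return True if the candidate was kept."""
    x_r = population[r]
    # Assumption: the partner source k is drawn from the other sources (k != r).
    k = rng.choice([s for s in range(len(population)) if s != r])
    i = rng.randrange(len(x_r))   # randomly chosen parameter index
    v_r = list(x_r)               # v^r = x^r
    v_r[i] = population[k][i]     # v_i^r = x_i^k
    # Greedy selection, assuming fitness is to be maximized.
    if fitness(v_r) > fitness(x_r):
        population[r] = v_r
        return True
    return False


# Toy graph (0-indexed adjacency lists): two triangles joined by edge 2-3.
adj = [[1, 2], [0, 2], [0, 1, 3], [2, 4, 5], [3, 5], [3, 4]]


def toy_fitness(x):
    # Hypothetical stand-in for a real quality measure: counts mutual links.
    return sum(1 for i, j in enumerate(x) if x[j] == i)


rng = random.Random(0)
population = [init_food_source(adj, rng) for _ in range(5)]
before = [toy_fitness(x) for x in population]
for r in range(len(population)):
    employed_bee_step(population, r, toy_fitness, rng)
after = [toy_fitness(x) for x in population]
```

Because every food source stores a neighbor of node i at position i, copying position i from another source keeps the candidate a valid locus-based solution.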
10. ABC Phases Details
• Onlooker bees phase:
• Onlooker bees select a food source by watching the dances of the employed bees and try to improve that source.
• A probabilistic approach is used to choose one of the food sources; the onlooker then follows its employed bee, i.e. tries to improve the source using the same process as in the employed bees phase.
• The probability value $p_r$ of a food source $x^r$ is calculated as $p_r = \frac{0.9 \cdot \mathit{fit}_r(x^r)}{\mathit{fit}_{best}} + 0.1$.
• Scout bees phase:
• Scout bees employ a random flying pattern to discover new food sources, replacing an abandoned food source with a new one.
• The new food source is generated using $x_i^r = RC(i)$, as in the initialization phase.
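A rough sketch of the onlooker selection and scout replacement follows. Several details are assumptions, since the slides do not state them: fitness values are taken to be positive, the onlooker scans the sources cyclically and accepts source r with probability p_r (a common ABC convention), and a source counts as abandoned once a hypothetical trial counter exceeds a `limit` parameter.

```python
import random


def selection_probabilities(fitness_values):
    """p_r = 0.9 * fit_r / fit_best + 0.1 (assumes positive fitness values)."""
    best = max(fitness_values)
    return [0.9 * f / best + 0.1 for f in fitness_values]


def pick_food_source(probs, rng, start=0):
    """Scan sources cyclically; accept source r with probability p_r."""
    r = start
    while True:
        if rng.random() < probs[r]:
            return r
        r = (r + 1) % len(probs)


def scout_phase(population, trials, adj, limit, rng):
    """Replace sources whose trial counter exceeds `limit` (assumed criterion)."""
    for r in range(len(population)):
        if trials[r] > limit:
            # Same rule as initialization: x[i] = RC(i).
            population[r] = [rng.choice(adj[i]) for i in range(len(adj))]
            trials[r] = 0


probs = selection_probabilities([1.0, 2.0, 0.5])  # roughly [0.55, 1.0, 0.325]

rng = random.Random(1)
adj = [[1, 2], [0, 2], [0, 1]]        # toy triangle, 0-indexed
population = [[1, 0, 0], [2, 2, 1]]
trials = [0, 10]                      # source 1 has stalled
scout_phase(population, trials, adj, limit=5, rng=rng)
chosen = pick_food_source(probs, rng)
```

The constant 0.1 keeps every source selectable, while the best source always receives probability 1.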
11. Objectives (Quality measures)
• Many community definitions have been proposed.
• To be minimized:
• Conductance: the fraction of total edge volume that points outside the community.
• Expansion: the number of edges per node that point outside the community.
• Internal Density: the internal edge density of the community.
• Cut Ratio: the fraction of all possible edges leaving the community.
• Normalized Cut: the normalized fraction of edges leaving the community.
• Maximum-ODF (Out Degree Fraction): the maximum fraction of edges of a node pointing outside the community.
• Average-ODF: the average fraction of a node's edges pointing outside the community.
• Flake-ODF: the fraction of nodes that have fewer edges pointing inside than outside the community.
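Three of these measures can be sketched for a single community as below. The slides give only verbal definitions, so the formulas are the commonly used ones and are an assumption here: with n_S nodes and m_S internal edges in community S, c_S boundary edges, and n nodes in the graph, conductance is c_S / (2 m_S + c_S), expansion is c_S / n_S, and cut ratio is c_S / (n_S (n - n_S)). The graph is assumed undirected and given as an edge list.

```python
def edge_counts(edges, community):
    """Return (m_s, c_s): edges inside the community and edges crossing it."""
    S = set(community)
    m_s = sum(1 for u, v in edges if u in S and v in S)
    c_s = sum(1 for u, v in edges if (u in S) != (v in S))
    return m_s, c_s


def conductance(edges, community):
    m_s, c_s = edge_counts(edges, community)
    return c_s / (2 * m_s + c_s)   # undefined if the community has no edges


def expansion(edges, community):
    _, c_s = edge_counts(edges, community)
    return c_s / len(set(community))


def cut_ratio(edges, community, n):
    _, c_s = edge_counts(edges, community)
    n_s = len(set(community))
    return c_s / (n_s * (n - n_s))  # undefined if the community is the whole graph


# Toy graph: two triangles joined by the single edge (2, 3).
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
S = {0, 1, 2}
print(conductance(edges, S), expansion(edges, S), cut_ratio(edges, S, n=6))
```

For the triangle {0, 1, 2} there are 3 internal edges and 1 boundary edge, so the three values are 1/7, 1/3, and 1/9. A whole partition would typically be scored by aggregating such per-community values, which the slides do not detail.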
12. Objectives (Quality measures)
• Many community definitions have been proposed.
• To be maximized:
• Modularity: the number of within-community edges, relative to a null model of a random graph with the same degree distribution.
• Community Score: the density of sub-matrices, based on volume and row/column means.
• Community Fitness: the ratio between the total internal degree of the nodes belonging to a community and the sum of the total internal and external degrees of those nodes.
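Modularity and community fitness can be sketched as follows for an undirected, unweighted edge list. The modularity formula is the standard one, Q = sum over communities of l_c/m - (d_c/(2m))^2, where l_c is the number of internal edges and d_c the total degree of community c. The fitness function follows the verbal definition on the slide directly and omits any tuning exponent, which is an assumption.

```python
def modularity(edges, communities):
    """Q = sum_c [ l_c / m - (d_c / (2 m))^2 ] for an undirected edge list."""
    m = len(edges)
    label = {node: c for c, nodes in enumerate(communities) for node in nodes}
    internal = [0] * len(communities)   # l_c: edges inside community c
    degree = [0] * len(communities)     # d_c: total degree of community c
    for u, v in edges:
        degree[label[u]] += 1
        degree[label[v]] += 1
        if label[u] == label[v]:
            internal[label[u]] += 1
    return sum(l / m - (d / (2 * m)) ** 2 for l, d in zip(internal, degree))


def community_fitness(edges, community):
    """Internal degree divided by internal plus external degree."""
    S = set(community)
    k_in = 2 * sum(1 for u, v in edges if u in S and v in S)
    k_out = sum(1 for u, v in edges if (u in S) != (v in S))
    return k_in / (k_in + k_out)


# Toy graph: two triangles joined by the single edge (2, 3).
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
q_split = modularity(edges, [[0, 1, 2], [3, 4, 5]])    # 5/14, about 0.357
q_single = modularity(edges, [[0, 1, 2, 3, 4, 5]])     # 0.0
fit = community_fitness(edges, [0, 1, 2])              # 6/7
```

Splitting the toy graph into its two triangles scores higher than keeping it as one community, which is the behavior an optimizer maximizing modularity exploits.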
13. Dataset used for the experiment
4. Experimental Results.
• Real social networks:
• The Zachary Karate Club network: consists of 34 vertices and 78 edges. The network is divided into two groups of almost the same size.
• The Bottlenose Dolphin network: consists of 62 bottlenose dolphins. The network splits naturally into two large groups.
• The American College Football network: represents football games between American colleges during the regular season in Fall 2000. The network is divided into 12 conferences.
• Performance measures:
• Normalized Mutual Information (NMI): used to measure the similarity between the true community structures and the detected ones.
• Modularity: a popular quality measure of community structures, computed here for the detected community structures.
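NMI between a true and a detected partition can be sketched as below. The slides do not say which normalization is used; this sketch assumes the form common in community detection, NMI = 2 I(A; B) / (H(A) + H(B)), with each partition given as a list of per-node labels.

```python
import math
from collections import Counter


def nmi(labels_a, labels_b):
    """Normalized mutual information, 2 * I(A;B) / (H(A) + H(B))."""
    n = len(labels_a)
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    joint = Counter(zip(labels_a, labels_b))
    # Mutual information from the joint and marginal label counts.
    mutual = sum(
        (n_ab / n) * math.log(n_ab * n / (count_a[a] * count_b[b]))
        for (a, b), n_ab in joint.items()
    )
    h_a = -sum((c / n) * math.log(c / n) for c in count_a.values())
    h_b = -sum((c / n) * math.log(c / n) for c in count_b.values())
    if h_a + h_b == 0:
        return 1.0  # both partitions put every node in a single community
    return 2 * mutual / (h_a + h_b)


same = nmi([0, 0, 1, 1], [1, 1, 0, 0])         # same split, labels renamed
independent = nmi([0, 0, 1, 1], [0, 1, 0, 1])  # unrelated splits
```

The score depends only on how nodes are grouped, not on label names, so a detected partition that matches the ground truth scores 1 and an unrelated one scores 0.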
14. Summary of NMI values of the result for the real social network.
Result for Real Social Network
[Figure: NMI (y-axis, 0 to 1) for the Zachary Karate, Bottlenose Dolphin, and American College football networks, with one series per objective: AverageODF, Conductance, CutRatio, Expansion, Fitness, FlakeODF, InternalDensity, Max-ODF, Modularity, NormalizedCut, Score]
15. Summary of Modularity values of the result for the real social network.
[Figure: Modularity value (y-axis, 0 to 0.6) for the Zachary Karate, Bottlenose Dolphin, and American College football networks, with one series per objective: AverageODF, Conductance, CutRatio, Expansion, Fitness, FlakeODF, InternalDensity, Max-ODF, Modularity, NormalizedCut, Score, plus Original]
16. Example : best result for The Zachary Karate Club
[Figure panels: Modularity; Score - Fitness]
17. Example : best result for The Zachary Karate Club
[Figure panels: Conductance; Average-ODF]
18. Example : best result for Bottlenose Dolphin Network
[Figure panels: Modularity; Score]
19. Example : best result for Bottlenose Dolphin Network
[Figure panels: Conductance; Fitness]
20. Case Study : Facebook dataset
• Online social network: a platform for building social networks and social relations among people.
• People share interests, activities, backgrounds, or real-life connections.
• Online communities form as users group together around common interests.
• Facebook dataset: an undirected social network containing 3959 nodes and 84243 edges.
• There is no clear community structure for the network.
• The network is studied in terms of the Modularity quality measure only.
21. Case Study : Facebook dataset
[Figure: Modularity values for the Facebook dataset, axis range 0.4 to 0.75]
22. Conclusions
• Artificial bee colony (ABC) is an optimization technique that works effectively for the community detection problem.
• Performance is directly influenced by the objective quality function used in the optimization process.
• Experimental results are promising for the proposed algorithm.
• The best results were obtained using the Modularity, Score, and Fitness objectives.
• The algorithm detects the number of communities automatically.
• Future work may focus on criteria for increasing the accuracy and scalability of the algorithm.