This master's thesis explores designing, analyzing, and experimentally evaluating a distributed community detection algorithm. Specifically:
- A distributed version of the Louvain community detection method is developed using the Apache Spark framework. Its convergence and quality of detected communities are studied theoretically and experimentally.
- Experiments show the distributed algorithm can effectively parallelize community detection.
- Graph sampling techniques are explored for accelerating parameter selection in a resolution-limit-free community detection method. Random node selection and forest fire sampling are compared.
- Recommendations are made for choice of sampling algorithm and parameter values based on the comparison.