Highlighted notes during research with Prof. Dip Sankar Banerjee, Prof. Kishore Kothapalli:
Delta-Screening: A Fast and Efficient Technique to Update Communities in Dynamic Graphs.
https://ieeexplore.ieee.org/document/9384277
There are three types of community detection methods:
divisive, agglomerative, and multi-level (multi-level is usually better).
In this paper, heuristics are given for skipping vertices that are most likely unaffected, applicable to modularity-based community detection methods such as Louvain and SLM (Smart Local Moving). All edge batches are undirected and sorted by source vertex id. For edge additions, the source vertex i, the vertex j* whose edge offers the highest modularity gain, i's neighbors, and j*'s community are marked as affected. For edge deletions, which are only considered when i and j lie in the same community, i, j, i's neighbors, and i's community are marked as affected. Performance is compared with a static baseline, a dynamic (incremental) baseline, and this method, for both Louvain and SLM. Comparison is also done with the "DynaMo" and "Batch" community detection methods.
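The marking rules above can be sketched as follows. This is a minimal sketch, not the paper's implementation: the graph representation, the community map, and the `gain` function used to pick j* are all simplifying assumptions.

```python
# Sketch of Delta-Screening-style affected-vertex marking (simplified
# assumption, not the paper's implementation).

def mark_affected(adj, comm, additions, deletions, gain):
    """adj: vertex -> set of neighbors; comm: vertex -> community id;
    additions/deletions: undirected edges (i, j) sorted by source i;
    gain(i, j): estimated modularity gain of placing i next to j."""
    affected = set()
    for i, j in additions:
        # j* = the insertion with the highest modularity gain for source i.
        jstar = max((j2 for i2, j2 in additions if i2 == i),
                    key=lambda j2: gain(i, j2))
        affected.add(i)
        affected.update(adj[i])                                      # i's neighbors
        affected.update(v for v in comm if comm[v] == comm[jstar])   # j*'s community
    for i, j in deletions:
        if comm[i] == comm[j]:              # only intra-community deletions matter
            affected.update({i, j})
            affected.update(adj[i])                                  # i's neighbors
            affected.update(v for v in comm if comm[v] == comm[i])   # i's community
    return affected
```

Vertices outside the returned set can then be skipped by the local-moving phase of Louvain or SLM.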
Scalable Local Community Detection with MapReduce for Large Networks (IJDKP)
Community detection in complex information networks draws much attention from both academia and industry, since it has many real-world applications. However, the scalability of community detection algorithms over very large networks has been a major challenge: real-world graph structures are often complicated and extremely large. In this paper, we propose a MapReduce version, called 3MA, that parallelizes a local community identification method based on the $M$ metric. We then adopt an iterative expansion approach to find all the communities in the graph. Empirical results show that for large networks on the order of millions of nodes, the parallel version of the algorithm outperforms the traditional sequential approach to detecting communities with the $M$ measure. When the data is too big for the original sequential $M$-metric-based iterative expansion approach to handle, our MapReduce version 3MA can finish in a reasonable time.
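The sequential baseline being parallelized can be illustrated with a greedy local expansion driven by an $M$-like metric. This is a sketch under assumptions: $M$ is taken here as the ratio of internal to boundary edges, and 3MA's actual MapReduce decomposition is not reproduced.

```python
# Greedy local community expansion with an M-like metric (sketch; the exact
# M definition and 3MA's MapReduce decomposition are assumptions).

def m_metric(adj, community):
    """Ratio of internal edges to edges crossing the community boundary."""
    internal = sum(1 for u in community for v in adj[u] if v in community) / 2
    external = sum(1 for u in community for v in adj[u] if v not in community)
    return internal / external if external else float("inf")

def expand(adj, seed):
    """Grow a community from a seed vertex while the metric improves."""
    community = {seed}
    improved = True
    while improved:
        improved = False
        frontier = {v for u in community for v in adj[u]} - community
        for v in sorted(frontier):
            if m_metric(adj, community | {v}) > m_metric(adj, community):
                community.add(v)
                improved = True
    return community
```

Iterating `expand` from unassigned seeds would recover all communities, which is the expansion loop the abstract describes.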
Clustering in Aggregated User Profiles across Multiple Social Networks (IJECEIAES)
A social network is an abstraction of related groups interacting among themselves to develop relationships. To analyze these relationships and the psychology behind them, clustering plays a vital role: it enhances the predictability and discovery of like-mindedness among users. This article's goal is to exploit ensemble K-means clustering to extract entities and their corresponding interests, by skill and location, by aggregating user profiles across multiple online social networks. The proposed ensemble clustering uses the well-known K-means algorithm to improve results for the aggregated user profiles. The approach produces an ensemble similarity measure and provides 70% better results than taking a fixed or guessed value of K, while not altering the clustering method. The paper states that good ensemble clusters can be spawned to predict the discoverability of a user for a particular interest.
1. Basics of Social Networks
2. Real-world problem
3. How to construct a graph from the real-world problem?
4. Which graph theory problem arises from the real-world problem?
5. Graph type of Social Networks
6. Special properties in social graph
7. How to find communities and groups in social networks? (Algorithms)
8. How to interpret graph solution back to real-world problem?
Community detection of political blogs network based on structure-attribute g... (IJECEIAES)
Complex networks provide means to represent different kinds of networks with multiple features. Most biological, sensor, and social networks can be represented as a graph, depending on the pattern of connections among their elements. The goal of graph clustering is to divide a large graph into many clusters based on various similarity criteria. The political blogs network is a standard social dataset that can be viewed as blog-blog connections, where each node has a political leaning besides other attributes. The main objective of this work is to introduce a graph clustering method for social network analysis. The proposed Structure-Attribute Similarity method (SAS-Cluster) is able to detect community structures based on node similarities; it combines topological structure with multiple node characteristics to obtain the final similarity. The method is evaluated using the well-known density and entropy measures. Finally, it was compared with a state-of-the-art method, and the results show that the proposed method is superior according to these measures.
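The core idea of combining topological structure with node attributes can be illustrated with a weighted similarity. This is an illustrative assumption, not SAS-Cluster's actual formula: structure is measured here by neighborhood Jaccard overlap, attributes by attribute-set Jaccard overlap, blended by a weight `alpha`.

```python
# Combined structure-attribute similarity (illustrative assumption; the
# SAS-Cluster paper's exact formula is not reproduced here).

def jaccard(a, b):
    """Jaccard overlap of two sets."""
    return len(a & b) / len(a | b) if a | b else 0.0

def similarity(adj, attrs, u, v, alpha=0.5):
    """adj: node -> neighbor set; attrs: node -> attribute set.
    alpha weights structure against attributes."""
    structural = jaccard(adj[u] | {u}, adj[v] | {v})   # closed-neighborhood overlap
    attribute = jaccard(attrs[u], attrs[v])            # shared node attributes
    return alpha * structural + (1 - alpha) * attribute
```

Any pairwise clustering method can then run on this blended similarity, which is the "ultimate similarity" role the abstract describes.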
A Proposed Algorithm to Detect the Largest Community Based On Depth Level (Eswar Publications)
The incredible rise of online networks shows that these networks are complex and involve massive data, which gives strong interest to the techniques developed for mining them. The clique problem is a well-known NP-hard problem in graph mining; one of its fundamental applications is community detection, which helps to understand and model the network structure, a fundamental problem in several fields. In the literature, the exponentially increasing computation time of this problem makes the quality of solutions limited and infeasible for massive graphs. Furthermore, most of the proposed approaches are able to detect only disjoint communities. In this paper, we present a new clique-based approach for fast and efficient overlapping community detection. The work overcomes the shortfalls of the clique percolation method (CPM), one of the most popular and commonly used methods in this area. These shortfalls stem from the brute-force algorithm for enumerating maximal cliques and from missing out many vertices, which leads to poor node coverage. The proposed work overcomes these shortfalls with the NMC method for enumerating maximal cliques, then detects overlapping communities using three community scales based on three depth levels, to assure high node coverage and to detect the largest communities. The clustering coefficient and cluster density are used to measure quality. The work also provides experimental results on benchmark real-world networks to demonstrate the efficiency and to compare the new algorithm with CPM. The proposed algorithm is able to quickly discover the maximal cliques and detects overlapping communities with interesting remarks and findings.
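The costly step the paper attacks is maximal-clique enumeration. The classic Bron-Kerbosch recursion, shown here as a generic sketch (not the paper's NMC method, which is a faster strategy), illustrates what is being enumerated:

```python
# Bron-Kerbosch maximal-clique enumeration (generic sketch; the paper's NMC
# method is a different, faster enumeration strategy).

def bron_kerbosch(adj, r=frozenset(), p=None, x=frozenset()):
    """Yield each maximal clique of the graph given by adj (node -> neighbor set).
    r: current clique; p: candidates that extend r; x: already-processed nodes."""
    if p is None:
        p = frozenset(adj)
    if not p and not x:
        yield set(r)          # r cannot be extended: it is a maximal clique
        return
    for v in list(p):
        yield from bron_kerbosch(adj, r | {v}, p & adj[v], x & adj[v])
        p = p - {v}
        x = x | {v}
```

Its worst case is exponential, which is exactly the brute-force bottleneck (and poor scaling) that motivates replacing it.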
Community detection in complex networks has attracted growing interest and is the subject of several studies that aim to understand network structure and analyze network properties. In this paper, we give a thorough overview of different community discovery strategies, propose a taxonomy of these methods, and specify the differences between the suggested classes, helping designers compare and choose the most suitable strategy for the various types of networks encountered in the real world.
A short overview of current quantum computing hardware and some of the machine learning algorithms that have been developed on these systems for machine learning tasks.
Survey of Data Mining Techniques on Crime Data Analysis (ijdmtaiir)
Data mining is a process of extracting knowledge from huge amounts of data stored in databases, data warehouses, and data repositories. Crime is an interesting application where data mining plays an important role in prediction and analysis. Clustering is the process of combining data objects into groups, such that objects within a group are very similar to each other and very dissimilar to objects of other groups. This paper presents a detailed study of clustering techniques and their role in crime applications. This study also helps crime branches with better prediction and classification of crimes.
Subgraph Frequencies: Mapping the Empirical and Extremal Geography of Large G... (Gabriela Agustini)
A growing set of on-line applications are generating data that can be viewed as very large collections of small, dense social graphs — these range from sets of social groups, events, or collaboration projects to the vast collection of graph neighborhoods in large social networks. A natural question is how to usefully define a domain-independent ‘coordinate system’ for such a collection of graphs, so that the set of possible structures can be compactly represented and understood within a common space. In this work, we draw on the theory of graph homomorphisms to formulate and analyze such a representation, based on computing the frequencies of small induced subgraphs within each graph.
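The frequency-of-induced-subgraphs coordinate can be illustrated for 3-node subgraphs. A minimal sketch: each graph is mapped to the fraction of its vertex triples inducing 0, 1, 2, or 3 edges (the paper's homomorphism-based machinery goes well beyond this).

```python
from itertools import combinations

# Frequencies of induced 3-node subgraphs, bucketed by edge count (0..3 edges).
# Minimal sketch; the paper's coordinate system also covers larger subgraphs.

def triad_frequencies(adj):
    """adj: node -> neighbor set (undirected). Returns [f0, f1, f2, f3]."""
    counts = [0, 0, 0, 0]
    nodes = sorted(adj)
    for a, b, c in combinations(nodes, 3):
        edges = (b in adj[a]) + (c in adj[a]) + (c in adj[b])
        counts[edges] += 1
    total = sum(counts)
    return [k / total for k in counts] if total else counts
```

Each small graph then becomes a point in this 4-dimensional frequency space, which is the shared "coordinate system" idea.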
New kinds of intrusions cause deviations in the normal behaviour of traffic flow in computer networks every day. This study focused on enhancing the learning capabilities of an IDS to detect anomalies in network traffic flow, by comparing the k-means clustering approach for intrusion detection with an outlier detection approach. The k-means approach uses clustering to group the traffic flow data into normal and abnormal clusters. Outlier detection calculates an outlier score (the neighbourhood outlier factor, NOF) for each flow record, whose value decides whether a traffic flow is normal or abnormal. The two methods were then compared in terms of various performance metrics and the amount of computer resources they consume. Overall, k-means was more accurate and precise, with a better classification rate than outlier detection for intrusion detection using traffic flows. This will help systems administrators in their choice of IDS.
The world has many natural systems that are too complex to be understood easily. This creates a need for simple principles or systems that capture the complexity of the world; simple systems make it easier for many people to understand the world by representing it in a more straightforward way (Stefan, 2003). Many objects and projects can be seen as a network of processes or substances. Graphs and networks have been used widely in different projects for different reasons, mostly by project managers. Techniques such as critical path analysis make use of graphs and networks and are applied by project managers and all the staff involved in projects, to ensure smooth planning and control. However, the techniques have to be applied correctly to achieve the desired objective. This paper looks at the impact of graphs and networks in minimizing the costs of a project or product. From this research, it can be inferred that techniques such as the critical path method, which make use of graphs and networks, play a significant role in determining and hence reducing product cost. This is done by making the right decisions regarding the resources and time most appropriate for a project. The paper shows clearly how these techniques are applied in a project to determine project duration and hence minimize cost.
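Critical path analysis, as described, reduces to a longest-path computation over a task dependency DAG. A minimal sketch follows; the task names and durations are hypothetical.

```python
# Critical path method on a task DAG (minimal sketch; task names and
# durations below are hypothetical examples).

def critical_path(duration, deps):
    """duration: task -> time units; deps: task -> list of prerequisite tasks.
    Returns (project duration, earliest finish time per task)."""
    finish = {}
    def earliest_finish(t):
        if t not in finish:
            # A task finishes its own duration after its slowest prerequisite.
            finish[t] = duration[t] + max(
                (earliest_finish(d) for d in deps[t]), default=0)
        return finish[t]
    for t in duration:
        earliest_finish(t)
    return max(finish.values()), finish
```

The project duration equals the length of the longest (critical) path; tasks off that path have slack, which is where cost-saving resource decisions are made.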
COMPARATIVE PERFORMANCE ANALYSIS OF RNSC AND MCL ALGORITHMS ON POWER-LAW DIST... (acijjournal)
Cluster analysis of graph-related problems is an important issue nowadays. Different types of graph clustering techniques have appeared in the field, but most of them are limited in effectiveness and suffer fragmented output in real-world applications across diverse systems. In this paper, we provide a comparative behavioural analysis of the RNSC (Restricted Neighbourhood Search Clustering) and MCL (Markov Clustering) algorithms on power-law distribution graphs. RNSC is a graph clustering technique using stochastic local search; it tries to achieve optimal-cost clustering by assigning cost functions to the set of clusterings of a graph. The algorithm was implemented by A. D. King only for undirected and unweighted random graphs. Another popular graph clustering algorithm, MCL, is based on a stochastic flow simulation model for weighted graphs. There are plentiful applications of power-law, or scale-free, graphs in nature and society. Scale-free topology is stochastic, i.e., nodes are connected in a random manner. Complex network topologies like the World Wide Web, the web of human sexual contacts, or the chemical network of a cell basically follow a power-law distribution to represent different real-life systems. This paper uses real large-scale power-law distribution graphs to analyze the performance of RNSC compared with Markov clustering (MCL). Extensive experimental results on several synthetic and real power-law distribution datasets reveal the effectiveness of our approach to comparative performance measurement of these algorithms on the basis of clustering cost, cluster size, modularity index of the clustering results, and normalized mutual information (NMI).
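One of the comparison metrics named above, normalized mutual information between two clusterings, can be computed directly from its standard definition. The sketch assumes each clustering is given as a label list over the same nodes.

```python
from collections import Counter
from math import log

# Normalized mutual information between two clusterings (standard definition:
# 2*I(A;B) / (H(A) + H(B)), with labels given per node).

def nmi(labels_a, labels_b):
    n = len(labels_a)
    ca, cb = Counter(labels_a), Counter(labels_b)
    joint = Counter(zip(labels_a, labels_b))
    # Mutual information between the two label assignments.
    mi = sum(nij / n * log(n * nij / (ca[a] * cb[b]))
             for (a, b), nij in joint.items())
    ha = -sum(c / n * log(c / n) for c in ca.values())   # entropy of A
    hb = -sum(c / n * log(c / n) for c in cb.values())   # entropy of B
    return 2 * mi / (ha + hb) if ha + hb else 1.0
```

NMI is 1 for identical partitions (up to label renaming) and 0 for independent ones, which is why it is a convenient cross-algorithm comparison score.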
Multi-objective NSGA-II based community detection using dynamical evolution s... (IJECEIAES)
Community detection is becoming a highly demanded topic in social-networking-based applications. It involves finding the maximally intra-connected and minimally inter-connected sub-graphs in given social networks. Many approaches have been developed for community detection, but few of them have focused on the dynamical aspect of the social network. The community decision has to consider the pattern of changes in the social network and to be smooth enough, so as to enable smooth operation of other community-detection-dependent applications. Unlike dynamical community detection algorithms, this article presents a non-domination-aware searching algorithm, designated non-dominated sorting based community detection with dynamical awareness (NDS-CD-DA). The algorithm uses the non-dominated sorting genetic algorithm NSGA-II with two objectives: modularity and normalized mutual information (NMI). Experimental results on synthetic networks and real-world social network datasets have been compared with a classical genetic algorithm with a single objective, and show superiority in terms of domination as well as convergence. NDS-CD-DA has accomplished a domination percentage of 100% over dynamic evolutionary community searching (DECS) for almost all iterations.
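The non-dominated sorting at NSGA-II's core can be sketched for two maximized objectives such as the modularity and NMI named above. This is a generic sketch of the Pareto-dominance comparison, not the article's implementation.

```python
# Pareto non-dominated front extraction for maximization objectives (generic
# sketch of the comparison at NSGA-II's core, not the article's code).

def dominates(p, q):
    """p dominates q iff p is at least as good in every objective
    and strictly better in at least one."""
    return all(a >= b for a, b in zip(p, q)) and any(a > b for a, b in zip(p, q))

def first_front(points):
    """Keep the solutions not dominated by any other (the first Pareto front)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]
```

NSGA-II repeatedly peels off such fronts to rank a population; a "100% domination percentage" over a competitor means every competitor solution is dominated by some solution in this front.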
Incremental Community Mining in Location-based Social Network (IJAEMSJORNAL)
A social network can be defined as a set of social entities connected by a set of social relations. These relations often change and differ over time; thus, the fundamental structure of these networks is dynamic and continually developing. Investigating how the structure of these networks evolves over the observation time affords insight into their evolution, the elements that initiate changes, and ultimately the future structure of these networks. One of the most relevant properties of networks is their community structure: sets of vertices highly connected among each other and loosely connected with the rest of the network. Since networks are dynamic, their underlying community structure changes over time as well, i.e., social entities appear and disappear, making communities shrink and grow. The goal of this paper is to study community detection in dynamic social networks, in the context of location-based social networks. To this end, we extend the static Louvain method to incrementally detect communities in a dynamic scenario, following the direct method and considering both overlapping and non-overlapping settings. Finally, extensive experiments on real datasets and comparison with two previous methods demonstrate the effectiveness and potential of our suggested method.
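The Louvain local-moving step that such an incremental extension builds on evaluates the modularity gain of moving a vertex into a neighboring community. The formula below is the standard textbook gain (a sketch, not this paper's code):

```python
# Standard Louvain modularity gain for moving vertex i into community c.
# k_i_in: edge weight between i and c; k_i: degree (weight) of i;
# sigma_tot: total degree of c; m: total edge weight of the graph.

def delta_modularity(k_i_in, k_i, sigma_tot, m):
    return k_i_in / (2 * m) - (sigma_tot * k_i) / (2 * m) ** 2
```

An incremental variant only re-evaluates this gain for vertices near the edges that changed, instead of for the whole graph.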
SCALABLE LOCAL COMMUNITY DETECTION WITH MAPREDUCE FOR LARGE NETWORKSIJDKP
Community detection from complex information networks draws much attention from both academia and industry, since it has many real-world applications. However, the scalability of community detection algorithms over very large networks has been a major challenge: real-world graph structures are often complicated and come in extremely large sizes. In this paper, we propose a MapReduce version, called 3MA, that parallelizes a local community identification method based on the $M$ metric. We then adopt an iterative expansion approach to find all the communities in the graph. Empirical results show that for large networks on the order of millions of nodes, the parallel version of the algorithm outperforms the traditional sequential approach to detecting communities using the M-measure. The results show that for local community detection, when the data is too big for the original M-metric-based sequential iterative expansion approach to handle, our MapReduce version 3MA can finish in a reasonable time.
Scalable Static and Dynamic Community Detection Using Grappolo : NOTESSubhajit Sahu
A **community** (in a network) is a subset of nodes which are _strongly connected among themselves_, but _weakly connected to others_. Neither the number of output communities nor their size distribution is known a priori. Community detection methods can be divisive or agglomerative. **Divisive methods** use _betweenness centrality_ to **identify and remove bridges** between communities. **Agglomerative methods** greedily **merge two communities** that provide maximum gain in _modularity_. Newman and Girvan introduced the **modularity metric**. The problem of community detection is then reduced to the problem of modularity maximization, which is **NP-complete**. The **Louvain method** is a variant of the _agglomerative strategy_, in that it is a _multi-level heuristic_.
https://gist.github.com/wolfram77/917a1a4a429e89a0f2a1911cea56314d
In this paper, the authors discuss **four heuristics** for community detection using the _Louvain algorithm_, implemented upon the recently developed **Grappolo**, a parallel variant of the Louvain algorithm. They are:
- Vertex following and Minimum label
- Data caching
- Graph coloring
- Threshold scaling
With the **Vertex following** heuristic, the _input is preprocessed_ and all single-degree vertices are merged with their corresponding neighbours. This helps reduce the number of vertices considered in each iteration, and also helps initial seeds of communities to be formed. With the **Minimum label** heuristic, when a vertex is making the decision to move to a community and multiple communities provide the same modularity gain, the community with the smallest id is chosen. This helps _minimize or prevent community swaps_. With the **Data caching** heuristic, community information is stored in a vector instead of a map, and is reused in each iteration, at some additional cost. With the **Vertex ordering via Graph coloring** heuristic, _distance-k coloring_ of the graph is performed in order to group vertices into colors. Then, each set of vertices (by color) is processed _concurrently_, with synchronization performed after each set. This enables us to mimic the behaviour of the serial algorithm. Finally, with the **Threshold scaling** heuristic, _successively smaller values of the modularity threshold_ are used as the algorithm progresses. This allows the algorithm to converge faster, and it has been observed to achieve a good modularity score as well.
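The minimum-label tie-break can be sketched as follows (an illustrative fragment, not Grappolo's actual code): given the modularity gain for each candidate community, pick the maximum gain, breaking ties by the smallest community id.

```python
# Minimal sketch of the minimum-label heuristic: among candidate
# communities with equal (maximum) modularity gain, the smallest id wins.

def best_community(gains):
    """gains: dict mapping community id -> modularity gain of moving there.
    Returns the community with maximum gain; ties broken by smallest id."""
    # Sort key (-gain, id): highest gain first, then smallest id.
    return min(gains.items(), key=lambda kv: (-kv[1], kv[0]))[0]

# Communities 7 and 2 offer the same gain (0.5); the smaller id (2) is
# chosen, which prevents two vertices from endlessly swapping communities.
print(best_community({7: 0.5, 2: 0.5, 4: 0.1}))  # -> 2
```

The deterministic tie-break is what prevents the oscillation: two vertices that would otherwise swap between each other's communities both settle on the same (smallest) label.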
From the results, it appears that _graph coloring_ and _threshold scaling_ heuristics do not always provide a speedup and this depends upon the nature of the graph. It would be interesting to compare the heuristics against baseline approaches. Future work can include _distributed memory implementations_, and _community detection on streaming graphs_.
A Survey of Community Detection Approaches From Statistical Modeling to Deep ...OKOKPROJECTS
Analysis and assessment software for multi-user collaborative cognitive radi...IJECEIAES
Computer simulations are without a doubt a useful methodology that allows researchers to explore research questions and develop prototypes at lower costs and in shorter timeframes than those required by hardware processes. The simulation tools used in cognitive radio networks (CRN) are in an active process of development. Currently, there is no stable simulator that can characterize every element of the cognitive cycle, and the available tools are frameworks for discrete-event software. This work presents the spectral mobility simulator in CRN called “App MultiColl-DCRN”, developed with MATLAB’s App Designer. In contrast with other frameworks, the simulator uses real spectral occupancy data and simultaneously analyzes features regarding spectral mobility, decision-making, multi-user access, collaborative scenarios and decentralized architectures. Performance metrics include bandwidth, throughput level, number of failed handoffs, number of total handoffs, number of handoffs with interference, number of anticipated handoffs and number of perfect handoffs. The assessment of the simulator involves three scenarios: the first and second scenarios present a collaborative structure using the multi-criteria optimization and compromise solution (VIKOR) decision-making model and the naïve Bayes prediction technique, respectively. The third scenario presents a multi-user structure and uses simple additive weighting (SAW) as the decision-making technique. The present development represents a contribution to the cognitive radio network field, since there is currently no software with the same features.
Geo community-based broadcasting for data dissemination in mobile social netw...IEEEFINALYEARPROJECTS
A GROUP BEHAVIOR MOBILITY MODEL FOR OPPORTUNISTIC NETWORKS csandit
Mobility is regarded as a network transport mechanism for distributing data in many networks. However, many mobility models ignore the fact that peer nodes are often carried by people and thus move in group patterns according to some kind of social relation. In this paper, we propose a mobility model based on group behavior characteristics derived from real movement scenarios in daily life. This paper also gives a characteristic analysis of this mobility model and compares it with the classic Random Waypoint mobility model.
Trend-Based Networking Driven by Big Data Telemetry for Sdn and Traditional N...josephjonse
Organizations face the challenge of accurately analyzing network data and providing automated action based on observed trends. This trend-based analytics is beneficial for minimizing downtime and improving the performance of network services, but organizations use different network management tools to understand and visualize network traffic, with limited abilities to dynamically optimize the network. This research focuses on the development of an intelligent system that leverages big data telemetry analysis in the Platform for Network Data Analytics (PNDA) to enable comprehensive trend-based networking decisions. The results include a graphical user interface (GUI) provided via a web application for effortless management of all subsystems, and the system and application developed in this research demonstrate the true potential of a scalable system capable of effectively benchmarking the network to set the expected behavior for comparison and trend analysis. Moreover, this research provides a proof of concept of how trend analysis results are actioned in both a traditional network and a software-defined network (SDN) to achieve dynamic, automated load balancing.
Web Graph Clustering Using Hyperlink Structureaciijournal
Information is useful in every environment, and its timeliness matters. Most people are strongly engaged with the Internet, where web pages are linked through hyperlinks that carry useful information. Using hyperlinks, web graphs are constructed from time-similar web links that users have visited in the past. These graphs can be used to trace who used which websites, and when; this paper thus provides a history of the users connected to the person who started a news item. We found that the normalized-cut method with the new similarity metric is particularly effective, as demonstrated on a web log file.
Certain Investigation on Dynamic Clustering in Dynamic Dataminingijdmtaiir
Clustering is the process of grouping a set of objects into classes of similar objects. Dynamic clustering is a new research area concerned with datasets that have dynamic aspects: it requires updating the clusters whenever new data records are added to the dataset, and may result in the clustering changing over time. When there are continuous updates and huge amounts of dynamic data, rescanning the database is not possible with static data mining, but it is possible with dynamic data mining. Dynamic data mining occurs when the derived information is kept available for analysis while the environment is dynamic, i.e., many updates occur. Since this has now been established by most researchers, the research moves on to solving the problems of mining dynamic databases. This paper surveys existing work related to dynamic clustering and incremental data clustering.
About TrueTime, Spanner, Clock synchronization, CAP theorem, Two-phase lockin...Subhajit Sahu
TrueTime is a service that enables the use of globally synchronized clocks, with bounded error. It returns a time interval that is guaranteed to contain the clock’s actual time for some time during the call’s execution. If two intervals do not overlap, then we know calls were definitely ordered in real time. In general, synchronized clocks can be used to avoid communication in a distributed system.
The underlying source of time is a combination of GPS receivers and atomic clocks. As there are “time masters” in every datacenter (redundantly), it is likely that both sides of a partition would continue to enjoy accurate time. Individual nodes however need network connectivity to the masters, and without it their clocks will drift. Thus, during a partition their intervals slowly grow wider over time, based on bounds on the rate of local clock drift. Operations depending on TrueTime, such as Paxos leader election or transaction commits, thus have to wait a little longer, but the operation still completes (assuming the 2PC and quorum communication are working).
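The interval-overlap rule above can be sketched as follows (a toy model of the idea, not Spanner's actual TrueTime API): each "call" yields bounds guaranteed to contain real time, and two events are definitely ordered only when their intervals do not overlap.

```python
# Minimal sketch of TrueTime-style interval reasoning. An interval is a
# pair (earliest, latest) guaranteed to contain the actual time.

def definitely_before(a, b):
    """a, b: (earliest, latest) intervals. True iff a surely precedes b,
    i.e. a's latest possible time ends before b's earliest possible time."""
    return a[1] < b[0]

t1 = (100, 105)  # event committed somewhere in [100, 105]
t2 = (107, 112)  # event committed somewhere in [107, 112]
t3 = (104, 109)  # overlaps t1, so the real-time order is uncertain
print(definitely_before(t1, t2))  # -> True
print(definitely_before(t1, t3))  # -> False
```

This also illustrates the partition behaviour described above: as a node's clock drifts, its intervals widen, more pairs overlap, and operations must wait longer before they can claim a definite ordering.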
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...Subhajit Sahu
Abstract — Levelwise PageRank is an alternative method of PageRank computation which decomposes the input graph into a directed acyclic block-graph of strongly connected components, and processes them in topological order, one level at a time. This enables calculation of ranks in a distributed fashion without per-iteration communication, unlike the standard method where all vertices are processed in each iteration. It however comes with a precondition of the absence of dead ends in the input graph. Here, the native non-distributed performance of Levelwise PageRank was compared against Monolithic PageRank on a CPU as well as a GPU. To ensure a fair comparison, Monolithic PageRank was also performed on a graph where vertices were split by components. Results indicate that Levelwise PageRank is about as fast as Monolithic PageRank on the CPU, but quite a bit slower on the GPU. The slowdown on the GPU is likely caused by a large submission of small workloads, and is expected to be a non-issue when the computation is performed on massive graphs.
Adjusting Bitset for graph : SHORT REPORT / NOTESSubhajit Sahu
Compressed Sparse Row (CSR) is an adjacency-list based graph representation that is commonly used for efficient graph computations. Unfortunately, using CSR for dynamic graphs is impractical since addition/deletion of a single edge can require on average (N+M)/2 memory accesses, in order to update source-offsets and destination-indices. A common approach is therefore to store edge-lists/destination-indices as an array of arrays, where each edge-list is an array belonging to a vertex. While this is good enough for small graphs, it quickly becomes a bottleneck for large graphs. What causes this bottleneck depends on whether the edge-lists are sorted or unsorted. If they are sorted, checking for an edge requires about log(E) memory accesses, but adding an edge on average requires E/2 accesses, where E is the number of edges of a given vertex. Note that both addition and deletion of edges in a dynamic graph require checking for an existing edge, before adding or deleting it. If edge lists are unsorted, checking for an edge requires around E/2 memory accesses, but adding an edge requires only 1 memory access.
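The sorted-vs-unsorted trade-off described above can be sketched with Python lists standing in for per-vertex edge lists (illustrative only, not an actual graph framework):

```python
# Sorted edge list: ~log2(E) accesses to check, ~E/2 shifts to insert.
# Unsorted edge list: ~E/2 accesses to check, 1 access to append.
import bisect

def has_edge_sorted(edges, v):
    """Binary search in a sorted edge list: ~log2(E) memory accesses."""
    i = bisect.bisect_left(edges, v)
    return i < len(edges) and edges[i] == v

def add_edge_sorted(edges, v):
    """Insertion keeps the list sorted, shifting ~E/2 entries on average."""
    if not has_edge_sorted(edges, v):
        bisect.insort(edges, v)

def add_edge_unsorted(edges, v):
    """Append is 1 access, but the duplicate check costs ~E/2 accesses."""
    if v not in edges:
        edges.append(v)

s, u = [], []
for v in [5, 1, 3, 1]:   # note the duplicate edge to vertex 1
    add_edge_sorted(s, v)
    add_edge_unsorted(u, v)
print(s)          # -> [1, 3, 5]
print(sorted(u))  # -> [1, 3, 5]
```

Either way, both addition and deletion pay for the membership check first, which is why the check's cost dominates the choice of representation.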
Techniques to optimize the pagerank algorithm usually fall in two categories. One is to try reducing the work per iteration, and the other is to try reducing the number of iterations. These goals are often at odds with one another. Skipping computation on vertices which have already converged has the potential to save iteration time. Skipping in-identical vertices, with the same in-links, helps reduce duplicate computations and thus could help reduce iteration time. Road networks often have chains which can be short-circuited before pagerank computation to improve performance. Final ranks of chain nodes can be easily calculated. This could reduce both the iteration time, and the number of iterations. If a graph has no dangling nodes, pagerank of each strongly connected component can be computed in topological order. This could help reduce the iteration time, no. of iterations, and also enable multi-iteration concurrency in pagerank computation. The combination of all of the above methods is the STICD algorithm. [sticd] For dynamic graphs, unchanged components whose ranks are unaffected can be skipped altogether.
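One of the optimizations named above — skipping computation on vertices that have already converged — can be sketched as follows (the dict-of-out-neighbors graph format and the per-vertex convergence test are illustrative assumptions, not taken from the STICD paper):

```python
# Minimal PageRank sketch that stops recomputing a vertex once its rank
# change falls below `tol`. Assumes no dangling nodes.

def pagerank_skip(graph, damping=0.85, tol=1e-6, iters=100):
    """graph: dict vertex -> list of out-neighbors."""
    verts = list(graph)
    n = len(verts)
    rank = {v: 1.0 / n for v in verts}
    # Precompute in-neighbors once, so each update reads only in-links.
    inn = {v: [] for v in verts}
    for u, outs in graph.items():
        for v in outs:
            inn[v].append(u)
    converged = set()
    for _ in range(iters):
        if len(converged) == n:
            break
        for v in verts:
            if v in converged:
                continue  # skip: this vertex's rank has already settled
            r = (1 - damping) / n + damping * sum(
                rank[u] / len(graph[u]) for u in inn[v])
            if abs(r - rank[v]) < tol:
                converged.add(v)
            rank[v] = r
    return rank

g = {0: [1], 1: [2], 2: [0]}  # 3-cycle: all ranks equal by symmetry
r = pagerank_skip(g)
print(round(sum(r.values()), 6))  # -> 1.0
```

Skipping a converged vertex is a heuristic: a later change in its in-neighbors' ranks can in principle disturb it again, which is why the other STICD ingredients (component ordering, chain short-circuiting) are needed for a principled speedup.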
Adjusting primitives for graph : SHORT REPORT / NOTESSubhajit Sahu
Graph algorithms like PageRank commonly operate on Compressed Sparse Row (CSR), an adjacency-list based graph representation.
Multiply with different modes (map)
1. Performance of sequential execution based vs OpenMP based vector multiply.
2. Comparing various launch configs for CUDA based vector multiply.
Sum with different storage types (reduce)
1. Performance of vector element sum using float vs bfloat16 as the storage type.
Sum with different modes (reduce)
1. Performance of sequential execution based vs OpenMP based vector element sum.
2. Performance of memcpy vs in-place based CUDA based vector element sum.
3. Comparing various launch configs for CUDA based vector element sum (memcpy).
4. Comparing various launch configs for CUDA based vector element sum (in-place).
Sum with in-place strategies of CUDA mode (reduce)
1. Comparing various launch configs for CUDA based vector element sum (in-place).
Experiments with Primitive operations : SHORT REPORT / NOTESSubhajit Sahu
This includes:
- Multiply with different modes (map)
1. Performance of sequential execution based vs OpenMP based vector multiply.
2. Comparing various launch configs for CUDA based vector multiply.
- Sum with different storage types (reduce)
1. Performance of vector element sum using float vs bfloat16 as the storage type.
- Sum with different modes (reduce)
1. Performance of sequential execution based vs OpenMP based vector element sum.
2. Performance of memcpy vs in-place based CUDA based vector element sum.
3. Comparing various launch configs for CUDA based vector element sum (memcpy).
4. Comparing various launch configs for CUDA based vector element sum (in-place).
- Sum with in-place strategies of CUDA mode (reduce)
1. Comparing various launch configs for CUDA based vector element sum (in-place).
Adjusting OpenMP PageRank : SHORT REPORT / NOTESSubhajit Sahu
For massive graphs that fit in RAM, but not in GPU memory, it is possible to take
advantage of a shared memory system with multiple CPUs, each with multiple cores, to
accelerate pagerank computation. If the NUMA architecture of the system is properly taken
into account with good vertex partitioning, the speedup can be significant. To take steps in
this direction, experiments are conducted to implement pagerank in OpenMP using two
different approaches, uniform and hybrid. The uniform approach runs all primitives required
for pagerank in OpenMP mode (with multiple threads). On the other hand, the hybrid
approach runs certain primitives in sequential mode (i.e., sumAt, multiply).
word2vec, node2vec, graph2vec, X2vec: Towards a Theory of Vector Embeddings o...Subhajit Sahu
Below are the important points I note from the 2020 paper by Martin Grohe:
- 1-WL distinguishes almost all graphs, in a probabilistic sense
- Classical WL is two dimensional Weisfeiler-Leman
- DeepWL is an unlimited version of WL that runs in polynomial time.
- Knowledge graphs are essentially graphs with vertex/edge attributes
ABSTRACT:
Vector representations of graphs and relational structures, whether handcrafted feature vectors or learned representations, enable us to apply standard data analysis and machine learning techniques to the structures. A wide range of methods for generating such embeddings have been studied in the machine learning and knowledge representation literature. However, vector embeddings have received relatively little attention from a theoretical point of view.
Starting with a survey of embedding techniques that have been used in practice, in this paper we propose two theoretical approaches that we see as central for understanding the foundations of vector embeddings. We draw connections between the various approaches and suggest directions for future research.
DyGraph: A Dynamic Graph Generator and Benchmark Suite : NOTESSubhajit Sahu
https://gist.github.com/wolfram77/54c4a14d9ea547183c6c7b3518bf9cd1
There exist a number of dynamic graph generators. The Barabási–Albert model iteratively attaches new vertices to pre-existing vertices in the graph using preferential attachment (edges to high-degree vertices are more likely — the rich get richer, the Pareto principle). However, graph size increases monotonically, and the density of the graph keeps increasing (sparsity decreasing).
Görke's model uses a defined clustering to uniformly add vertices and edges. Purohit's model uses motifs (e.g., triangles) to mimic properties of existing dynamic graphs, such as growth rate, structure, and degree distribution. Kronecker graph generators are used to increase the size of a given graph, with a power-law distribution.
To generate dynamic graphs, we must choose a metric to compare two graphs. Common metrics include diameter, clustering coefficient (modularity?), triangle counting (triangle density?), and degree distribution.
In this paper, the authors propose Dygraph, a dynamic graph generator that uses degree distribution as the only metric. The authors observe that many real-world graphs differ from the power-law distribution at the tail end. To address this issue, they propose binning, where the vertices beyond a certain degree (minDeg = min(deg) s.t. |V(deg)| < H, where H~10 is the number of vertices with a given degree below which are binned) are grouped into bins of degree-width binWidth, max-degree localMax, and number of degrees in bin with at least one vertex binSize (to keep track of sparsity). This helps the authors to generate graphs with a more realistic degree distribution.
The process of generating a dynamic graph is as follows. First, the difference between the desired and the current degree distribution is calculated. The authors then create an edge-addition set, where each vertex is present as many times as the number of additional incident edges it must receive. Edges are then created by connecting two vertices chosen randomly from this set, and removing both from the set once connected. Currently, the authors reject self-loops and duplicate edges. Removal of edges is done in a similar fashion.
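The edge-addition step can be sketched as follows (a simplified illustration, not the authors' implementation; the retry cap is an added safeguard for the case where only self-loops or duplicates remain in the pool):

```python
# Sketch of the edge-addition set: each vertex appears once per extra
# incident edge it needs; random pairs are drawn, rejecting self-loops
# and duplicate edges, and both endpoints leave the pool once connected.
import random

def add_edges(deficit, existing, seed=42, max_tries=1000):
    """deficit: dict vertex -> number of extra incident edges needed.
    existing: set of frozenset({u, v}) edges already in the graph."""
    rng = random.Random(seed)
    pool = [v for v, k in deficit.items() for _ in range(k)]
    added = set()
    tries = 0
    while len(pool) >= 2 and tries < max_tries:
        tries += 1
        i, j = rng.sample(range(len(pool)), 2)
        a, b = pool[i], pool[j]
        e = frozenset((a, b))
        if a == b or e in existing or e in added:
            continue  # reject self-loops and duplicate edges
        added.add(e)
        # Remove both endpoints from the pool once connected.
        for k in sorted((i, j), reverse=True):
            pool.pop(k)
    return added

edges = add_edges({0: 1, 1: 2, 2: 1}, existing=set())
print(len(edges))  # number of edges successfully added
```

Note the pool can dead-end (e.g. only two copies of the same vertex left), which is one reason a real generator needs rejection handling beyond this sketch.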
Authors observe that adding edges with power-law properties dominates the execution time, and consider parallelizing DyGraph as part of future work.
My notes on shared memory parallelism.
Shared memory is memory that may be simultaneously accessed by multiple programs with an intent to provide communication among them or avoid redundant copies. Shared memory is an efficient means of passing data between programs. Using memory for communication inside a single program, e.g. among its multiple threads, is also referred to as shared memory [REF].
A Dynamic Algorithm for Local Community Detection in Graphs : NOTESSubhajit Sahu
**Community detection methods** can be *global* or *local*. **Global community detection methods** divide the entire graph into groups. Existing global algorithms include:
- Random walk methods
- Spectral partitioning
- Label propagation
- Greedy agglomerative and divisive algorithms
- Clique percolation
https://gist.github.com/wolfram77/b4316609265b5b9f88027bbc491f80b6
There is a growing body of work in *detecting overlapping communities*. **Seed set expansion** is a **local community detection method** where a relevant *seed vertices* of interest are picked and *expanded to form communities* surrounding them. The quality of each community is measured using a *fitness function*.
**Modularity** is a *fitness function* which compares the number of intra-community edges to the expected number in a random null model. **Conductance** is another popular fitness score that measures the community cut, or inter-community edges. Many *overlapping community detection* methods **use a modified ratio** of intra-community edges to all edges with at least one endpoint in the community.
Andersen et al. use a **Spectral PageRank-Nibble method** which minimizes conductance and is formed by adding vertices in order of decreasing PageRank values. Andersen and Lang develop a **random walk approach** in which some vertices in the seed set may not be placed in the final community. Clauset gives a **greedy method** that *starts from a single vertex* and then iteratively adds neighboring vertices *maximizing the local modularity score*. Riedy et al. **expand multiple vertices** via maximizing modularity.
Several algorithms for **detecting global, overlapping communities** use a *greedy*, *agglomerative approach* and run *multiple separate seed set expansions*. Lancichinetti et al. run **greedy seed set expansions**, each with a *single seed vertex*; overlapping communities are produced by sequentially running expansions from nodes not yet in any community. Lee et al. use **maximal cliques as seed sets**. Havemann et al. **greedily expand cliques**.
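Greedy seed set expansion in the spirit of Clauset's method can be sketched as follows (illustrative only: it uses the simple intra/total edge-ratio fitness mentioned above rather than local modularity):

```python
# Start from a seed vertex, repeatedly add the neighboring vertex that
# most improves the fitness (intra-community edges over all edges with
# at least one endpoint in the community); stop when nothing improves.

def fitness(graph, comm):
    intra = inter = 0
    for u in comm:
        for v in graph[u]:
            if v in comm:
                intra += 1   # counted twice, once from each endpoint
            else:
                inter += 1
    return intra / (intra + inter) if intra + inter else 0.0

def expand(graph, seed, steps=10):
    comm = {seed}
    for _ in range(steps):
        frontier = {v for u in comm for v in graph[u]} - comm
        if not frontier:
            break
        best = max(frontier, key=lambda v: fitness(graph, comm | {v}))
        if fitness(graph, comm | {best}) <= fitness(graph, comm):
            break  # no neighbor improves the community; stop expanding
        comm.add(best)
    return comm

# Two triangles (0-1-2 and 3-4-5) joined by a single bridge edge 2-3.
g = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
print(sorted(expand(g, 0)))  # -> [0, 1, 2]
```

On the two-triangle graph, expansion from either side recovers its own triangle and refuses to cross the bridge, since adding the bridge vertex lowers the fitness.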
The authors of this paper discuss a dynamic approach for **community detection using seed set expansion**. Simply marking the neighbours of changed vertices is a **naive approach**, and has *severe shortcomings*. This is because *communities can split apart*. The simple updating method *may fail even when it outputs a valid community* in the graph.
Application Areas of Community Detection: A Review : NOTESSubhajit Sahu
This is a short review of community detection methods (on graphs) and their applications. A **community** is a subset of a network whose members are *highly connected*, but *loosely connected* to others outside their community. Different community detection methods *can return differing communities*, since these algorithms are **heuristic-based**. **Dynamic community detection** involves tracking the *evolution of community structure* over time.
https://gist.github.com/wolfram77/09e64d6ba3ef080db5558feb2d32fdc0
Communities can be of the following **types**:
- Disjoint
- Overlapping
- Hierarchical
- Local.
The following **static** community detection **methods** exist:
- Spectral-based
- Statistical inference
- Optimization
- Dynamics-based
The following **dynamic** community detection **methods** exist:
- Independent community detection and matching
- Dependent community detection (evolutionary)
- Simultaneous community detection on all snapshots
- Dynamic community detection on temporal networks
**Applications** of community detection include:
- Criminal identification
- Fraud detection
- Criminal activities detection
- Bot detection
- Dynamics of epidemic spreading (dynamic)
- Cancer/tumor detection
- Tissue/organ detection
- Evolution of influence (dynamic)
- Astroturfing
- Customer segmentation
- Recommendation systems
- Social network analysis (both)
- Network summarization
- Privacy, group segmentation
- Link prediction (both)
- Community evolution prediction (dynamic, hot field)
<br>
<br>
## References
- [Application Areas of Community Detection: A Review : PAPER](https://ieeexplore.ieee.org/document/8625349)
This paper discusses a GPU implementation of the Louvain community detection algorithm. The Louvain algorithm obtains hierarchical communities as a dendrogram through modularity optimization. Given an undirected weighted graph, all vertices are first considered to be their own communities. In the first phase, each vertex greedily decides to move to the community of one of its neighbours that gives the greatest increase in modularity. If moving to no neighbour's community leads to an increase in modularity, the vertex chooses to stay in its own community. This is done sequentially for all the vertices. If the total change in modularity is more than a certain threshold, this phase is repeated. Once this local moving phase is complete, all vertices have formed their first hierarchy of communities. The next phase is called the aggregation phase, where all the vertices belonging to a community are collapsed into a single super-vertex, such that edges between communities are represented as edges between respective super-vertices (edge weights are combined), and edges within each community are represented as self-loops in respective super-vertices (again, edge weights are combined). Together, the local moving and the aggregation phases constitute a stage. This super-vertex graph is then used as input for the next stage. This process continues until the increase in modularity is below a certain threshold. As a result, from each stage we have a hierarchy of community memberships for each vertex as a dendrogram.
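The aggregation phase described above can be sketched as follows (a minimal illustration assuming an undirected weighted edge list, not the paper's GPU code): vertices of each community collapse into one super-vertex, inter-community edges combine their weights, and intra-community edges become self-loops.

```python
# Collapse a community assignment into a super-vertex graph, combining
# edge weights; edges inside one community become self-loops.
from collections import defaultdict

def aggregate(edges, comm):
    """edges: list of (u, v, weight) for an undirected graph.
    comm: dict vertex -> community id.
    Returns dict (cA, cB) -> combined weight, with cA <= cB."""
    super_edges = defaultdict(float)
    for u, v, w in edges:
        a, b = comm[u], comm[v]
        key = (min(a, b), max(a, b))  # self-loop when a == b
        super_edges[key] += w
    return dict(super_edges)

# Communities {0,1} and {2,3,4}; one intra-edge in each is visible below.
edges = [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 1.0), (3, 4, 1.0), (4, 2, 1.0)]
comm = {0: 0, 1: 0, 2: 1, 3: 1, 4: 1}
print(aggregate(edges, comm))
# -> {(0, 0): 1.0, (0, 1): 1.0, (1, 1): 3.0}
```

The self-loop weights preserve each community's internal edge mass, which is what keeps modularity computations consistent across stages.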
Approaches to performing the Louvain algorithm can be divided into coarse-grained and fine-grained. Coarse-grained approaches process a set of vertices in parallel, while fine-grained approaches process all vertices in parallel. A coarse-grained hybrid-GPU algorithm using multiple GPUs has been implemented by Cheong et al., which grabbed my attention. In addition, their algorithm does not use hashing for the local moving phase, but instead sorts each neighbour list based on the community id of each vertex.
https://gist.github.com/wolfram77/7e72c9b8c18c18ab908ae76262099329
Survey for extra-child-process package : NOTESSubhajit Sahu
Useful additions to inbuilt child_process module.
📦 Node.js, 📜 Files, 📰 Docs.
Please see attached PDF for literature survey.
https://gist.github.com/wolfram77/d936da570d7bf73f95d1513d4368573e
Dynamic Batch Parallel Algorithms for Updating PageRank : POSTERSubhajit Sahu
For the PhD forum an abstract submission is required by 10th May, and poster by 15th May. The event is on 30th May.
https://gist.github.com/wolfram77/692d263f463fd49be6eb5aa65dd4d0f9
Abstract for IPDPS 2022 PhD Forum on Dynamic Batch Parallel Algorithms for Up... - Subhajit Sahu
For the PhD forum an abstract submission is required by 10th May, and poster by 15th May. The event is on 30th May.
https://gist.github.com/wolfram77/1c1f730d20b51e0d2c6d477fd3713024
Fast Incremental Community Detection on Dynamic Graphs : NOTES - Subhajit Sahu
In this paper, the authors describe two approaches for dynamic community detection using the CNM algorithm. CNM is a hierarchical, agglomerative algorithm that greedily maximizes modularity. The two approaches are called BasicDyn and FastDyn. BasicDyn backtracks merges of communities until each marked (changed) vertex is its own singleton community. FastDyn undoes a merge only if the quality of the merge, as measured by the induced change in modularity, has significantly decreased compared to when the merge initially took place. FastDyn also allows more than two vertices to contract together if, in the previous time step, these vertices eventually ended up contracted into the same community. In the static case, merging several vertices together in one contraction phase could lead to deteriorating results. FastDyn can afford to do this, however, because it uses information from the merges of the previous time step. Intuitively, merges that previously occurred are more likely to be acceptable later.
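A toy sketch of the FastDyn-style undo criterion, assuming we log each merge's modularity gain at the time it was made and can re-evaluate that gain on the updated graph; the function names, the merge-log shape, and the tolerance factor are all illustrative, not from the paper:

```python
def merges_to_undo(merge_log, recompute_gain, tolerance=0.5):
    """Select previous merges whose quality has degraded enough to roll back.

    merge_log      : list of (community_a, community_b, gain_at_merge_time)
    recompute_gain : callable (a, b) -> the merge's modularity gain
                     re-evaluated on the updated graph
    tolerance      : fraction of the original gain below which a merge
                     is considered to have significantly degraded
    """
    undo = []
    for a, b, old_gain in merge_log:
        if recompute_gain(a, b) < tolerance * old_gain:
            undo.append((a, b))   # degraded merge: backtrack it
    return undo
```

This is the key contrast with BasicDyn, which unconditionally backtracks every merge touching a marked vertex.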
https://gist.github.com/wolfram77/1856b108334cc822cdddfdfa7334792a
Can you fix farming by going back 8000 years : NOTES - Subhajit Sahu
1. Human population didn't explode, but plateaued.
2. Fertilizer prices are going to the sky.
3. Farmers are looking for alternatives such as animal waste (manure) or even human waste.
4. Manure prices are also going up.
5. Switching to organic farming is not an option.
https://gist.github.com/wolfram77/49067fc3ddc1ba2e1db4f873056fd88a
Delta-Screening: A Fast and Efficient Technique to Update Communities in Dynamic Graphs : NOTES
Delta-Screening: A Fast and Efficient Technique to Update Communities in Dynamic Graphs
Neda Zarayeneh, Member, IEEE, and Ananth Kalyanaraman, Senior Member, IEEE
Abstract—Detecting communities in time-evolving/dynamic networks is an important operation used in many real-world network science applications. While there have been several proposed strategies for dynamic community detection, such approaches do not necessarily take advantage of the locality of changes. In this paper, we present a new technique called Delta-Screening (or simply, Δ-screening) for updating communities in a dynamic graph. The technique assumes that the graph is given as a series of time steps, and outputs a set of communities for each time step. At the start of each time step, the Δ-screening technique examines all changes (edge additions and deletions) and computes a subset of vertices that are likely to be impacted by the change (using the modularity objective). Subsequently, only the identified subsets are processed for community state updates. Our experiments demonstrate that this scheme, despite its ability to prune vertices aggressively, is able to generate significant savings in runtime performance (up to 38× speedup over static baseline and 5× over dynamic baseline implementations), without compromising on quality. We test on both real-world and synthetic network inputs containing both edge additions and deletions. The Δ-screening technique is generic enough to be incorporated into any of the existing modularity-optimizing clustering algorithms. We tested using two state-of-the-art clustering implementations, namely, Louvain and SLM. In addition, we also show how to use the Δ-screening approach to delineate appropriate intervals of temporal resolutions at which to analyze a given input network.

Index Terms—Community detection, dynamic graphs, incremental clustering, modularity optimization.
I. INTRODUCTION

Community detection is a fundamental problem in many graph applications. The goal of community detection is to identify tightly-knit groups of vertices in an input network, such that the members of each "community" share a higher concentration of edges among them than to the rest of the network. Owing to its ability to reveal natural divisions that may exist in a network (in an unsupervised manner), community detection has become one of the fundamental discovery tools in a network scientist's toolkit. The operation is widely used in a variety of application domains including (but not limited to) social networks, biological networks, internet and web networks, citation and collaboration networks, etc. [1].

Designing efficient algorithms and implementations for community detection has been an area of active research for well over a decade. While the community detection problem is NP-hard [2], there are several efficient heuristics and related software already available for static community detection, e.g., [3]-[9]. For a review of works in both sequential and parallel settings, please refer to [1], [10].

However, many real-world networks are dynamic in nature, where vertices and edges can be added and/or removed over a period of time. For instance, consider a collaboration network, where over time, new collaborations can be added between pairs of users, or existing collaborative links may cease to exist [11]. Similarly, consider the social interactions over time defining edge additions and removals on a social network [12]. With increasing availability of high-throughput sensing, numerous such network application domains are experiencing an explosion of incrementally or dynamically growing networks [13]. Therefore, analytical solutions that factor in the dynamic, evolving nature of these graph inputs are necessary.

Owing to the increasing availability of such networks, the problem of dynamic community detection has become an actively researched topic of late [14]-[18]. Section II provides an overview of related literature. Broadly, these algorithms can be categorized into two types. The first class are algorithms that identify communities from each time snapshot of the network [19], [20], i.e., without considering the information from previous time steps. These algorithms tend to be better suited for networks that change rapidly between time steps. The other class of algorithms uses information from the community structures of the previous time steps to compute the communities of new time steps [14], [15], [21]. The idea is to reduce time to solution by avoiding re-computation, while also helping with easier tracking of communities across time steps. For this latter class of algorithms, a key remaining challenge is in quickly identifying the parts of the graph that are likely to be impacted by a change (or collectively by a recent batch of changes), so that it becomes possible to update the community information with minimal recomputation effort [22], [23]. Approaches that can efficiently exploit such dynamic features have the potential to reduce the computational burden in dynamic community discovery; however, such approaches have not received much attention in the past.
Manuscript received December 24, 2020; revised March 1, 2021; accepted March 14, 2021. Date of publication March 23, 2021; date of current version July 7, 2021. The research was supported in part by U.S. National Science Foundation (NSF) awards CCF 1815467 and OAC 1910213. Recommended for acceptance by Dr. Xuelong Li. (Corresponding author: Ananth Kalyanaraman.)
The authors are with the School of Electrical Engineering and Computer Science, Washington State University, Pullman, WA 99164 USA (e-mail: neda.zarayeneh@wsu.edu; ananth@wsu.edu).
This article has supplementary downloadable material available at https://doi.org/10.1109/TNSE.2021.3067665, provided by the authors.
Digital Object Identifier 10.1109/TNSE.2021.3067665
IEEE TRANSACTIONS ON NETWORK SCIENCE AND ENGINEERING, VOL. 8, NO. 2, APRIL-JUNE 2021
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License. For more information, see https://creativecommons.org/licenses/by-nc-nd/4.0/
Contributions: In this paper, we propose an algorithmic technique called Delta-Screening (or Δ-screening, for short) that can quickly identify parts of the graph impacted by a new batch of changes. The input is a dynamic graph with an arbitrary number (T) of time steps, and with a new batch of edge additions and/or edge deletions at the start of each time step.¹ Using this as input, our technique selects vertices for processing at the start of each time step.

The technique is generic to any community detection method that uses modularity [5] as its clustering objective. Consequently, we present an incremental dynamic community detection framework to detect communities from dynamic networks with multiple time steps. More specifically, the main contributions are as follows:

i) We visit the problem of identifying vertex subsets that are likely to be impacted by the most recent batch of changes made to the graph. To address this problem, we present a technique called Δ-screening, which can be efficiently implemented and incorporated as part of existing community detection methods that use modularity.
ii) To demonstrate and evaluate this technique, we incorporated it into two well-known classical community detection methods, namely, the Louvain method [3] and the smart local moving (SLM) method [7], thereby generating two incremental clustering implementations.
iii) Using these two implementations, we present a thorough experimental evaluation on both synthetic and real-world inputs. Our results show that the Δ-screening technique is effective in pruning work (to reduce recomputation effort) without compromising on output quality. The algorithm shows up to 5× speedup over the dynamic baseline and 38× over the static baseline.
iv) In addition to demonstrating its performance benefits, we also show how to use our approach to delineate appropriate intervals of temporal resolutions at which to analyze an input network.

The rest of the paper is organized as follows. We first present a review of relevant literature. Then, we present the Δ-screening technique, with provable guarantees. Next, we present a unified clustering framework that uses Δ-screening for dynamic community detection. The experimental results section presents a thorough evaluation of our implementations.
II. RELATED WORK

According to Rossetti and Cazabet [22], algorithms to compute dynamic communities over time-evolving graphs can be broadly classified into three types.

Static-Based Methods: One class of methods tries to identify communities based on the current state of the network. Typically, these methods follow a two-step strategy of first identifying the best set of communities for the current time step and then subsequently mapping them onto the communities from previous generation(s) to track evolution. Hopcroft et al. [19] present a method in which a static community detection tool is individually applied to the graphs at all time steps, and the results are later combined by computing community similarities between successive steps. Cuzzocrea et al. [20] also follow a similar approach, while trying to recover the evolution of events by matching the communities of two consecutive time steps using normalized mutual information. Alternatively, Asur et al. [24] apply a static community detection algorithm to construct a binary matrix that represents node-to-community memberships for each time step. He et al. [25] extend the Louvain algorithm [3] to detect communities at each time step, while performing community matching with the communities of the previous time step. Seifikar et al. [26] also present an extension of the Louvain algorithm, using a compression-based strategy to make the execution faster. Palla et al. [27] present an algorithm based on clique percolation [28] between consecutive time steps. Greene et al. [16] present another variant of the two-step strategy, mainly focusing on a matching-based formulation to map the communities of the latest time step to the communities of previous generations.

In general, these static-based two-step algorithms are not temporally smoothed, making tracking of communities harder. Furthermore, these methods incur the additional overhead of running static implementations at each time step, and of mapping communities to track the evolution at every time step, making the approach expensive.
Stability-Based Approaches: Another class of methods follows a strategy where communities of the current time step are identified based on the communities detected at previous time step(s). These incremental approaches have the advantage of generating more stable communities over time, thereby facilitating easier tracking of the evolution of communities. However, the divergence from the static methods is a potential quality concern for these methods and needs to be evaluated.

Maillard et al. [21] present a modularity-based incremental approach extending upon the classical Clauset-Newman-Moore static method [6]. Aktunc et al. [15] present the dynamic smart local movement (DSLM) method as an extension of its static predecessor [7]. Xie et al. [29] present an incremental method based on label propagation, which is a fast heuristic. Saifi and Guillaume [30] provide a way to track and update community "cores" across time steps. Zakrzewska and Bader [17] present another variant that tracks the communities of a selected set of seed vertices in the graph. The FacetNet approach introduced by Lin et al. [31] is a hybrid approach that operates with a dual objective of maximizing modularity for the current time step while also trying to preserve as much of the previous generation's communities as possible. While the modularity metric is known to have limitations such as the resolution limit [32], [33], it is still the most widely used objective in practice (as represented in the aforementioned methods).

¹ Note that vertex additions and deletions can be implicitly encoded as part of edge additions and deletions as well. We, therefore, only refer to edge-related events in the remainder of this paper.
Agarwal et al. [18] propose an incremental approach to detect communities based on permanence [34] as its clustering objective. Zeng et al. [35] present a consensus-based approach to finding communities across time steps. Their approach uses a genetic algorithm to solve a multiobjective problem involving normalized mutual information (NMI) and modularity. Another multi-objective algorithm was introduced in [36]. In a more recent study, Zhao et al. [37] present an incremental algorithm for cases where bulk changes happen via subgraph additions or deletions. Wu et al. [38] present an incremental approach to detect overlapping communities using a weighted network formulation. DynaMo [39] is an incremental method that updates the community structure at each time step according to the changes in the graph. In other words, this approach also avoids recomputing communities from scratch, based on a set of local rules. Our results section includes a comparison with DynaMo.
Cross-Time Step Approaches: Another class of methods works across time steps and identifies communities at any given time step based on knowledge of the entire graph across all time steps. Tantipathananandh et al. [40] present an algorithm to identify temporally persistent groups across multiple time steps, and use a combinatorial optimization procedure to identify community structures. Their algorithm assumes that different time steps could contain information pertaining to different parts of the graph. Bassett et al. [41] propose and evaluate the choice of alternative null hypothesis models in modularity-based dynamic community detection methods. Matias et al. [42], [43] introduce a Poisson Process Stochastic Block Model (PPSBM) to detect communities in temporal networks. Xu and Hero [44] define a dynamic stochastic block model in which both affiliations and densities can change. In Viard et al. [45] and Himmel et al. [46], the authors consider the stream of links and look for persistent cliques of minimal size. Bu et al. [47] present a multi-objective optimization framework for community detection.

The advantage of these cross-time step algorithms is that they are better at detecting temporally persistent communities. However, these algorithms require prior knowledge of the global graph. In addition, several of these methods compare the graphs and/or community states from multiple time steps, incurring a higher runtime complexity.
The Δ-screening technique proposed in this paper is orthogonal to all the above efforts, in that it is aimed at helping incremental methods (which fall into the category of stability-based approaches) to quickly identify the relevant parts of the graph that are potentially impacted by a recent batch of changes, so that the computation effort in the incremental step can be reduced without compromising on clustering quality. Consequently, the Δ-screening technique can be used to accelerate the computation in any of the incremental approaches. We demonstrate in this paper the utility of this technique using the modularity-optimization function. A preliminary version of the method proposed in this paper appeared in [48]. That preliminary version only addressed edge additions. In contrast, the work presented in this paper handles both addition and deletion events.
III. METHOD

A. Basic Notation and Terminology

A dynamic graph G(V, E) can be represented as a sequence of graphs G_1(V_1, E_1), G_2(V_2, E_2), ..., G_T(V_T, E_T), where G_t(V_t, E_t) denotes the graph at time step t; we use n_t = |V_t| and M_t = |E_t|. In this paper, we consider only undirected graphs. The graphs may be weighted, i.e., each edge (i, j) ∈ E_t is associated with a positive numerical weight ω_ij > 0; if the graphs are unweighted, then the edges are assumed to be associated with unit weight, without loss of generality. We use m_t to denote the sum of the weights of all edges in G_t, i.e., m_t = Σ_{(i,j)∈E_t} ω_ij. We denote the neighbors of a vertex i as Γ(i) = {j | (i, j) ∈ E_t}. We denote the degree of a vertex i by d(i). The weighted degree of a vertex i, denoted by d_ω(i), is the sum of the weights of all edges incident on i.
In this paper, we consider both incrementally growing and incrementally shrinking dynamic graphs, i.e., edges (and vertices) can be added and deleted at each time step. We denote the newly added/deleted set of edges at any time step t as Δt, which is given by Δt = Δt+ ∪ Δt−, where Δt+ = E_t \ E_{t−1} and Δt− = E_{t−1} \ E_t.
We denote the set of communities detected at time step t as C_t. Note that, by definition, C_t represents a partitioning of the vertices in V_t, i.e., each community C ∈ C_t is a subset of V_t; all communities in C_t are pairwise disjoint; and ∪_{C∈C_t} C = V_t.
For any vertex i ∈ V_t, we denote the community containing i, at any point in the algorithm's execution, as C(i), following the convention used in [49]. Also, let e_{i→C} denote the sum of the weights of the edges linking vertex i to vertices in community C, i.e., e_{i→C} = Σ_{j∈C∩Γ(i)} ω_ij. Furthermore, let a_C denote the sum of the weighted degrees of all vertices in C, i.e., a_C = Σ_{i∈C} d_ω(i).

Given the above, the modularity, Q_t, as imposed by a community-wise partitioning C_t over G_t, is given by [5]:

    Q_t = (1 / 2m_t) [ Σ_{i∈V_t} e_{i→C(i)} − (1 / 2m_t) Σ_{C∈C_t} a_C² ]    (1)

Given a community-wise partitioning on an input graph, the modularity gain that can be achieved by moving a particular vertex i from its current community to another target community (say C(j)) can be calculated in constant time [3]. We denote this modularity gain by ΔQ_{i→C(j)}. More specifically:

    ΔQ_{i→C(j)} = (e_{i→C(j)∪{i}} − e_{i→C(i)\{i}}) / 2m_t + (d_ω(i)·a_{C(i)\{i}} − d_ω(i)·a_{C(j)}) / (2m_t)²    (2)
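Equations (1) and (2) can be transcribed almost directly into code; a minimal sketch using a dict-of-dicts adjacency representation (names and the representation are my own, assuming no self-loops in the test graph):

```python
from collections import defaultdict

def modularity(adj, comm, m):
    """Eq. (1): Q_t = (1/2m)[ sum_i e_{i->C(i)} - (1/2m) sum_C a_C^2 ]."""
    e_in, a = 0.0, defaultdict(float)
    for i, nbrs in adj.items():
        a[comm[i]] += sum(nbrs.values())   # a_C: weighted-degree sum per community
        e_in += sum(w for j, w in nbrs.items() if comm[j] == comm[i])
    return (e_in - sum(ac * ac for ac in a.values()) / (2 * m)) / (2 * m)

def delta_q(i, cj, adj, comm, m):
    """Eq. (2): modularity gain from moving vertex i into community cj."""
    deg = {u: sum(adj[u].values()) for u in adj}   # d_w(u)
    a = defaultdict(float)
    for u in adj:
        a[comm[u]] += deg[u]
    # e_{i -> C(j) u {i}} and e_{i -> C(i) \ {i}}
    e_new = sum(w for v, w in adj[i].items() if comm[v] == cj or v == i)
    e_old = sum(w for v, w in adj[i].items() if comm[v] == comm[i] and v != i)
    return ((e_new - e_old) / (2 * m)
            + (deg[i] * (a[comm[i]] - deg[i]) - deg[i] * a[cj]) / (2 * m) ** 2)
```

As the surrounding text notes, scanning `delta_q` over the communities of i's neighbors suffices to find the best target community in O(d(i)) time.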
Note that if a vertex i is to migrate out of its current community, it can realistically move only to a neighboring community associated with one of its neighbors (as otherwise the gain cannot be positive). Consequently, computing a target community that yields the maximum modularity gain for a given vertex can be achieved in time proportional to d(i). This result is used in many algorithms [3], [4], [7], [49].

Hypergraph for overlapping communities?
B. Problem Statement
We define the dynamic community detection problem as
follows.
Dynamic Community Detection: Given a dynamic graph
GðV;EÞ with T time steps, the goal of dynamic community detec-
tion is to detect an output set of communities Ct at each time step
t, that maximizes the modularity Qt for the graph GtðVt;EtÞ.
Since the static version of the modularity optimization prob-
lem is NP-Hard [2], it immediately follows that the dynamic
version is also intractable.
For the static version, several efficient heuristics have been developed (as surveyed in [1]). These approaches
can be broadly classified into three categories: divisive
approaches [50], [51], agglomerative approaches [5], [6], and
multi-level approaches [4], [7], [52]. Of these, the multi-level approaches have been demonstrated to be fast and effective at producing high-quality partitionings in practice. In Algorithm 1
we show a generic algorithmic pseudocode for this class of
approaches. While they vary in the specific details of how
each step is implemented, they share several common traits
(note that this description is for the static case):
At the start of each level, all vertices are assigned to a
distinct community id.
An iterative process is initiated, in which all vertices are
visited (in some arbitrary order) within each iteration,
and a decision is made on whether to keep the vertex in
its current community, or to migrate it to one of its
neighboring communities. This decision is typically
made in a local-greedy fashion. For instance, in the Louvain algorithm [3], a vertex migrates to a neighboring community that maximizes the modularity gain of that vertex; i.e., let $j \in \Gamma(i) \cup \{i\}$. Then,

$$C(i) \leftarrow \arg\max_{C(j)} \Delta Q_{i \to C(j)}$$
When the net modularity gain resulting from an iteration drops below a certain threshold $\tau$, the current level
is terminated (i.e., intra-level convergence), and the
algorithm compacts the graph into a smaller graph by
using the information from the communities. This pro-
cedure represents a graph coarsening step, and the
coarsened graph is subsequently processed using the
same iterative strategy until there is no longer an appre-
ciable modularity gain between successive levels.
Algorithm 1 succinctly captures the main steps of the multi-
level approaches.
C. A Naive Algorithm for Dynamic Community Detection
A simple approach to implement dynamic community
detection is to simply apply the static version of the algorithm
(Algorithm 1) on the graph at every time step. However, such
an approach could suffer from lack of stability in the output
communities. Furthermore, by ignoring the previous commu-
nity information, the algorithm is essentially forced to recom-
pute from scratch, and as a result evaluate the community
affiliation for all vertices at each time step. This can be waste-
ful in computation. Consider an edge ði; jÞ 2 Dt that has been
newly introduced at time step t; it is reasonable to expect that
only those vertices in the “vicinity” of this newly added edge
to be impacted by this addition. However, the naive strategy is
not suited to exploit such proximity information, thereby neg-
atively impacting performance particularly for large real-
world networks where event-triggered changes tend to happen
in a more localized manner at different time steps. The
Δ-screening scheme described in this paper is aimed at providing an alternative to this naive approach by overcoming the above set of limitations.
D. The Δ-screening Scheme
In what follows, we describe our Δ-screening scheme in detail. Given the graph ($G_t$) and changes ($\Delta_t$) at time step $t$, the goal of Δ-screening is to identify a vertex subset $R_t \subseteq V_t$ for reevaluation at time step $t$; i.e., any vertex that is added to $R_t$ by the Δ-screening scheme will be evaluated for potential migration by the iterative clustering algorithm (Algorithm 1); all other vertices are not evaluated (i.e., they retain their respective community assignments from the previous time step $t-1$). The main objective is to save runtime by reducing the number of vertices to process, without significantly altering the quality. Despite its heuristic nature, however, our Δ-screening scheme is designed to preserve the key behavioral traits of the baseline version (as we show in lemmas later in this section).
Since at each time step $t$ we could have both deletions and additions of edges, we first present our algorithm in two parts, one for handling edge deletions and the other for handling edge additions. Note that by capturing edge additions
and deletions, we also implicitly capture vertex additions and
deletions. Finally, we also present a unified algorithm that
accounts for time steps where both additions and deletions
occur. In the following sections, we present our algorithms for
Algorithm 1. Abstraction for Multi-level Approaches
Input: $G(V, E)$
Output: An assignment $P : V \to \mathbb{Z}$
1: Initialize $P$ by setting $C(v) \leftarrow \{v\}, \forall v \in V$
2: repeat
3:   repeat
4:     for each $v \in V$ do
5:       Compute a local (greedy) function $g(v, C(v))$
6:       $C(v) \leftarrow$ Update community assignment for $v$ using the results from $g(v, C(v))$
7:     end
8:     Compute a global quality function $Q$ for $P$
9:   until Convergence based on $Q$
10:  Review communities of $P$ (optional step)
11:  $G(V, E) \leftarrow$ Coarsen by performing graph compaction for the next level
12: until Convergence based on $Q$
13: return $P$
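Under the simplifying assumptions of an unweighted graph and Louvain-style greedy moves, the intra-level loop of Algorithm 1 (lines 3-9) might be sketched as below. Coarsening (line 11) is omitted, and the ranking used, $e_{i \to C} - d(i) \cdot a_C / 2m$ with $i$ temporarily removed from its community, is algebraically equivalent to maximizing Eq. (2). All names are illustrative.

```python
from collections import defaultdict

def local_moving(adj):
    # One level of the multi-level scheme: start from singleton communities
    # (line 1), then sweep vertices, greedily moving each to the neighboring
    # community of maximum modularity gain, until a sweep makes no move.
    two_m = sum(len(n) for n in adj.values())
    community = {v: v for v in adj}
    a = {v: len(adj[v]) for v in adj}          # a_C per community id
    moved = True
    while moved:
        moved = False
        for i in adj:
            if not adj[i]:
                continue                       # isolated vertex: nothing to do
            d_i, ci = len(adj[i]), community[i]
            e = defaultdict(int)               # e_{i->C} over adjacent communities
            for j in adj[i]:
                e[community[j]] += 1
            a[ci] -= d_i                       # temporarily remove i
            score = lambda c: e[c] - d_i * a[c] / two_m
            best = max(e, key=score)
            target = best if score(best) > score(ci) else ci
            a[target] += d_i
            if target != ci:
                community[i], moved = target, True
    return community
```

On a graph of two triangles joined by a single bridge edge, this recovers the two triangles as the communities.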
ZARAYENEH AND KALYANARAMAN: DELTA-SCREENING: A FAST AND EFFICIENT TECHNIQUE TO UPDATE COMMUNITIES IN DYNAMIC GRAPHS 1617
edge additions, edge deletions, and finally a combined framework to handle both events.
1) The Δ-screening Approach for Edge Additions: Let $\Delta_t^+$ denote the set of edges added at time step $t$, and $R_t^+$ denote the set of nodes for reevaluation. Note that the objective here is to compute $R_t^+$. For simplicity, we will assume that these edges did not exist with the same edge weight at the previous time step $t-1$.² We also assume that $\Delta_t^+$ is stored as a list of ordered pairs of the form $(i,j)$. This implies that for each newly added edge $(i,j)$, there will be two entries stored in $\Delta_t^+$: $(i,j)$ and $(j,i)$, as the input graph is undirected. We refer to the first entry ($i$) of an ordered pair ($(i,j)$) as the "source" vertex and the other vertex ($j$) as the "sink". Let $S_{\Delta_t}$ denote the set of all source vertices in $\Delta_t^+$, and let $T_{\Delta_t}(i)$ denote the set of all sinks (or "targets") for a given source $i \in S_{\Delta_t}$.
Algorithm 2 shows the algorithm for Δ-screening for edge additions. The algorithm can be described as follows. We initialize $R_t^+$ to $\emptyset$. Subsequently, we examine all edges of $\Delta_t^+$ in the sorted order of their source vertices, breaking ties arbitrarily. Sorting helps in two ways: it helps us to consider all the new edges incident on a given source vertex collectively and identify an edge that maximizes the net modularity gain for the source (line 4 of Algorithm 2). This way we are able to mimic the behavior of the baseline versions, which also use the same greedy scheme to migrate vertices. Furthermore, this sorted treatment helps reduce overhead by allowing us to update $R_t^+$ in a localized manner (relative to the source vertices) and avoiding potential duplication in the computations associated with the source vertex.
Once sorted, we read the list of edges added for each source vertex, identify a neighbor ($j$) that maximizes the modularity gain (line 4 of Algorithm 2), and update $R_t^+$ based on that vertex (line 8). However, prior to updating $R_t^+$, we check if the selected vertex $j$ has a better incentive to move to $i$'s community $C_{t-1}(i)$ (line 7); if that happens, then $R_t^+$ is not updated from source $i$ and instead, that decision is deferred until $j$ is visited as a source. This way we avoid making conflicting decisions between source and sink, while decreasing the processing time (by reducing the size of $R_t^+$). Note that we only use the direction of migration that results in the larger of the two gains for updating $R_t^+$. Note also that the actual decision to migrate is itself deferred until the execution of the iterative clustering algorithm. In other words, the Δ-screening procedure does not modify the state of the communities, but it sets the stage for which vertices (in which communities) are to be visited during the main iterative procedure.
The main part of Algorithm 2 is line 8, where $R_t^+$ is updated. Our scheme adds the following subset to $R_t^+$: vertices $i$ and $j$, all neighbors of $i$ (i.e., $\Gamma(i)$), and all vertices in the community containing $j$. In what follows, using a combination of lemmas, we show that the $R_t^+$ so constructed is positioned to capture all (or most of) the "essential" vertices that are likely to be impacted by the edge additions in $\Delta_t^+$. In other words, if a vertex is left out of $R_t^+$, it can be concluded that it is less likely (if at all) to be impacted by the changes to the graph, and therefore it can stay in its previous community state, thereby saving runtime.
Provable Properties for Δ-screening: Edge Additions
In the following lemmas, for the sake of convenience (and without loss of generality), we analyze the potential impact of the event represented by moving $i$ to $j$'s community. Intuitively, the key to populating $R_t^+$ is in anticipating which vertices are likely to alter their community status as triggered by this migration event.
First, note that there are two simple cases. If one of the two endpoints of an added edge is new, then the new node will simply join the community of the other endpoint. Also, if both endpoints are new, then they will form a new community consisting of just these two vertices. In what follows, we prove certain properties for the more non-trivial cases.
Fig. 1 shows the representative cases that arise for consideration in our lemmas.
First, we claim that any vertex that is a neighbor of i can be
potentially impacted.
Lemma 1: Let $i'$ be a neighbor of a vertex $i$ that has a newly added edge to $j$. Then, the community assignment for $i'$ at the end of time step $t$, i.e., $C_t(i')$, could potentially differ from its previous community $C_{t-1}(i')$.
Algorithm 2. Δ-screening for edge additions at time step $t$
Input: $G_t$, $\Delta_t^+$
Output: $R_t^+$: a subset of vertices for reevaluation
1: $R_t^+ \leftarrow \emptyset$
2: Sort edges in $\Delta_t^+$ based on source vertices
3: for each $i \in S_{\Delta_t}$ do
4:   Let $j \leftarrow \arg\max_{j \in T_{\Delta_t}(i)} \{\Delta Q_{i \to C_{t-1}(j)}\}$
5:   Let $gain_1 \leftarrow \Delta Q_{i \to C_{t-1}(j)}$
6:   Let $gain_2 \leftarrow \Delta Q_{j \to C_{t-1}(i)}$
7:   if $gain_1 \geq gain_2$ and $gain_1 > 0$ then
8:     $R_t^+ \leftarrow R_t^+ \cup \{i, j\} \cup \Gamma(i) \cup C_{t-1}(j)$
9:   end
10: end
11: return $R_t^+$
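A compact sketch of Algorithm 2 follows. All names are illustrative; `adj` is assumed to already contain the newly added (unweighted) edges of $\Delta_t^+$, while `comm` still holds the communities from time step $t-1$, and `batch` stores both $(i,j)$ and $(j,i)$ for each added edge.

```python
from collections import defaultdict

def delta_screen_additions(adj, comm, batch):
    two_m = sum(len(n) for n in adj.values())
    members = defaultdict(set)                 # community id -> member vertices
    for v, c in comm.items():
        members[c].add(v)
    a = {c: sum(len(adj[v]) for v in vs) for c, vs in members.items()}

    def gain(i, j):
        # Delta-Q of moving i into C_{t-1}(j), per Eq. (2)
        e = defaultdict(int)
        for k in adj[i]:
            e[comm[k]] += 1
        d_i, ci = len(adj[i]), comm[i]
        return (e[comm[j]] - e[ci]) / two_m \
             + d_i * ((a[ci] - d_i) - a[comm[j]]) / (two_m ** 2)

    targets = defaultdict(list)                # group the batch by source vertex
    for i, j in batch:
        targets[i].append(j)
    R = set()
    for i in sorted(targets):                  # line 2: sorted order of sources
        jstar = max(targets[i], key=lambda j: gain(i, j))    # line 4
        g1, g2 = gain(i, jstar), gain(jstar, i)              # lines 5-6
        if g1 >= g2 and g1 > 0:                              # line 7
            R |= {i, jstar} | adj[i] | members[comm[jstar]]  # line 8
    return R
```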
Fig. 1. Figure showing the impact of a newly added edge $(i,j)$, shown as a red dotted line. Representative cases of candidate vertices for potential inclusion in $R_t$ are shown highlighted as labelled vertices. Note that we follow the naming convention of denoting the community containing a vertex $i$ by $C(i)$. Also, note that not all edges are shown.
² If not, then we can simply skip over such edges during processing.
Intuitively, a change in $i$'s community state could influence any of its neighboring vertices. For a formal proof, see the Supplementary section.
Next, we analyze the potential of vertices that are in $C_{t-1}(i)$ but not in $\Gamma(i)$ to be impacted by the migration of $i$. In fact, we conclude that there is no need to include such vertices in $R_t^+$.
Lemma 2: If $i' \in C_{t-1}(i)$, then at time step $t$, a change to the community state of $i'$, $C_{t-1}(i')$, is possible only if $i'$ is also a neighbor of $i$.
Proof: We have already considered the case where $i' \in \Gamma(i)$ (as part of Lemma 1 in the Supplementary section). Therefore we only need to consider the case where $i' \notin \Gamma(i)$. This is represented by vertex label $i_2$ in Fig. 1. Since $i_2$ is already in $C_{t-1}(i)$ and $i_2$ does not share an edge with $i$, a departure of $i$ from $C_{t-1}(i)$ can only positively reinforce $i_2$'s membership in $C_{t-1}(i)$. More formally, this can be shown by comparing the old vs. new modularity gains, $\Delta Q_{i_2 \to C(j)}$, resulting from moving $i$ to $C_{t-1}(j)$:

$$\Delta Q^{new}_{i_2 \to C_{t-1}(j)} = \frac{e_{i_2 \to C_{t-1}(j)} - e_{i_2 \to C_{t-1}(i) \setminus \{i_2\}}}{2m_t} + \frac{d_v(i_2) \cdot (a_{C_{t-1}(i) \setminus \{i_2\}} - d_v(i))}{(2m_t)^2} - \frac{d_v(i_2) \cdot (a_{C_{t-1}(j)} + d_v(i))}{(2m_t)^2} = \Delta Q^{old}_{i_2 \to C_{t-1}(j)} - \frac{2 d_v(i_2) \cdot d_v(i)}{(2m_t)^2}$$
Since the new modularity gain is less than the old, vertex $i_2$ will have no incentive to change community status, and therefore can be excluded from $R_t$.
Note that the above proof only analyzes the direct impact that vertex $i$'s migration from $C_{t-1}(i)$ will have on non-neighbor vertices of $C_{t-1}(i)$. However, there could still be an indirect impact (cascading from Lemma 1); e.g., the migration of a vertex $i_1 \in \Gamma(i)$ may trigger the migration of a vertex $i_2 \notin \Gamma(i)$ but in $\Gamma(i_1)$, and so on. However, intuitively, as this distance from the locus of change increases, the likelihood of impact can also be expected to decay rapidly in practice (as observed in our experiments).
Next, we analyze the potential impact of $i$'s migration on members of $j$'s community.
Lemma 3: If a vertex $j_1 \in C_{t-1}(j)$, then at time step $t$, a change to the community status of any such $j_1$ is possible.
Proof: Intuitively, when a new vertex ($i$) arrives in community $C_{t-1}(j)$, the degree of that vertex contributes to the negative term of modularity, thereby possibly influencing the other vertices' decisions to remain in that community. A formal proof is provided in the Supplementary section.
Finally, we analyze the impact of $i$'s potential migration from $C_{t-1}(i)$ to $C_{t-1}(j)$ on vertices that are in neither of those two communities and are also not in $\Gamma(i)$.
Lemma 4: If $k \in V_t \setminus \{C(i) \cup C(j)\}$, then at time step $t$, unless $k$ is also in $\Gamma(i)$, there is no need to include $k$ in $R_t^+$.
Proof: We consider only vertices $k \notin \Gamma(i)$, as the other case was already covered in Lemma 1. There are three subcases:
(A) $k$ shares an edge with some vertex in $C_{t-1}(i)$ other than $i$;
(B) $k$ shares an edge with some vertex in $C_{t-1}(j)$; and
(C) $k$ has no neighbors in $C_{t-1}(i)$ or $C_{t-1}(j)$.
However, in none of these cases does a migration of $i$ to $C_{t-1}(j)$ create an incentive for $k$ to move to $C_{t-1}(j)$. This can be shown more formally using modularity gains as follows.
(A) This case is similar to the subcase (B) identified in the proof of Lemma 1. The same argument holds.
(B) Any node $k$ connected to $C_{t-1}(j)$ but located in another community has the following modularity gain after moving $i$ to $C_{t-1}(j)$:

$$\Delta Q^{new}_{k \to C_{t-1}(j)} = \frac{e_{k \to C_{t-1}(j)} - e_{k \to C_{t-1}(k) \setminus \{k\}}}{2m_t} + \frac{d_v(k) \cdot a_{C_{t-1}(k) \setminus \{k\}}}{(2m_t)^2} - \frac{d_v(k) \cdot (a_{C_{t-1}(j)} + d_v(i))}{(2m_t)^2} = \Delta Q^{old}_{k \to C_{t-1}(j)} - \frac{d_v(k) \cdot d_v(i)}{(2m_t)^2}$$

This implies that the modularity gain will decrease if $k$ were to migrate to $C_{t-1}(j)$, and therefore $k$ will choose to remain in its current community.
(C) If a vertex is not connected to either $C_{t-1}(i)$ or $C_{t-1}(j)$, such as the example $k'$ in Fig. 1, then such a vertex has no incentive to move to either of the two communities.
Based on the above lemmas and arguments, the following
theorem follows.
Algorithm 3. Δ-screening for edge deletions at time step $t$
Input: $G_t$, $\Delta_t^-$
Output: $R_t^-$: a subset of vertices for reevaluation
1: $R_t^- \leftarrow \emptyset$
2: Sort edges in $\Delta_t^-$ based on the source vertices
3: for each source $i \in S_{\Delta_t}$ do
4:   flag = False
5:   for each sink $j \in T_{\Delta_t}(i)$ do
6:     if $C_{t-1}(i) = C_{t-1}(j)$ then
7:       flag = True
8:     end
9:   end
10:  if flag = True then
11:    $R_t^- \leftarrow R_t^- \cup C_{t-1}(i) \cup \Gamma(i)$
12:  end
13: end
14: return $R_t^-$
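Algorithm 3 is simpler to sketch, since no modularity gains need to be evaluated. As before, the names are illustrative; `adj` holds the pre-deletion neighborhoods and `batch` holds both orientations of each deleted edge.

```python
from collections import defaultdict

def delta_screen_deletions(adj, comm, batch):
    members = defaultdict(set)                 # community id -> member vertices
    for v, c in comm.items():
        members[c].add(v)
    targets = defaultdict(list)                # group the batch by source vertex
    for i, j in batch:
        targets[i].append(j)
    R = set()
    for i in sorted(targets):
        # lines 4-9: keep i only if some deleted edge at i is intra-community
        if any(comm[i] == comm[j] for j in targets[i]):
            R |= members[comm[i]] | adj[i]     # line 11
    return R
```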
Theorem 5: If an inter-community edge $(i,j)$ is added, then the set of vertices that could possibly be affected by this change is given by: $R_t^+ = \Gamma(i) \cup (C_{t-1}(j) \setminus \Gamma(i))$.
2) The Δ-screening Method for Edge Deletions: Similar to the Δ-screening approach for edge additions introduced in the previous section, we identify a subset of vertices for reevaluation at each time step with edge removals. Algorithm 3 presents the Δ-screening scheme for handling edge removals. We assume that $\Delta_t^-$ stores the set of all deleted edges of a given time step $t$, and that for each deleted edge $(i,j)$, there will be two entries in $\Delta_t^-$: $(i,j)$ and $(j,i)$. We use the same definitions as in the previous section to denote the sets of source and sink vertices, $S_{\Delta_t}$ and $T_{\Delta_t}(i)$, respectively, in this deletion batch.
In Algorithm 3, we have $G_t$ and $\Delta_t^-$ as inputs, and $R_t^-$ as output. Here, $R_t^-$ represents the nodes that need to be reevaluated. At first, we observe that if the edge being removed is between two different communities, then such a deletion could not possibly result in a change of the existing community states (as the strength of the tie between the two communities only becomes weaker). Therefore, we discard such inter-community edge deletions from further consideration in our Δ-screening scheme. Lines 4 through 9 of Algorithm 3 implement this inter-community check for all deleted edges incident on a source vertex. As the edge list in $\Delta_t^-$ is kept sorted by source vertices (line 2), it is easy to check for and discard source vertices that have all of their incident deleted edges between two communities.
We therefore only consider source vertices $i$ which have at least one intra-community deleted edge incident on them, i.e., an edge $(i,j)$ where $C_{t-1}(i) = C_{t-1}(j)$. For such source vertices, the update scheme for $R_t^-$ is shown in line 11 of Algorithm 3. Our method adds all vertices in the community containing $i$, along with the neighbors of $i$, to $R_t^-$. Through a combination of lemmas (stated below), we show that this update scheme sufficiently captures the effects of the new batch of deletions.
Provable Properties for Δ-screening: Edge Deletions
In the following lemmas we analyze the potential impact of edge removals. Let $(i,j)$ be an intra-community edge being removed at time step $t$ from community $C_{t-1}(i)$ (equivalently, $C_{t-1}(j)$). We consider the following two possibilities:
Case A: the deletion event could result in the migration of vertices $i$ or $j$ (or both) to a (different) neighboring community;
Case B: the deletion event could result in the breaking of community $C_{t-1}(i)$ into two or more smaller communities.
Our main strategy is to populate $R_t^-$ to include those vertices which are likely to alter their community states as a result of this deletion event.
First, it should be clear that the deletion event has a direct impact on the vertices $i$ and $j$; i.e., it is possible that either $i$'s or $j$'s strength of connection to $C_{t-1}(i)$ could weaken as a result of this deletion, and therefore either of these two vertices could choose to leave $C_{t-1}(i)$ at time step $t$. Therefore, we simply include $i$ and $j$ as part of our $R_t^-$.
Next, we show that none of the vertices in $C_{t-1}(i)$ other than $i$ and $j$ will have any incentive to migrate out of $C_{t-1}(i)$ to any of the neighboring communities.³
Lemma 6: Let $(i,j) \in \Delta_t^-$ be an intra-community edge to be deleted. Consider a vertex $i' \in C_{t-1}(i) \setminus \{i, j\}$. Then, the modularity gain achieved by moving vertex $i'$ out of $C_{t-1}(i)$ will be zero or negative.
Proof: This case is represented by vertex labels $i_1$ (immediate neighbor of $i$) and $i_2$ in Fig. 2. Here, removing the edge $(i,j)$ has no impact on the strength of vertex $i_1$'s or $i_2$'s connection to community $C_{t-1}(i)$; i.e., the contribution of these vertices to the positive term in Eqn. 1 will remain the same. On the other hand, the negative term in Eqn. 1 for the community $C_{t-1}(i)$ will only improve with the removal of edge $(i,j)$, owing to the reduced vertex degrees. Therefore, neither vertex $i_1$ nor vertex $i_2$ will have any incentive to leave $C_{t-1}(i)$ during time step $t$.
Even though the above lemma shows that none of the vertices in $C_{t-1}(i)$ other than $i$ and $j$ have an incentive to move out of $C_{t-1}(i)$, we note that it is still possible for those vertices to split $C_{t-1}(i)$ into smaller sub-communities, i.e., Case B. Intuitively, this is because the deleted edge weakens the strength of the intra-community edges within $C_{t-1}(i)$. Therefore, to account for this possibility, our Δ-screening method simply adds all the vertices in $C_{t-1}(i)$ for re-evaluation (as shown in line 11 of Algorithm 3).
Next, we consider the impact of the edge deletion $(i,j)$ on neighbors of $i$ (alternatively, $j$) which are outside $C_{t-1}(i)$. The lemma is shown only for neighbors of $i$; the same argument holds for neighbors of $j$ as well.
Lemma 7: Let $(i,j) \in \Delta_t^-$ be an intra-community edge to be deleted. Consider a vertex $k \in \Gamma(i) \setminus C_{t-1}(i)$. Then, it is possible for vertex $k$ to have an incentive to change its community at time step $t$.
Proof: This case is represented by vertex label $k$ in Fig. 2. A formal proof is provided in the Supplementary section.
Based on the above lemmas and arguments, the following
theorem follows.
Theorem 8: If an intra-community edge $(i,j)$ is deleted, then the set of vertices that could possibly be affected by this change is given by: $R_t^- = C_{t-1}(i) \cup \Gamma(i) \cup \Gamma(j)$.
Fig. 2. Figure showing the impact of a deleted edge $(i,j)$, shown as a red dotted line. Representative cases of candidate vertices for potential inclusion in $R_t$ are shown highlighted as labelled vertices. Note that we follow the naming convention of denoting the community containing a vertex $i$ by $C(i)$. Also, note that not all edges are shown.
³ Note that this does not exclude the possibility of those vertices choosing to separate into smaller contained communities within $C_{t-1}(i)$ at the end of time step $t$ (Case B above).
E. A General Framework for Dynamic Community Detection Using Δ-screening
Here, we present our unified dynamic community detection framework, based on the Δ-screening scheme, capable of handling both edge additions and edge deletions. Algorithm 4 summarizes the main steps of the algorithm over all $T$ time steps. Our algorithm uses Δ-screening to prune the vertex space for processing before handing over the actual clustering task to a static clustering code of choice, albeit only on the vertex subset selected in $R_t$. We note here that our algorithm is generic enough to be applied to any modularity-based static clustering scheme (denoted $Static()$ in Algorithm 4). In this work, we used the Louvain method [3] and the smart local moving (SLM) method [7] as choices for $Static$ (described in the Experimental Evaluation section).
At the start of every time step $t$, we assume that the batch of changes $\Delta_t$ arrives simply as an unordered collection of edge additions and edge deletions (i.e., $\Delta_t = \Delta_t^+ \cup \Delta_t^-$). In our algorithm, we first separate the additions from the deletions, and then process the edge deletions prior to the edge additions, as shown in Algorithm 4. The rationale behind this order of computation is a combination of performance and quality; i.e., by first removing edges, we shrink the graph and update the communities before growing the graph again using the added edges and updating the communities. This strategy can be expected to yield lower processing times per batch than handling the events in an arbitrary order or handling additions first. Secondly, from a qualitative standpoint, we note that the strict order used in processing these edge additions and deletions could potentially impact the output clustering. Fig. 3 shows the runtime comparison of processing the edge deletions first and then the edge additions, and conversely, processing the edge additions first. The clustering qualities obtained are 0.658227 and 0.652412, respectively; we also checked the community overlap percentage for these two approaches and obtained around 95% overlap. Moreover, as shown in the lemmas for our Δ-screening scheme, the effects of deletions are expected to be within communities (splits), whereas the effects of additions are expected to be across communities. Given this complementary nature of their effects, our algorithm chooses to optimize performance in practice by considering deletions first. It is worth noting that the challenge imposed by the order of processing the edge addition/deletion events also applies to static clustering, and is not unique to the dynamic setting.
Note that there is also a simpler version that can be implemented for both of these methods, by following all steps outlined in our approach except for Δ-screening, and instead trivially setting $R_t = V_t$. For a comparative assessment of the Δ-screening strategy, we implemented this baseline version as well; we refer to the resulting implementations as dyn-base.⁴
F. Algorithmic Complexity of Δ-screening
The Δ-screening procedures described for edge additions and edge deletions have a worst-case time complexity that is linear in the size of the graph at time step $t$, i.e., $O(n_t + M_t)$. The worst case represents a scenario where the changes are distributed across the entire graph, possibly impacting almost all of the previous communities. However, in practice, both procedures are expected to take significantly less time, as the computations are localized to where the changes ($\Delta_t$) could have an impact.
Note that the time taken to compute the new clustering after the Δ-screening step of that time step (denoted by Step 6 of Algorithm 4) will depend on the computational cost of the underlying clustering algorithm. For instance, in the case of the Louvain implementation [3], clustering involves multiple iterations until modularity-based convergence, with each iteration performing a linear scan of the vertices in the worst case. With the application of Δ-screening, we aim to significantly reduce the number of vertices visited.
G. Software Availability
We have implemented the Δ-screening technique and integrated it as part of the Louvain method. This open source
Algorithm 4. Dynamic community detection using Δ-screening
Input: $G_1$, $\{\Delta_1^+, \Delta_1^-, \ldots, \Delta_T^+, \Delta_T^-\}$
Output: $\{C_1, C_2, \ldots, C_T\}$
1: Let $C_1 \leftarrow Static(G_1)$ denote the initial step output
2: for each $t \in \{2, 3, \ldots, T\}$ do
3:   Initialize $C_t \leftarrow C_{t-1}$
4:   // handle deletions
5:   $G_t \leftarrow$ Update $G_{t-1}$ using $\Delta_t^-$
6:   Compute $R_t^- \leftarrow$ Δ-screening($G_t, \Delta_t^-$)
7:   Update $C_t \leftarrow Static(G_t, R_t^-)$, i.e., run static clustering only for vertices in $R_t^-$
8:   // handle additions
9:   $G_t \leftarrow$ Update $G_t$ using $\Delta_t^+$
10:  Compute $R_t^+ \leftarrow$ Δ-screening($G_t, \Delta_t^+$)
11:  Update $C_t \leftarrow Static(G_t, R_t^+)$, i.e., run static clustering only for vertices in $R_t^+$
12: end
13: return $\{C_1, C_2, \ldots, C_T\}$
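The control flow of Algorithm 4 (deletions strictly before additions within each time step) can be sketched as a small driver that treats the static clustering and the two screening routines as pluggable callables; every signature below is hypothetical.

```python
def dynamic_communities(G1, batches, static, screen_del, screen_add,
                        apply_del, apply_add):
    # static(G, R, C): recluster only the vertices in R, starting from C
    # screen_del / screen_add: the Delta-screening routines
    # apply_del / apply_add: graph update operators
    C = static(G1, None, None)                 # line 1: full clustering at t = 1
    out = [C]
    G = G1
    for dels, adds in batches:                 # lines 2-12
        G = apply_del(G, dels)                 # deletions first ...
        C = static(G, screen_del(G, C, dels), C)
        G = apply_add(G, adds)                 # ... then additions
        C = static(G, screen_add(G, C, adds), C)
        out.append(C)
    return out
```

With the two screening callables trivially returning $V_t$, this degenerates to the dyn-base baseline described earlier.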
Fig. 3. Runtime comparison between processing edge deletions followed by edge additions vs. processing edge additions followed by edge deletions.
⁴ We note that our dyn-base implementation for the SLM method is in effect the same as [15].
implementation is available at https://bitbucket.org/nzarayeneh/dynamic_community_detection/src/master/. The software package is implemented in C++ and can be used for modularity-based community detection on dynamic networks.
IV. EXPERIMENTAL EVALUATION
A. Experimental Setup
All our experiments were conducted on a Linux compute node with a 2.13 GHz AMD Ryzen Threadripper 1920X processor and 64 GB RAM.
Input Data: For all our experiments, we used a combination of synthetic and real-world networks. Table I summarizes the main statistics for the inputs used. Fig. S1 (in the Supplementary section) shows the numbers of edge additions/deletions for the individual time steps.
As synthetic inputs, we used a collection of streaming networks available on the MIT Graph Challenge 2018 website [53]. We used two types of networks: i) Low block overlap, Low block size variation (abbreviated as "ll"), and ii) High block overlap, High block size variation (abbreviated as "hh"). These two types are in increasing order of community complexity (ll < hh). However, in both cases, the number of edges grows (or shrinks) linearly with each time step (see Fig. S1). The datasets are available in sizes from 1 K nodes to 20 M nodes, and each of these datasets has ten time steps. For our testing purposes, we used the 50 K and 5 M datasets as representatives of two size categories.
As real-world inputs, we used the following networks downloaded from the SNAP database [54]:
1) Arxiv HEP-TH: This is a citation graph for 27 770
papers (vertices) with 352 807 edges (cross-citations).
Even though the edges are directed, for the purpose of
analysis in this paper we treated them as undirected.
The dataset covers papers published between 1993 and
2002. Consequently, we partitioned this period into 10
time steps (one for each year).
2) SX-stackoverflow: This is a temporal network of interactions on Stack Overflow, with 2 601 977 vertices (users) and 63 497 050 temporal edges (interactions). Interactions can be of several types; e.g., a user answers another user's question, a user comments on another user's answer, etc. We treated all of these interactions equivalently (as edges), and for the purpose of our analysis we used only the first instance of a user-user interaction as an edge.
3) Enron Email: This communication network covers all
the email communication within a dataset of around
half million emails. This data was originally made pub-
lic, and posted to the web, by the Federal Energy Regu-
latory Commission during its investigation. Nodes of
the network are email addresses, and edges represent
(undirected) email exchanges between pairs of users.
We partitioned this data set into 45 time steps based on
the dates of the emails.
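The SX-stackoverflow preprocessing described above (keeping only the first instance of each user-user interaction, with direction ignored) could be sketched as follows; the function name and triple layout are illustrative.

```python
def first_interactions(temporal_edges):
    # temporal_edges: iterable of (timestamp, u, v) triples
    seen, out = set(), []
    for t, u, v in sorted(temporal_edges):     # process in time order
        key = (min(u, v), max(u, v))           # undirected user pair
        if key not in seen:
            seen.add(key)
            out.append((t, u, v))
    return out
```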
For testing purposes, we constructed three types of networks using the above network inputs, as shown in Table I:
Networks whose labels end with '+' are incrementally growing networks; i.e., at each time step, edges are only added. Note that these '+' networks are the same as the original inputs.
Networks whose labels end with '-' are incrementally shrinking networks; i.e., at each time step, edges are only deleted. We constructed each such sequence by simply reversing the time steps of the corresponding '+' network.
Finally, networks whose labels end with '+/-' contain a combination of additions and deletions at each time step. These networks were constructed by randomly selecting one edge (from the entire network) for deletion for every ten edges added in the given dataset, at each time step.
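The '+/-' construction just described might be sketched as follows; the one-deletion-per-ten-additions ratio is from the text, while the function name, batch layout, and fixed seed are illustrative.

```python
import random

def make_plus_minus(addition_batches, seed=0):
    # For each time step, delete one edge chosen uniformly at random from the
    # entire current network for every ten edges added in that step.
    rng = random.Random(seed)
    current, mixed = set(), []
    for adds in addition_batches:
        adds = set(adds)
        current |= adds
        n_del = len(adds) // 10
        dels = set(rng.sample(sorted(current), n_del)) if n_del else set()
        current -= dels
        mixed.append((adds, dels))
    return mixed
```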
Implementations Tested: In our experiments, we tested the
following implementations:
1) Static: This is a (static) community detection code run
from scratch, on the entire graph of a time step i (after
incorporation of all edge additions and removals).
Louvain [3] and SLM [7] are the two tools we used
for this purpose.
2) Baseline: This is a community detection code run incre-
mentally on the graph at each time step i. “Incremental”
TABLE I
INPUT NETWORK STATISTICS. ALL THE NETWORKS USED HAVE MULTIPLE TIME STEPS (SHOWN IN THE RIGHTMOST COLUMN), AND THE CUMULATIVE NUMBER
OF EDGES REPRESENTS PEAK NUMBER OF EDGES ACROSS ALL THOSE TIME STEPS. NETWORKS WHOSE LABELS END IN ‘+’ CORRESPOND TO GROWING
NETWORKS (I.E., EDGES BEING ADDED EACH TIME STEP), AND THOSE THAT END IN ‘-’ CORRESPOND TO SHRINKING NETWORKS. THE NETWORKS WHOSE
LABELS ARE DENOTED BY ‘+/-’ HAVE A COMBINATION OF EDGE ADDITIONS AND DELETIONS AT EACH TIME STEP
here implies that at the start of every time step $i$, we initialize the state of the communities to that at the end of the previous time step $i-1$ (for $i > 0$). For this purpose, we implemented our own incremental version of the Louvain tool, which we call dyn-base; and for SLM, we use the already available incremental version, DSLM [15].
3) ΔS: This is the baseline version modified to incorporate our Δ-screening step to identify the $R_t$ set for use within each time step. We refer to the two resulting implementations as dyn-ΔS (Louvain) and dyn-ΔS (SLM).
B. Runtime and Quality Evaluation of Δ-screening
Table II shows a summary of all the runtime and quality results, comparing the Δ-screening implementation (Louvain) against the static scheme, for all inputs tested. As can be observed, the speedups obtained by Δ-screening range from 1.13× to 38.62× over the static Louvain scheme and from 1.14× to 5.55× over the dynamic Louvain baseline. The smaller speedups belong to those networks with smaller sizes, or to those synthetic networks where the locations of the changes were randomly distributed. In contrast, we see higher speedups for larger real-world networks and for those synthetic networks where there is a stronger community structure ('hh' label) and changes tend to happen with more locality. These results are achieved with little to no impact on the quality of the results (shown in the modularity column). To further understand these results, we present a thorough evaluation of the runtime and quality results below.
Runtime Evaluation: First, we evaluate the impact of the Δ-screening technique on the clustering algorithm's performance and its filtering efficacy. Fig. 4 shows the runtimes for all clustering iterations within all the time steps, for the different schemes on the different inputs tested.
As can be observed, dyn-ΔS, which uses Δ-screening, achieves the best performance, with the least runtime per iteration compared to both the static and baseline schemes. In fact, for some iterations, dyn-ΔS is more than 5.74× faster than the static scheme, and 3.4× faster than the baseline implementation.
The runtime savings are a direct result of the reductions in the number of vertices selected for processing by Δ-screening (i.e., the Rt set size). Fig. 5 shows the Rt set sizes (as a percentage of the respective total number of vertices) selected for processing at each time step for the different inputs. Note that the location of the changes to the graph has a direct bearing on the Rt selected, and this accounts for the fluctuation in the Rt sizes. For instance, for the input sx-stackoverflow+/-, the size of the changes increases linearly, but as we can see from Fig. 5, for time step t2 the percentage of vertices eligible for re-evaluation is low (around 25%), while for t8 it is higher (around 60%). Upon further examination, we found that this variability arises because in t2 the fraction of edge additions falling within communities is 72% and the fraction of edge removals happening between two communities is 67%. In comparison, these two corresponding percentages for time step t8 are 56% and 49%, respectively. Therefore, for t2, most of the changes do not add new work and hence generate more savings, compared to time step t8.
On average, for the smaller inputs, savings of approximately 10% to 20% in the number of vertices are achieved at any time step. Notably, for sx-stackoverflow, which is also the largest network tested, the savings are significantly larger, ranging from 80% (at time step t1) to 38% (at time step t9).
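The construction of the Rt set follows the marking rules of Δ-screening described earlier in the paper: for a batch of edge additions sharing a source i, the vertex i, i's neighbors, and the community of the best-gain target j* are marked as affected; for a deletion (i, j) with both endpoints in the same community, i, j, i's neighbors, and i's community are marked. A minimal sketch (all names are ours; the modularity-gain computation that selects j* is omitted and replaced by a placeholder):

```python
def delta_screen(additions, deletions, neighbors, comm):
    """Mark vertices affected by an edge batch (sketch of Delta-screening).

    additions/deletions: lists of (i, j) undirected edges, sorted by i.
    neighbors: vertex -> set of neighbors (pre-update graph).
    comm: vertex -> community id.
    Returns the set Rt of vertices to re-evaluate.
    """
    affected = set()
    # Edge additions: for each source i, pick the single target j* with
    # the highest modularity gain (approximated here by the first
    # candidate, since the gain computation is omitted in this sketch),
    # then mark i, i's neighbors, and all of j*'s community.
    by_source = {}
    for i, j in additions:
        by_source.setdefault(i, []).append(j)
    for i, candidates in by_source.items():
        j_star = candidates[0]  # placeholder for argmax of modularity gain
        affected.add(i)
        affected.update(neighbors.get(i, ()))
        affected.update(v for v, c in comm.items() if c == comm[j_star])
    # Edge deletions matter only if i and j are in the same community:
    # mark i, j, i's neighbors, and all of i's community.
    for i, j in deletions:
        if comm[i] == comm[j]:
            affected.update((i, j))
            affected.update(neighbors.get(i, ()))
            affected.update(v for v, c in comm.items() if c == comm[i])
    return affected
```

Vertices outside the returned set keep their previous community assignment for the time step, which is the source of the savings reported above.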
TABLE II
SUMMARY OF THE RESULTS
ZARAYENEH AND KALYANARAMAN: DELTA-SCREENING: A FAST AND EFFICIENT TECHNIQUE TO UPDATE COMMUNITIES IN DYNAMIC GRAPHS 1623
Even though both classes of input graphs (synthetic and real-world) show linear growth rates in their respective sizes, for the synthetic inputs it is harder to benefit from Δ-screening because edges are connected almost randomly from new to existing vertices (by a randomized edge-sampling procedure described in [53]); therefore, the location of the changes spreads throughout the whole network, which is not ideal for Δ-screening. In contrast, in the real-world network sx-stackoverflow, we observed that the changes tend to happen in a more localized manner, giving a realistic opportunity to benefit from Δ-screening. This locality of changes tends to have a significant impact on the filtering efficacy of the Δ-screening technique. In fact, even between the two real-world networks, we observed a divergence: with the ArXiv HEP-TH input, the savings are relatively modest, compared to the large savings achieved for sx-stackoverflow.
Next, we evaluate the total runtime, including the time taken to execute all levels. Note that in multi-level codes, the number of iterations per level may vary across the different implementations. Fig. 6 shows the runtime as a function of the time steps, for the different inputs tested.
In all the charts, static shows the runtime for the static Louvain algorithm; add_base and add_ΔS show the runtimes for the baseline and ΔS implementations for edge additions (i.e., on the '+' inputs); and del_base and del_ΔS show the runtimes for the baseline and ΔS implementations for edge deletions (i.e., on the '-' inputs).
We find that in all cases both the baseline and ΔS implementations consistently outperform the respective static implementation, providing improvements of more than two orders of magnitude in some cases. Between the baseline and ΔS implementations, the
Fig. 4. The average runtime per iteration, for all levels across all time steps, for three representative inputs: 5Mll+/-, 5Mhh+/-, and sx-stackoverflow+/-. The average is given by the mean time to execute an iteration within each level of a time step. Note that the variance of runtimes for the iterations of a given level is expected to be small, since the number of vertices processed per iteration remains the same through the level. All runs reported are from the Louvain-based implementations.
Fig. 5. The fraction of vertices processed at every iteration, given by |Rt|/|Vt|. A lower percentage corresponds to a larger savings in performance.
difference varies based on the input. For the synthetic inputs, both versions perform comparably, with a slight advantage to the ΔS implementation in some time steps. As discussed earlier, this can be attributed to the random nature of the changes induced in the synthetic inputs. For the real-world inputs, ΔS significantly outperforms the baseline, yielding over 4× speedup at some time steps (e.g., input: sx-stackoverflow+; time steps t6, t8).
Fig. 7 shows the runtime performance of the unified framework (i.e., to handle a combination of additions and deletions; Algorithm 4), as a function of the time steps. Here too, we observe a similar trend, with the ΔS scheme clearly outperforming static and performing comparably to or outperforming the baseline.
Quality Evaluation: We also evaluated the quality, as measured by the modularity of the output clustering [5], achieved by the different clustering schemes. Figs. S2 and S3 (in the Supplementary section) show the results of our qualitative evaluation. In nearly all cases, we observed no difference in the output modularity across the three schemes, implying that the performance gains from Δ-screening do not have any negative impact on the quality.
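For reference, the modularity metric [5] used in this evaluation scores a partition by the fraction of intra-community edges relative to what a random graph with the same degrees would contain. A self-contained sketch of the computation (function and variable names are ours):

```python
def modularity(edges, comm):
    """Newman modularity Q of a partition of an undirected graph.

    edges: list of (u, v) pairs; comm: vertex -> community id.
    Q = sum over communities c of (e_c - a_c^2), where e_c is the
    fraction of edges inside c and a_c is the fraction of edge
    endpoints attached to c.
    """
    m = len(edges)
    deg = {}
    for u, v in edges:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    e = {}  # community -> fraction of intra-community edges
    a = {}  # community -> fraction of edge endpoints (degree share)
    for u, v in edges:
        if comm[u] == comm[v]:
            e[comm[u]] = e.get(comm[u], 0.0) + 1.0 / m
    for u, d in deg.items():
        a[comm[u]] = a.get(comm[u], 0.0) + d / (2.0 * m)
    return sum(e.get(c, 0.0) - a[c] ** 2 for c in a)
```

For two disjoint triangles placed in separate communities this yields Q = 0.5, the maximum for that graph; identical Q values across schemes are what "no difference in output modularity" means above.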
C. Comparison With Other Tools
Louvain vs. SLM using Δ-screening: In order to demonstrate the generic applicability of the Δ-screening scheme for modularity-optimizing frameworks, we also incorporated Δ-screening into the SLM algorithm [7]. Fig. 8 shows a comparison between the runtimes for the Louvain implementations and the corresponding SLM implementations, for the ArXiv HEP-TH network input. First, we note that the gap between the Δ-screening and static implementations is significantly larger for SLM than for Louvain on this input. Second, note that the timings for SLM are about an order of magnitude larger than the timings for Louvain on this input. These results collectively demonstrate that the Δ-screening scheme has an even larger impact on generating savings for SLM than for Louvain. We generally observed higher modularity values in the output for SLM compared to Louvain.
Δ-screening vs. DynaMo vs. Batch: We performed a comparative evaluation of Δ-screening against two other state-of-the-art methods, namely the DynaMo [39] and Batch [14] algorithms. Both provide incremental community update implementations. The results are shown in Table III. As can be observed, Δ-screening outperforms these other two methods in runtime and modularity of output, with DynaMo coming in a close second on runtime and modularity. The memory footprints of the other two tools were significantly larger than that of Δ-screening. For the larger input (sx-stackoverflow), only Δ-screening was able to complete using the available memory (64 GB).
Fig. 6. Parts (a), (b), (c), and (d) show the runtime for each time step, for the different growing inputs (’+’) and shrinking inputs (’-’).
Fig. 7. Plots (a), (b), (c), and (d) show the runtime for each time step, for the network inputs which have both edge additions and deletions (’+/-’).
Fig. 8. Runtime comparison of the Δ-screening scheme against the baseline and static schemes, for both the Louvain and SLM implementations. For instance, part (b) shows SLM as the static baseline (shown in black), dynamic SLM (DSLM) as the dynamic baseline (shown in red), and the Δ-screening version shown in blue.
TABLE III
COMPARATIVE EVALUATION OF RUNTIME AND MODULARITY FOR Δ-SCREENING, DYNAMO, AND BATCH ALGORITHMS. AN ENTRY ‘-’ IMPLIES THE RUN DID NOT COMPLETE BECAUSE OF EXCEEDING MEMORY CAPACITY
D. Effect of Varying the Temporal Resolution
In many real-world dynamic graph use-cases, even though the input graph is available as a temporal stream, the appropriate temporal scale at which to analyze it is not known a priori. In fact, this scale is a property of the input that a domain expert expects to discover through the process of computing the dynamic communities. In order to facilitate such a study through dynamic community detection, in this section we study the effect of varying the temporal resolution, as defined by the number of time steps used to partition a graph stream, on the output clustering.
More specifically, using the sx-stackoverflow input, we first generated multiple temporal datasets, each representing the input stream divided into a certain number of time steps, ranging over {2, 4, 8, 12, 16, 20, 24, 28} steps. Note that in this scheme, there are multiple nested hierarchies; for instance, the 16-time-step partitioning can be achieved by splitting each of the 8-time-step partitions into two. Subsequently, we ran our ΔS-enabled incremental clustering method on the different temporal input datasets (we used dyn-ΔS for Louvain in this analysis).
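The temporal binning behind these datasets can be sketched as splitting a time-sorted edge stream into k equal-width bins, so that doubling k roughly halves each bin and yields the nested hierarchy noted above (an illustrative fragment; function and variable names are ours):

```python
def bin_stream(edges, k):
    """Partition a time-sorted edge stream into k temporal bins.

    edges: list of (u, v, timestamp) tuples, sorted by timestamp.
    Returns k batches of (u, v) edges, one batch per time step.
    """
    t0, t1 = edges[0][2], edges[-1][2]
    width = (t1 - t0) / k or 1  # guard against a zero-length span
    bins = [[] for _ in range(k)]
    for u, v, ts in edges:
        # Clamp the last timestamp into the final bin.
        idx = min(int((ts - t0) / width), k - 1)
        bins[idx].append((u, v))
    return bins
```

Each batch then serves as the change set for one time step of the incremental clustering run.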
Fig. 9 shows the results of this temporal resolution study. More specifically, Fig. 9(a) shows, for the sx-stackoverflow+ input, the change in average modularity as we increase the temporal resolution from coarser to finer (left to right along the x-axis). Fig. 9(c) shows the same for the sx-stackoverflow+/- input. We observe that the modularity values decline gradually until around 16 time steps, after which the decline starts to accelerate. The decline in modularity suggests that the community-based structure of the underlying network (at different scales) starts to weaken as we increase the temporal resolution. This is to be expected, as the temporally binned graphs only become sparser with increasing resolution. Notably, the more rapid slide that appears beyond the 16-time-step resolution suggests that the community structure starts to deteriorate past that resolution for this input.
Interestingly, this property is better captured by Figs. 9(b) and 9(d), which show the % savings in total runtime achieved by our Δ-screening filtering scheme. Intuitively, when the % savings remains approximately steady (highlighted by the plateau region from the resolution of 4 time steps to 16 time steps), it means that the nature of the evolution of the graphs within those resolutions is also relatively consistent. However, a steeper decline (on either end of the plateau) suggests that at those temporal resolution scales the temporally partitioned graphs become either too sparse (right) or too dense (left).
Fig. 9. Plots showing the effect of varying the temporal resolution—as measured by the number of temporal bins (i.e., time steps) used to partition the input
graph stream. The resolution of partitioning changes from coarser to finer, from left to right on the x-axis.
Note that to be able to conduct these kinds of temporal resolution studies, an application scientist should be able to run the dynamic community detection algorithm multiple times under different resolution configurations. This is one of the motivations for a fast implementation in practice.
V. CONCLUSION
Conducting community detection-based analysis on large dynamic networks is a time-consuming step in many discovery pipelines. In this paper, we visit a sub-problem in this context: that of identifying the vertices that are likely to be impacted by a new batch of changes. We presented a generic technique called Δ-screening that examines and selects provably “essential” vertices for evaluation during any given time step, based on the loci of the changes.
Subsequently, we incorporated this technique into two widely used community detection tools (Louvain and SLM). Our experiments demonstrated speedups of up to 5× and 38×, respectively, over state-of-the-art dynamic and static baseline implementations, without impacting the quality of the output. These experiments were conducted on a collection of synthetic and real-world inputs. Our tool is also more memory efficient, demonstrating the ability to run on a larger input (sx-stackoverflow; 60 M) where other implementations run out of memory. Our Δ-screening scheme is generic and can be incorporated into any modularity-optimizing community detection implementation. We also used the Δ-screening implementation to identify appropriate temporal resolutions at which to observe community structure within the dynamic network. Future research directions include parallelization on multicore platforms, and application to large-scale networks toward domain-specific analysis of dynamic communities. The current implementation of the algorithm supports only undirected graphs; extending the algorithm to the directed setting is also part of our future plan.
ACKNOWLEDGMENT
We thank Dr. Aurora Clark for providing us valuable input on
potential scientific use-cases to motivate method development.
REFERENCES
[1] S. Fortunato, “Community detection in graphs,” Phys. Rep., vol. 486,
no. 3-5, pp. 75–174, 2010.
[2] U. Brandes et al., “On modularity clustering,” IEEE Trans. Knowl. Data
Eng., vol. 20, no. 2, pp. 172–188, Feb. 2008.
[3] V. D. Blondel, J.-L. Guillaume, R. Lambiotte, and E. Lefebvre, “Fast unfolding of communities in large networks,” J. Statist. Mech. Theory Experiment, vol. 2008, no. 10, 2008, Art. no. P10008.
[4] P. De Meo, E. Ferrara, G. Fiumara, and A. Provetti, “Generalized lou-
vain method for community detection in large networks,” in Proc. 11th
Int. Conf. Intell. Syst. Des. Appl., 2011, pp. 88–93.
[5] M. E. Newman, “Fast algorithm for detecting community structure in
networks,” Phys. Rev. E, vol. 69, no. 6, 2004, Art. no. 066133.
[6] A. Clauset, M. E. Newman, and C. Moore, “Finding community structure in
very large networks,” Phys. Rev. E, vol. 70, no. 6, 2004, Art. no. 066111.
[7] L. Waltman and N. J. Van Eck, “A smart local moving algorithm for
large-scale modularity-based community detection,” Eur. Phys. J. B,
vol. 86, no. 11, pp. 1–14, 2013.
[8] J. Xie, B. K. Szymanski, and X. Liu, “SLPA: Uncovering overlapping
communities in social networks via a speaker-listener interaction
dynamic process,” in Proc. IEEE 11th Int. Conf. Data Mining Work-
shops, 2011, pp. 344–349.
[9] M. Rosvall and C. T. Bergstrom, “Maps of information flow reveal
community structure in complex networks,” Proc. Nat. Acad. Sci. USA,
vol. 105, pp. 1118–1123, 2008.
[10] A. Kalyanaraman et al., “Fast uncovering of graph communities on a
chip: Toward scalable community detection on multicore and manycore
platforms,” Foundations Trends Electron. Des. Automat., vol. 10, no. 3,
pp. 145–247, 2016.
[11] R. Cazabet and F. Amblard, “Dynamic community detection,” Encyclo-
pedia Social Netw. Anal. Mining, Springer, pp. 404–414, 2014.
[12] F. Liu, J. Wu, S. Xue, C. Zhou, J. Yang, and Q. Sheng, “Detecting the
evolving community structure in dynamic social networks,” World Wide
Web, vol. 23, no. 2, pp. 715–733, 2020.
[13] P. Holme and J. Saramäki, “Temporal networks,” Phys. Rep., vol. 519, no. 3, pp. 97–125, 2012.
[14] W. H. Chong and L. N. Teow, “An incremental batch technique for
community detection,” in Proc. 16th Int. Conf. Inf. Fusion, 2013,
pp. 750–757.
[15] R. Aktunc, I. H. Toroslu, M. Ozer, and H. Davulcu, “A dynamic modu-
larity based community detection algorithm for large-scale networks:
DSLM,” in Proc. IEEE/ACM Int. Conf. Adv. Social Netw. Anal. Mining,
2015, pp. 1177–1183.
[16] D. Greene, D. Doyle, and P. Cunningham, “Tracking the evolution of
communities in dynamic social networks,” in Proc. Int. Conf. Adv.
Social Netw. Anal. Mining, 2010, pp. 176–183.
[17] A. Zakrzewska and D. A. Bader, “A dynamic algorithm for local com-
munity detection in graphs,” in Proc. IEEE/ACM Int. Conf. Adv. Social
Netw. Anal. Mining, 2015, pp. 559–564.
[18] P. Agarwal, R. Verma, A. Agarwal, and T. Chakraborty, “Dyperm: Max-
imizing permanence for dynamic community detection,” in Proc.
Pacific-Asia Conf. Knowl. Discov. Data Mining. Berlin, Germany:
Springer, 2018, pp. 437–449.
[19] J. Hopcroft, O. Khan, B. Kulis, and B. Selman, “Tracking evolving com-
munities in large linked networks,” Proc. Nat. Acad. Sci. USA, vol. 101,
no. suppl 1, pp. 5249–5253, 2004.
[20] A. Cuzzocrea, F. Folino, and C. Pizzuti, “DynamicNet: An effective and
efficient algorithm for supporting community evolution detection in
time-evolving information networks,” in Proc. 17th Int. Database Eng.
Appl. Symp., 2013, pp. 148–153.
[21] P. Maillard, R. Görke, C. Staudt, and D. Wagner, “Modularity-driven clustering of dynamic graphs,” in Proc. Exp. Algorithms, 9th Int. Symp., SEA 2010. Berlin, Germany: Springer, 2010, pp. 436–448.
[22] G. Rossetti and R. Cazabet, “Community discovery in dynamic net-
works: A survey,” ACM Comput. Surv., vol. 51, no. 2, pp. 1–37,
2018.
[23] M. Takaffoli, “Community evolution in dynamic social networks-chal-
lenges and problems,” in Proc. IEEE 11th Int. Conf. Data Mining Work-
shops, 2011, pp. 1211–1214.
[24] S. Asur, S. Parthasarathy, and D. Ucar, “An event-based framework for
characterizing the evolutionary behavior of interaction graphs,” ACM
Trans. Knowl. Discov. From Data, vol. 3, no. 4, pp. 1–36, 2009.
[25] J. He, D. Chen, C. Sun, Y. Fu, and W. Li, “Efficient stepwise detection
of communities in temporal networks,” Phys. Stat. Mech. Appl.,
vol. 469, pp. 438–446, 2017.
[26] M. Seifikar, S. Farzi, and M. Barati, “C-blondel: An efficient louvain-
based dynamic community detection algorithm,” IEEE Trans. Comput.
Social Syst., vol. 7, no. 2, pp. 308–318, Apr. 2020.
[27] G. Palla, A.-L. Barabási, and T. Vicsek, “Quantifying social group evolution,” Nature, vol. 446, no. 7136, pp. 664–667, 2007.
[28] I. Derényi, G. Palla, and T. Vicsek, “Clique percolation in random networks,” Phys. Rev. Lett., vol. 94, no. 16, 2005, Art. no. 160202.
[29] J. Xie, M. Chen, and B. K. Szymanski, “LabelRankT: Incremental com-
munity detection in dynamic networks via label propagation,” in Proc.
Workshop Dyn. Netw. Manage. Mining, 2013, pp. 25–32.
[30] M. Seifi and J.-L. Guillaume, “Community cores in evolving networks,”
in Proc. 21st Int. Conf. World Wide Web, 2012, pp. 1173–1180.
[31] Y.-R. Lin, Y. Chi, S. Zhu, H. Sundaram, and B. L. Tseng,
“Facetnet: A framework for analyzing communities and their evolu-
tions in dynamic networks,” in Proc. 17th Int. Conf. World Wide
Web, 2008, pp. 685–694.
[32] S. Fortunato and M. Barthelemy, “Resolution limit in community
detection,” Proc. Nat. Acad. Sci. USA, vol. 104, no. 1, pp. 36–41,
2007.
[33] A. Lancichinetti and S. Fortunato, “Limits of modularity maximization in community detection,” Phys. Rev. E, vol. 84, no. 6, 2011, Art. no. 066122.
[34] T. Chakraborty, S. Srinivasan, N. Ganguly, A. Mukherjee, and
S. Bhowmick, “On the permanence of vertices in network communities,”
in Proc. 20th ACM SIGKDD Int. Conf. Knowl. Discov. Data Mining,
2014, pp. 1396–1405.
[35] X. Zeng, W. Wang, C. Chen, and G. G. Yen, “A consensus community-
based particle swarm optimization for dynamic community detection,”
IEEE Trans. Cybern., vol. 50, no. 6, pp. 2502–2513, Jun. 2020.
[36] I. Messaoudi and N. Kamel, “A multi-objective bat algorithm for com-
munity detection on dynamic social networks,” Appl. Intell., vol. 49,
no. 6, pp. 2119–2136, 2019.
[37] Z. Zhao, C. Li, X. Zhang, F. Chiclana, and E. H. Viedma, “An incremen-
tal method to detect communities in dynamic evolving social networks,”
Knowl.-Based Syst., vol. 163, pp. 404–415, 2019.
[38] L. Wu, Q. Zhang, K. Guo, E. Chen, and C. Xu, “Dynamic community
detection method based on an improved evolutionary matrix,” Concur-
rency Computat.: Pract. Exper., vol. 33, p. e5314, 2021, doi: 10.1002/
cpe.5314.
[39] D. Zhuang, M. J. Chang, and M. Li, “Dynamo: Dynamic community
detection by incrementally maximizing modularity,” IEEE Trans.
Knowl. Data Eng., vol. 33, no. 5, pp. 1934–1945, May 2021.
[40] C. Tantipathananandh, T. Berger-Wolf, and D. Kempe, “A frame-
work for community identification in dynamic social networks,” in
Proc. 13th ACM SIGKDD Int. Conf. Knowl. Discov. Data Mining,
2007, pp. 717–726.
[41] D. S. Bassett, M. A. Porter, N. F. Wymbs, S. T. Grafton, J. M. Carlson,
and P. J. Mucha, “Robust detection of dynamic community structure in
networks,” Chaos: Interdiscipl. J. Nonlinear Sci., vol. 23, no. 1, 2013,
Art. no. 013142.
[42] C. Matias and V. Miele, “Statistical clustering of temporal networks
through a dynamic stochastic block model,” 2015, arXiv:1506.07464.
[43] C. Matias, T. Rebafka, and F. Villers, “A semiparametric extension of
the stochastic block model for longitudinal networks,” Biometrika,
vol. 105, no. 3, pp. 665–680, 2018.
[44] K. S. Xu and A. O. Hero, “Dynamic stochastic blockmodels for time-
evolving social networks,” IEEE J. Sel. Topics Signal Process., vol. 8,
no. 4, pp. 552–562, Aug. 2014.
[45] T. Viard, M. Latapy, and C. Magnien, “Computing maximal cliques in
link streams,” Theor. Comput. Sci., vol. 609, pp. 245–252, 2016.
[46] A.-S. Himmel, H. Molter, R. Niedermeier, and M. Sorge, “Enumerating
maximal cliques in temporal graphs,” in Proc. IEEE/ACM Int. Conf.
Adv. Social Netw. Anal. Mining, 2016, pp. 337–344.
[47] Z. Bu, H.-J. Li, C. Zhang, J. Cao, A. Li, and Y. Shi, “Graph k-means
based on leader identification, dynamic game, and opinion dynamics,”
IEEE Trans. Knowl. Data Eng., vol. 32, no. 7, pp. 1348–1361, Jul. 2020.
[48] N. Zarayeneh and A. Kalyanaraman, “A fast and efficient incremental
approach toward dynamic community detection,” in Proc. 2019 IEEE/
ACM Int. Conf. Adv. Social Netw. Anal. Mining, 2019, pp. 9–16.
[49] H. Lu, M. Halappanavar, and A. Kalyanaraman, “Parallel heuristics for scal-
able community detection,” Parallel Comput., vol. 47, pp. 19–37, 2015.
[50] M. Girvan and M. E. Newman, “Community structure in social and bio-
logical networks,” Proc. Nat. Acad. Sci. USA, vol. 99, no. 12, pp. 7821–
7826, 2002.
[51] S. Gregory, “A fast algorithm to find overlapping communities in
networks,” in Proc. Joint Eur. Conf. Mach. Learn. Knowl. Discov. Data-
bases. Berlin, Germany: Springer, 2008, pp. 408–423.
[52] G. Karypis and V. Kumar, “A fast and high quality multilevel scheme
for partitioning irregular graphs,” SIAM J. Sci. Comput., vol. 20, no. 1,
pp. 359–392, 1998.
[53] E. Kao et al., “Streaming graph challenge: Stochastic block partition,” in
Proc. IEEE High Perform. Extreme Comput. Conf., 2017, pp. 1–12.
[54] J. Leskovec and A. Krevl, “SNAP Datasets: Stanford large network data-
set collection,” 2014. [Online]. Available: http://snap.stanford.edu/data
Neda Zarayeneh (Member, IEEE) received the B.S.
degree in applied mathematics from Razi University,
Kermanshah, Iran, the M.S. degree in applied mathe-
matics from the University of Tehran, Tehran, Iran,
and the M.S. degree in computer science from Texas A&M University-Commerce, Commerce, TX, USA.
She is currently working toward the Ph.D. degree in
computer science with Washington State University,
Pullman, WA, USA. Her research interests include
graph algorithms, network and data science, and
computational sciences.
Ananth Kalyanaraman (Senior Member, IEEE)
received the bachelor’s degree from the Visvesvaraya
National Institute of Technology, Nagpur, India, in
1998, and the M.S. and Ph.D. degrees from Iowa
State University, Ames, IA, USA, in 2002 and 2006,
respectively. He is currently a Professor and the Boe-
ing Centennial Chair of computer science with the
School of Electrical Engineering and Computer Sci-
ence, Washington State University, Pullman, WA,
USA. He also holds a joint appointment with Pacific
Northwest National Laboratory, Richland, WA,
USA. His research focuses on developing parallel algorithms and software for
data-intensive problems originating in the areas of computational biology and
graph-theoretic applications. He is on the Editorial Boards of IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS and the Journal of Parallel and Distributed Computing. He is a Member of ACM and SIAM. He was the recipient of the DOE Early Career Award, the Early Career Impact Award, and two best paper awards.