Scalable community detection with the Louvain algorithm
Navid Sedighpour, AmirKabir University of Technology, July 2016

Community detection in graphs

Quick introduction to community detection.
Structural properties of real world networks, definition of "communities", fundamental techniques and evaluation measures.

06 Community Detection

Community detection algorithms are used to identify densely connected groups of nodes in networks. Modularity optimization is commonly used, which detects communities as groups of nodes with more connections within groups than expected by chance. Parameters like resolution affect results. Multilayer networks model systems with multiple network layers over nodes. Multilayer modularity generalizes modularity to multilayer networks. Community detection in multilayer networks provides insights into structures across data types and applications.
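
As a rough illustration of "more connections within groups than expected by chance", the sketch below computes Newman's modularity Q for a partition of a small undirected, unweighted graph without self-loops. The function name `modularity`, the toy edge list, and the community labels are assumptions made for this example; they do not come from the slides.

```python
from collections import defaultdict

def modularity(edges, community_of):
    """Modularity Q of a partition of an undirected, unweighted graph (no self-loops)."""
    m = len(edges)
    internal = defaultdict(int)    # edges with both endpoints inside community c
    degree_sum = defaultdict(int)  # summed degree of the nodes in community c
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] += 1
        degree_sum[cv] += 1
        if cu == cv:
            internal[cu] += 1
    # observed fraction of internal edges minus the fraction expected by chance
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2 for c in degree_sum)

# Hypothetical toy graph: two triangles joined by a single bridge edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_groups = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_group = {node: "all" for node in range(6)}

q_split = modularity(edges, two_groups)   # 5/14, about 0.357
q_single = modularity(edges, one_group)   # 0.0
```

Splitting the graph at the bridge scores higher than lumping every node into one community, which is the signal that modularity-based methods such as Louvain try to maximize.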

Community detection algorithms

This presentation covers artificial intelligence algorithms and looks for good methods to detect communities.

Community Detection

Community detection from research papers (AAN dataset) using the algorithms:
K-Means
Louvain
Newman-Girvan
github link to code: https://goo.gl/CXej44
github link to project web page: http://goo.gl/7OOkhI
youtube link to video: https://goo.gl/SCpamf
dropbox link to ppt report video: https://goo.gl/cgACzU

Community Detection in Social Networks: A Brief Overview

The document provides an overview of community detection in social networks. It notes that networks arise wherever actors interact. It motivates community detection by explaining that communities are groups of nodes that likely share properties and roles, with applications such as improving recommendation systems and parallel computing. It justifies the existence of communities in real networks through homophily, the tendency of similar actors to connect. It then discusses different approaches to detecting communities, including the Girvan-Newman algorithm, based on edge betweenness, and the Louvain method, which uses greedy modularity optimization.

Community Detection in Social Media

This document discusses community detection in social media and online networks. It defines communities as groups of densely interconnected nodes in a graph. It outlines various algorithms for detecting communities, including graph partitioning, k-clique detection, core decomposition, divisive algorithms based on edge centrality, and modularity maximization approaches. It also discusses local community detection methods and evaluation of community detection results.

Social Network Analysis

This document provides an overview of social network analysis (SNA) including concepts, methods, and applications. It begins with background on how SNA originated from social science and network analysis/graph theory. Key concepts discussed include representing social networks as graphs, identifying strong and weak ties, central nodes, and network cohesion. Practical applications of SNA are also outlined, such as in business, law enforcement, and social media sites. The document concludes by recommending when and why to use SNA.

Community detection in social networks

University of Porto, Faculty of Engineering, 2014-09-23.
Seminar on Intelligent Systems, Interaction and Multimedia.

Introduction to Complex Networks

This document introduces complex networks and the R programming language. It defines what networks are, provides examples of complex networks, and discusses key metrics used to characterize complex networks such as degree distribution, diameter, and clustering coefficient. It then introduces R as a programming language for statistical computing and graphics. The document discusses installing R packages and features of R including being open source and operating system independent. It also introduces the iGraph package for manipulating graphs in R and provides examples of creating and analyzing a random graph.

Community Detection with Networkx

Clustering Methods and Community Detection with NetworkX. A slide deck for the NTU Complexity Science Winter School.
For the accompanying iPython Notebook, visit: http://github.com/eflegara/NetStruc

Group and Community Detection in Social Networks

1. Basics of Social Networks
2. Real-world problem
3. How to construct a graph from a real-world problem?
4. What graph theory problem do we get from the real-world problem?
5. Graph types of social networks
6. Special properties of social graphs
7. How to find communities and groups in social networks? (Algorithms)
8. How to interpret the graph solution back in the real-world problem?

Introduction to Social Network Analysis

Social network analysis [SNA] is the mapping and measuring of relationships and flows between people, groups, organizations, computers, URLs, and other connected information/knowledge entities. SNA provides both a visual and a mathematical analysis of human relationships.

Homophily and influence in social networks

Brief tutorial on Influence and Homophily in social networks. Key concepts. How to distinguish influence from correlation. Information diffusion processes. Influence Maximization Problem and viral marketing.

Clustering

This document discusses different types of clustering techniques used in machine learning. It describes clustering as an unsupervised learning method that groups similar data points together. The main types covered are hard clustering, soft clustering, density-based clustering, hierarchical clustering, and partitioning-based clustering like k-means. K-means and hierarchical clustering are explained in more detail, including examples of how they work. The two hierarchical approaches, agglomerative and divisive, are also defined.

Vanishing & Exploding Gradients

Vanishing gradients occur when error gradients become very small during backpropagation, hindering convergence. This can happen with saturating activation functions like sigmoid and tanh: the sigmoid derivative lies between 0 and 0.25, and the tanh derivative lies between 0 and 1 but approaches 0 for large inputs. Earlier layers are affected more because their gradients contain more multiplicative terms. Using ReLU activations helps, as their derivative is 1 for positive values. Initializing weights properly also helps prevent vanishing gradients. Exploding gradients occur when error gradients become very large, disrupting learning. They can be addressed through lower learning rates, gradient clipping, and gradient scaling.
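
To make the "more multiplicative terms" point concrete, here is a minimal sketch that multiplies the local activation derivatives along a single path through a stack of layers. Weights are deliberately ignored, and the helper name `chained_gradient` and the depth of 10 are assumptions chosen for illustration. The sigmoid derivative peaks at 0.25 (at input 0), while the tanh derivative peaks at 1.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1.0 - s)

def tanh_grad(x):
    return 1.0 - math.tanh(x) ** 2

def chained_gradient(local_grad, pre_activations):
    """Product of local derivatives along one path (weights ignored in this sketch)."""
    product = 1.0
    for z in pre_activations:
        product *= local_grad(z)
    return product

depth = 10  # hypothetical number of layers
# Even in the most favourable case (every pre-activation at 0), the sigmoid
# path shrinks as 0.25 ** depth.
best_case_sigmoid = chained_gradient(sigmoid_grad, [0.0] * depth)
# Saturated tanh units (pre-activation 3.0) shrink the product even faster.
saturated_tanh = chained_gradient(tanh_grad, [3.0] * depth)
```

With ten layers the best-case sigmoid product is already below one millionth, which is why the earliest layers receive almost no learning signal.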

Presentation on K-Means Clustering

This presentation introduces clustering analysis and the k-means clustering technique. It defines clustering as an unsupervised method to segment data into groups with similar traits. The presentation outlines different clustering types (hard vs soft), techniques (partitioning, hierarchical, etc.), and describes the k-means algorithm in detail through multiple steps. It discusses requirements for clustering, provides examples of applications, and reviews advantages and disadvantages of k-means clustering.
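
The k-means steps mentioned above can be sketched as follows. This is a minimal version of Lloyd's algorithm on one-dimensional data with caller-supplied starting centroids; the function name `kmeans_1d`, the sample points, and the initial centroids are assumptions for illustration, not taken from the presentation.

```python
def kmeans_1d(points, centroids, max_iter=100):
    """Lloyd's algorithm on 1-D data, starting from the given centroids."""
    centroids = list(centroids)
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        updated = [sum(c) / len(c) if c else centroids[i] for i, c in enumerate(clusters)]
        if updated == centroids:  # converged: no centroid moved
            break
        centroids = updated
    return centroids, clusters

# Hypothetical data with two obvious groups.
final_centroids, final_clusters = kmeans_1d([1, 2, 3, 10, 11, 12], centroids=[1, 12])
# final_centroids is [2.0, 11.0]; final_clusters is [[1, 2, 3], [10, 11, 12]]
```

Real implementations usually pick the initial centroids at random (or with k-means++), so results can differ between runs; fixing them here keeps the example deterministic.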

Hierarchical Clustering

Part of the course "Algorithmic Methods of Data Science". Sapienza University of Rome, 2015.
http://aris.me/index.php/data-mining-ds-2015

Generative adversarial text to image synthesis

Slides by Víctor Garcia about the paper:
Reed, Scott, Zeynep Akata, Xinchen Yan, Lajanugen Logeswaran, Bernt Schiele, and Honglak Lee. "Generative adversarial text to image synthesis." ICML 2016.

Major issues in data mining

Major issues in data mining include developing methods to mine various types of knowledge from multidimensional and uncertain data, improving user interaction through query languages and visualization techniques, enhancing efficiency and scalability with parallel algorithms, adapting techniques for complex and dynamic data sources, and addressing privacy and social impact concerns.

B trees in Data Structure

B-Trees are tree data structures used to store data on disk storage. They allow for efficient retrieval of data compared to binary trees when using disk storage due to reduced height. B-Trees group data into nodes that can have multiple children, reducing the height needed compared to binary trees. Keys are inserted by adding to leaf nodes or splitting nodes and promoting middle keys. Deletion involves removing from leaf nodes, borrowing/promoting keys, or joining nodes.

Graph Based Clustering

The document discusses graph-based clustering methods. It describes how graphs can be used to represent real-world networks from domains like biology, technology, social networks, and economics. It introduces the idea of using minimal spanning trees and hierarchical clustering to identify clusters in graph data. Two common algorithms for finding minimal spanning trees are described: Prim's algorithm and Kruskal's algorithm. Different strategies for iteratively deleting branches from the minimal spanning tree are also summarized to form clusters, such as deleting the branch with the maximum weight or inconsistent branches based on a reference value.
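
One of the strategies described above, deleting the heaviest branches of the minimal spanning tree, can be sketched as below. The tree is built with Kruskal's algorithm using a small union-find; the function name `mst_clusters`, the integer node labels, and the sample edges are assumptions for illustration. The "inconsistent branch" strategy is not covered here.

```python
def mst_clusters(num_nodes, weighted_edges, k):
    """Split nodes 0..num_nodes-1 into k clusters by cutting the k-1 heaviest MST edges."""
    parent = list(range(num_nodes))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    # Kruskal: scan edges by increasing weight, keep those joining two components.
    mst = []
    for w, u, v in sorted(weighted_edges):
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            mst.append((w, u, v))

    # Drop the k-1 heaviest tree edges, then rebuild components from the rest.
    kept = sorted(mst)[: max(len(mst) - (k - 1), 0)]
    parent = list(range(num_nodes))
    for _, u, v in kept:
        parent[find(u)] = find(v)
    groups = {}
    for node in range(num_nodes):
        groups.setdefault(find(node), []).append(node)
    return sorted(groups.values())

# Hypothetical graph: two tight groups linked by heavy edges; edges are (weight, u, v).
edges = [(1, 0, 1), (2, 1, 2), (1, 3, 4), (2, 4, 5), (9, 2, 3), (10, 0, 5)]
two_clusters = mst_clusters(6, edges, k=2)  # [[0, 1, 2], [3, 4, 5]]
```

Cutting the single heaviest tree edge (weight 9) separates the two groups; when several edges tie for heaviest, which one is cut depends on the sort order.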

High Dimensional Data Visualization using t-SNE

Review of the t-SNE algorithm, which helps visualize high-dimensional data lying on a manifold by projecting it onto a 2D or 3D space while trying to preserve the distances between neighboring points.

Divide and Conquer - Part 1

Divide and Conquer Algorithms - D&C is a distinct algorithm design technique in computer science, in which a problem is solved by repeatedly invoking the algorithm on smaller instances of the same problem. Binary search, merge sort, and Euclid's algorithm can all be formulated as examples of divide and conquer algorithms. Strassen's algorithm and the Nearest Neighbor algorithm are two other examples.

Chapter 10 cluster basic : Han & Kamber

The document describes different types of clustering algorithms, including partitioning, hierarchical, density-based, and grid-based methods. Partitioning methods like k-means and k-medoids aim to partition objects into k clusters by optimizing an objective function. Hierarchical clustering builds a hierarchy of clusters based on distance, either through an agglomerative (bottom-up) or divisive (top-down) approach. Density-based methods identify clusters based on density rather than distance. Grid-based methods quantize the space into a finite number of cells that form a grid structure.

Social network analysis part ii

The document discusses concepts in social network analysis including measuring networks through embedding measures and positions/roles of nodes. It covers network measures such as reciprocity, transitivity, clustering, density, and the E-I index. It also discusses positions like structural equivalence and regular equivalence and how to compute positional similarity through adjacency matrices.

Statistics and Data Mining

This document discusses data preprocessing techniques for data mining. It explains that real-world data is often dirty, containing issues like missing values, noise, and inconsistencies. Major tasks in data preprocessing include data cleaning, integration, transformation, reduction, and discretization. Data cleaning techniques are especially important and involve filling in missing values, identifying and handling outliers, resolving inconsistencies, and reducing redundancy from data integration. Other techniques discussed include binning data for smoothing noisy values and handling missing data through various imputation methods.

Data Structures - Lecture 10 [Graphs]

Graphs are data structures consisting of nodes and edges connecting nodes. They can be directed or undirected. Trees are special types of graphs. Common graph algorithms include depth-first search (DFS) and breadth-first search (BFS). DFS prioritizes exploring nodes along each branch as deeply as possible before backtracking, using a stack. BFS explores all nodes at the current depth before moving to the next depth, using a queue.
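
The stack-versus-queue distinction can be sketched as follows, assuming the graph is stored as a dictionary of adjacency lists. The function names and the four-node sample graph are made up for this example.

```python
from collections import deque

def dfs(graph, start):
    """Depth-first traversal using an explicit stack."""
    visited, order, stack = set(), [], [start]
    while stack:
        node = stack.pop()  # take the most recently discovered node
        if node in visited:
            continue
        visited.add(node)
        order.append(node)
        # Push neighbours in reverse so the first listed neighbour is explored first.
        stack.extend(reversed(graph[node]))
    return order

def bfs(graph, start):
    """Breadth-first traversal using a FIFO queue."""
    visited, order, queue = {start}, [], deque([start])
    while queue:
        node = queue.popleft()  # take the oldest discovered node
        order.append(node)
        for nxt in graph[node]:
            if nxt not in visited:
                visited.add(nxt)
                queue.append(nxt)
    return order

# Hypothetical directed graph: A points to B and C, which both point to D.
graph = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
dfs_order = dfs(graph, "A")  # ['A', 'B', 'D', 'C']
bfs_order = bfs(graph, "A")  # ['A', 'B', 'C', 'D']
```

DFS reaches D through B before it ever looks at C, while BFS finishes both depth-1 nodes before visiting D.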

Clique

CLIQUE is a grid-based clustering algorithm that identifies dense units in subspaces of high-dimensional data to provide efficient clustering. It works by first partitioning each attribute dimension into equal intervals, which divides the data space into rectangular grid cells. It finds dense units in lower-dimensional subspaces such as planes and intersects them to identify dense units in higher dimensions. These dense units are grouped into clusters. CLIQUE scales linearly with the size of the data and the number of dimensions and automatically identifies relevant subspaces for clustering. However, clustering accuracy may be reduced as the price of the method's simplicity.

email transaction network analysis

Thesis defense presentation - SDSU Computational Sciences, 2013. Uses a number of network analysis tools, including community detection, PageRank, eigenvector centrality, etc., to determine key metrics of graphs built from keyword searches. These custom graphs and the associated metrics are then presented in interactive graphics and tables.

Graph Community Detection Algorithm for Distributed Memory Parallel Computing...

Community detection is an important problem that spans many research areas, such as social networks, systems biology, power grid optimization. The fine-grained communication and irregular access pattern to memory and interconnect limit the overall scalability and performance of existing algorithms. This talk presents a highly scalable parallel algorithm for distributed memory systems. The method employs a novel implementation strategy to store and process dynamic graphs. The scalability analysis of the algorithm was performed using two massively parallel machines, Blue Gene/Q and Power7-IH, for graphs with up to hundreds of billions of edges. Leveraging the convergence properties of the algorithm and the efficient implementation, it is possible to analyze communities of large-scale graphs in just a few seconds. The talk is based on a paper accepted for publication in IPDPS-2015 conference proceedings that was kindly provided by Dr. Fabrizio Petrini (IBM Research).

DIMACS10: Parallel Community Detection for Massive Graphs

The document discusses parallel community detection algorithms for massive graphs. It introduces the problem of community detection in large graphs and outlines challenges for parallel algorithms. It describes a traditional sequential agglomerative approach that is prone to quadratic runtime. The document then proposes a parallel agglomerative method using maximal matchings to merge communities in parallel while maintaining balance. Implementation details are provided for the data structures and basic routines for scoring, matching and contracting communities. Performance results show timings on different graphs using an Intel server and Cray XMT2 supercomputer.

Community Detection in Brain Networks

This document provides a project update on analyzing resting fMRI time series data to identify time-evolving brain communities and clusters between normal and patient groups. It discusses applying maximum overlap discrete wavelet transform to the 117x140 time series data matrix to generate a 117x117 correlation matrix. Different experimental conditions are proposed to evaluate clusters, including sampling frequency, correlation threshold, and centrality measures. Preliminary results show 7 small connected communities identified after sampling, making time-evolving clusters easier to assess than from a single large community.

Learning and comparing multi-subject models of brain functional connectivity

High-level brain function arises through functional interactions. These can be mapped via co-fluctuations in activity observed in functional imaging.
First, I show how spatial maps characteristic of on-going activity in a population of subjects can be learned using multi-subject decomposition models extending the popular Independent Component Analysis. These methods single out spatial atoms of brain activity: functional networks or brain regions. With a probabilistic model of inter-subject variability, they open the door to building data-driven atlases of on-going activity.
Subsequently, I discuss graphical modeling of the interactions between brain regions. To learn highly-resolved large scale individual graphical models, we use sparsity-inducing penalizations introducing a population prior that mitigates the data scarcity at the subject-level. The corresponding graphs capture the community structure of brain activity better than single-subject models or group averages.
Finally, I address the detection of connectivity differences between subjects. Explicit group variability models of the covariance structure can be used to build optimal edge-level test statistics. On stroke patients resting-state data, these models detect patient-specific functional connectivity perturbations.

Distributed computing environment

DCE is an architecture defined by OSF to provide a distributed computing platform. It includes services like RPC, directory service, security, time, and file service. DCE defines a framework for client-server communication and developing distributed applications across networked computers. It aims to address challenges of distributed computing like scalability, availability and security.

Webinar: RDBMS to Graphs

Relational databases were conceived to digitize paper forms and automate well-structured business processes, and still have their uses. But RDBMS cannot model or store data and its relationships without complexity, which means performance degrades with the increasing number and levels of data relationships and data size. Additionally, new types of data and data relationships require schema redesign that increases time to market.
A native graph database like Neo4j naturally stores, manages, analyzes, and uses data within the context of connections, meaning Neo4j provides faster query performance than SQL and vastly improved flexibility in handling complex hierarchies.
This webinar explains why companies are shifting away from RDBMS towards graphs to unlock the business value in their data relationships.

Composition of Clans for Solving Linear Systems on Parallel Architectures

This document discusses a method for solving linear systems of equations called composition of clans. It involves decomposing the system into subgroups of related equations called clans, solving each clan separately, and then combining the solutions. This approach can provide speed-up when implemented in parallel on multiple computing nodes. The document provides examples and discusses software called ParAd that implements parallel composition of clans. It concludes that speed-ups of up to 180 times have been achieved on problems with this approach.

Graph Analysis Beyond Linear Algebra

High-performance graph analysis is unlocking knowledge in computer security, bioinformatics, social networks, and many other data integration areas. Graphs provide a convenient abstraction for many data problems beyond linear algebra. Some problems map directly to linear algebra. Others, like community detection, look eerily similar to sparse linear algebra techniques. And then there are algorithms that strongly resist attempts at making them look like linear algebra. This talk will cover recent results with an emphasis on streaming graph problems where the graph changes and results need to be updated with minimal latency. We’ll also touch on issues of sensitivity and reliability where graph analysis needs to learn from numerical analysis and linear algebra.

Scalable Static and Dynamic Community Detection Using Grappolo : NOTES

A **community** (in a network) is a subset of nodes which are _strongly connected among themselves_, but _weakly connected to others_. Neither the number of output communities nor their size distribution is known a priori. Community detection methods can be divisive or agglomerative. **Divisive methods** use _betweenness centrality_ to **identify and remove bridges** between communities. **Agglomerative methods** greedily **merge two communities** that provide maximum gain in _modularity_. Newman and Girvan introduced the **modularity metric**. The problem of community detection is then reduced to the problem of modularity maximization, which is **NP-complete**. The **Louvain method** is a variant of the _agglomerative strategy_, in that it is a _multi-level heuristic_.
https://gist.github.com/wolfram77/917a1a4a429e89a0f2a1911cea56314d
In this paper, the authors discuss **four heuristics** for Community detection using the _Louvain algorithm_ implemented upon recently developed **Grappolo**, which is a parallel variant of the Louvain algorithm. They are:
- Vertex following and Minimum label
- Data caching
- Graph coloring
- Threshold scaling
With the **Vertex following** heuristic, the _input is preprocessed_ and all single-degree vertices are merged with their corresponding neighbours. This helps reduce the number of vertices considered in each iteration, and also helps initial seeds of communities to form. With the **Minimum label heuristic**, when a vertex is deciding which community to move to and multiple communities provide the same modularity gain, the community with the smallest id is chosen. This helps _minimize or prevent community swaps_. With the **Data caching** heuristic, community information is stored in a vector instead of a map, and is reused in each iteration, but with some additional cost. With the **Vertex ordering via Graph coloring** heuristic, _distance-k coloring_ of graphs is performed in order to group vertices into colors. Then, each set of vertices (by color) is processed _concurrently_, and synchronization is performed after that. This enables us to mimic the behaviour of the serial algorithm. Finally, with the **Threshold scaling** heuristic, _successively smaller values of modularity threshold_ are used as the algorithm progresses. This allows the algorithm to converge faster, and it has also been observed to yield a good modularity score.
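The minimum label tie-break described above can be sketched in a few lines of Python. This is only an illustration, not Grappolo's code; the function name `choose_community` and the `gains` dictionary are hypothetical.

```python
def choose_community(gains):
    """Pick the target community for a vertex.

    gains: dict mapping candidate community id -> modularity gain
           (hypothetical input, assumed to be computed elsewhere).
    The largest gain wins; ties go to the smallest community id,
    which is the minimum label rule.
    """
    # Sort key: highest gain first, then lowest id.
    return min(gains, key=lambda c: (-gains[c], c))
```

Because every vertex resolves ties the same way, two vertices processed in parallel are less likely to swap into each other's communities.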
From the results, it appears that _graph coloring_ and _threshold scaling_ heuristics do not always provide a speedup and this depends upon the nature of the graph. It would be interesting to compare the heuristics against baseline approaches. Future work can include _distributed memory implementations_, and _community detection on streaming graphs_.
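As a point of reference, the modularity score that these heuristics try to maximize can be computed as in the sketch below. It assumes an undirected weighted graph given as an edge list with each edge listed once; the function name and input layout are illustrative assumptions, not part of the paper.

```python
from collections import defaultdict

def modularity(edges, community):
    """Newman-Girvan modularity of an undirected weighted graph.

    edges: iterable of (u, v, weight), each undirected edge listed once
    community: dict mapping vertex -> community id
    """
    m = 0.0                        # total edge weight
    internal = defaultdict(float)  # edge weight inside each community
    degree = defaultdict(float)    # summed weighted degree per community
    for u, v, w in edges:
        m += w
        degree[community[u]] += w
        degree[community[v]] += w
        if community[u] == community[v]:
            internal[community[u]] += w
    if m == 0:
        return 0.0
    # Fraction of internal weight minus the fraction expected at random.
    return sum(internal[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)
```

For two triangles joined by a single edge, splitting the graph into the two triangles gives a modularity of 5/14, while placing every vertex in one community gives 0.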

201907 AutoML and Neural Architecture Search

Brief introduction of NAS
Review of EfficientNet (Google Brain), RandWire (FAIR) papers
NAS flow slide from KihoSuh's slideshare (https://www.slideshare.net/KihoSuh/neural-architecture-search-with-reinforcement-learning-76883153)
[References]
[1] EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks (https://arxiv.org/abs/1905.11946)
[2] Exploring Randomly Wired Neural Networks for Image Recognition (https://arxiv.org/abs/1904.01569)

A Dynamic Algorithm for Local Community Detection in Graphs : NOTES

**Community detection methods** can be *global* or *local*. **Global community detection methods** divide the entire graph into groups. Existing global algorithms include:
- Random walk methods
- Spectral partitioning
- Label propagation
- Greedy agglomerative and divisive algorithms
- Clique percolation
https://gist.github.com/wolfram77/b4316609265b5b9f88027bbc491f80b6
There is a growing body of work in *detecting overlapping communities*. **Seed set expansion** is a **local community detection method** where relevant *seed vertices* of interest are picked and *expanded to form communities* surrounding them. The quality of each community is measured using a *fitness function*.
**Modularity** is a *fitness function* which compares the number of intra-community edges to the expected number in a random-null model. **Conductance** is another popular fitness score that measures the community cut or inter-community edges. Many *overlapping community detection* methods **use a modified ratio** of intra-community edges to all edges with at least one endpoint in the community.
Andersen et al. use a **Spectral PageRank-Nibble method** which minimizes conductance and is formed by adding vertices in order of decreasing PageRank values. Andersen and Lang develop a **random walk approach** in which some vertices in the seed set may not be placed in the final community. Clauset gives a **greedy method** that *starts from a single vertex* and then iteratively adds neighboring vertices *maximizing the local modularity score*. Riedy et al. **expand multiple vertices** via maximizing modularity.
Several algorithms for **detecting global, overlapping communities** use a *greedy*, *agglomerative approach* and run *multiple separate seed set expansions*. Lancichinetti et al. run **greedy seed set expansions**, each with a *single seed vertex*. Overlapping communities are produced by sequentially running expansions from a node not yet in a community. Lee et al. use **maximal cliques as seed sets**. Havemann et al. **greedily expand cliques**.
The authors of this paper discuss a dynamic approach for **community detection using seed set expansion**. Simply marking the neighbours of changed vertices is a **naive approach**, and has *severe shortcomings*. This is because *communities can split apart*. The simple updating method *may fail even when it outputs a valid community* in the graph.
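To make static seed set expansion concrete, here is a minimal Python sketch that grows a community greedily using conductance as the fitness score. The choice of conductance, the adjacency-set representation, and the names `conductance` and `expand_seed` are assumptions made for illustration; the paper's own fitness function and its dynamic update rules are not reproduced here.

```python
def conductance(adj, community):
    """Cut weight divided by the smaller side's volume.

    adj: dict vertex -> set of neighbours (undirected, unweighted)
    community: iterable of vertices
    """
    community = set(community)
    cut = vol_in = vol_total = 0
    for u, nbrs in adj.items():
        vol_total += len(nbrs)
        if u in community:
            vol_in += len(nbrs)
            cut += sum(1 for v in nbrs if v not in community)
    denom = min(vol_in, vol_total - vol_in)
    # Degenerate sets (empty or whole graph) are treated as worst case here.
    return cut / denom if denom > 0 else 1.0

def expand_seed(adj, seeds):
    """Greedily add the neighbour that lowers conductance the most."""
    community = set(seeds)
    best = conductance(adj, community)
    while True:
        frontier = {v for u in community for v in adj[u]} - community
        candidate, candidate_score = None, best
        for v in sorted(frontier):
            score = conductance(adj, community | {v})
            if score < candidate_score:
                candidate, candidate_score = v, score
        if candidate is None:      # no neighbour improves the score
            return community, best
        community.add(candidate)
        best = candidate_score
```

A dynamic method has to revisit such an expansion when edges change, because a vertex added early may no longer improve the score, which is why simply marking neighbours of changed vertices is not enough.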

09 placement

This document discusses placement in VLSI physical design. It defines placement as arranging modules on a layout surface with fixed shapes and pin locations. The key goals of placement are to minimize area, reduce critical net lengths, and ensure routability. Different placement problems are discussed at the system, board and chip levels. Common placement algorithms described include simulated annealing, genetic algorithms, and force-directed placement.

Community Detection on the GPU : NOTES

This paper discusses a GPU implementation of the Louvain community detection algorithm. The Louvain algorithm obtains hierarchical communities as a dendrogram through modularity optimization. Given an undirected weighted graph, all vertices are first considered to be their own communities. In the first phase, each vertex greedily decides to move to the community of one of its neighbours that gives the greatest increase in modularity. If moving to no neighbour's community leads to an increase in modularity, the vertex chooses to stay with its own community. This is done sequentially for all the vertices. If the total change in modularity is more than a certain threshold, this phase is repeated. Once this local moving phase is complete, all vertices have formed their first hierarchy of communities. The next phase is called the aggregation phase, where all the vertices belonging to a community are collapsed into a single super-vertex, such that edges between communities are represented as edges between respective super-vertices (edge weights are combined), and edges within each community are represented as self-loops in respective super-vertices (again, edge weights are combined). Together, the local moving and the aggregation phases constitute a stage. This super-vertex graph is then used as input for the next stage. This process continues until the increase in modularity is below a certain threshold. As a result of each stage, we have a hierarchy of community memberships for each vertex as a dendrogram.
Approaches to perform the Louvain algorithm can be divided into coarse-grained and fine-grained. Coarse-grained approaches process a set of vertices in parallel, while fine-grained approaches process all vertices in parallel. A coarse-grained hybrid-GPU algorithm using multiple GPUs has been implemented by Cheong et al., which grabbed my attention. In addition, their algorithm does not use hashing for the local moving phase, but instead sorts each neighbour list based on the community id of each vertex.
https://gist.github.com/wolfram77/7e72c9b8c18c18ab908ae76262099329
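The two phases of a Louvain stage described above can be sketched sequentially in Python. This is a plain CPU illustration under stated assumptions, not the GPU implementation from the paper: the graph is a symmetric dict of dicts (`adj[u][v]` is the edge weight), and the names `local_moving` and `aggregate` are hypothetical.

```python
from collections import defaultdict

def local_moving(adj, threshold=1e-7):
    """Local-moving phase; returns a dict vertex -> community id."""
    degree = {u: sum(nbrs.values()) for u, nbrs in adj.items()}
    two_m = sum(degree.values())
    community = {u: u for u in adj}   # every vertex starts alone
    if two_m == 0:
        return community
    total = dict(degree)              # summed degree per community
    while True:
        total_gain = 0.0
        for u in adj:
            old = community[u]
            total[old] -= degree[u]   # take u out of its community
            links = defaultdict(float)  # weight from u to each community
            for v, w in adj[u].items():
                if v != u:
                    links[community[v]] += w
            base = links[old] - total[old] * degree[u] / two_m
            best, best_gain = old, base
            for c in sorted(links):
                g = links[c] - total[c] * degree[u] / two_m
                if g > best_gain:
                    best, best_gain = c, g
            total[best] += degree[u]
            community[u] = best
            total_gain += 2.0 * (best_gain - base) / two_m
        if total_gain <= threshold:   # repeat only while modularity improves
            return community

def aggregate(adj, community):
    """Collapse each community into a super-vertex, summing edge weights."""
    new_adj = defaultdict(lambda: defaultdict(float))
    for u, nbrs in adj.items():
        cu = community[u]
        new_adj[cu]                   # keep communities that have no edges
        for v, w in nbrs.items():
            new_adj[cu][community[v]] += w
    return {c: dict(nbrs) for c, nbrs in new_adj.items()}
```

In this sketch an internal edge is visited from both endpoints, so a super-vertex self-loop carries twice the internal edge weight; that keeps the summed degree of the aggregated graph equal to that of the original. Feeding the output of `aggregate` back into `local_moving` gives the next stage.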

Paper study: Learning to solve circuit sat

My own slides for a meeting presenting "Learning to Solve Circuit-SAT: an Unsupervised Differentiable Approach"

Multiplex Networks: structure and dynamics

This document discusses the formal representation and analysis of multiplex networks. It begins by introducing complex networks science and the concept of abstracting real-world systems as graphs to study structure and interactions. It then defines multiplex networks as networks with multiple types of interactions or relations between nodes that can be represented as multiple layer-graphs. The document provides formal definitions and representations of multiplex networks using concepts like participation graphs, layer-graphs, coupling graphs, and supra-adjacency matrices. It also discusses analyzing and coarse-graining multiplex networks through measures like structural metrics, walks, and quotient graphs.

240115_Thanh_LabSeminar[Don't walk, skip! online learning of multi-scale netw...

This document proposes a new graph embedding algorithm called Walklets that explicitly learns multi-scale network representations. Walklets uses a "skipping" mechanism during random walks to capture structural information at different scales. It learns representations by optimizing a loss function via stochastic gradient descent. Evaluation on social networks shows Walklets outperforms baselines by better modeling multi-scale effects and scales to large graphs through sampling approximations.

Random graph models

This document summarizes three common random graph models used to model complex real-world networks: Erdos-Renyi graphs, Watts-Strogatz small-world networks, and Barabasi-Albert scale-free networks. Erdos-Renyi graphs use a simple random edge placement process that can result in disconnected graphs or ones with small clustering. Watts-Strogatz networks address this by rewiring edges in a lattice, creating small-world properties but not realistic degree distributions. Barabasi-Albert networks use a preferential attachment mechanism where new nodes attach preferentially to higher degree nodes, producing power-law degree distributions seen in many real networks.
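The preferential attachment mechanism mentioned above can be illustrated with a short Python sketch. It is a simplified version under stated assumptions: growth starts from a small star, and the function name `barabasi_albert` and its parameters are chosen here for illustration only.

```python
import random

def barabasi_albert(n, m, seed=None):
    """Grow a graph to n vertices by preferential attachment.

    Each new vertex attaches to m distinct existing vertices, chosen
    with probability proportional to their current degree.
    Returns a list of undirected edges (new_vertex, target).
    """
    if not 1 <= m < n:
        raise ValueError("need 1 <= m < n")
    rng = random.Random(seed)
    edges = []
    pool = []  # each vertex appears once per incident edge
    # Assumed starting graph: a star on m + 1 vertices.
    for v in range(1, m + 1):
        edges.append((0, v))
        pool += [0, v]
    for new in range(m + 1, n):
        chosen = set()
        while len(chosen) < m:
            # Uniform choice from the pool is degree-proportional.
            chosen.add(rng.choice(pool))
        for t in sorted(chosen):
            edges.append((new, t))
            pool += [new, t]
    return edges
```

Sampling from a pool in which each vertex appears once per incident edge is a common way to get degree-proportional choices without computing probabilities explicitly.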

Convolutional Neural Networks - Veronica Vilaplana - UPC Barcelona 2018

Universitat Politècnica de Catalunya

https://telecombcn-dl.github.io/2018-dlai/
Deep learning technologies are at the core of the current revolution in artificial intelligence for multimedia data analysis. The convergence of large-scale annotated datasets and affordable GPU hardware has allowed the training of neural networks for data analysis tasks which were previously addressed with hand-crafted features. Architectures such as convolutional neural networks, recurrent neural networks or Q-nets for reinforcement learning have shaped a brand new scenario in signal processing. This course will cover the basic principles of deep learning from both an algorithmic and computational perspectives.

NS-CUK Seminar: H.B.Kim, Review on "Sequential Recommendation with Graph Neural Networks", SIGIR 2021

Fast Incremental Community Detection on Dynamic Graphs : NOTES

In this paper, the authors describe two approaches for dynamic community detection using the CNM algorithm. CNM is a hierarchical, agglomerative algorithm that greedily maximizes modularity. They define two approaches: BasicDyn and FastDyn. BasicDyn backtracks merges of communities until each marked (changed) vertex is its own singleton community. FastDyn undoes a merge only if the quality of the merge, as measured by the induced change in modularity, has significantly decreased compared to when the merge initially took place. FastDyn also allows more than two vertices to contract together if in the previous time step these vertices eventually ended up contracted in the same community. In the static case, merging several vertices together in one contraction phase could lead to deteriorating results. FastDyn is able to do this, however, because it uses information from the merges of the previous time step. Intuitively, merges that previously occurred are more likely to be acceptable later.
https://gist.github.com/wolfram77/1856b108334cc822cdddfdfa7334792a

generalized_nbody_acs_2015_challacombe

1. The document discusses barriers to scaling electronic structure methods to large systems, such as the inability of sparse matrix multiplication kernels to access strong parallel scaling and entrenched data structures that limit innovation.
2. It proposes a fast, generic, and data local N-body solver approach using new mathematics that is not constrained by row-column data structures and allows a single programming model.
3. Key aspects of this approach include exploiting locality in higher dimensional product volumes through techniques like occlusion-culling, resolving identity iteratively to compress matrices by orders of magnitude, and developing optimized sparse matrix multiplication kernels.

N41049093

International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.

Chap8 slides

This document summarizes several algorithms for parallel matrix operations, including matrix-vector multiplication, matrix-matrix multiplication, and solving systems of linear equations via Gaussian elimination. For matrix-vector multiplication, it describes row-wise and column-wise partitioning approaches. For matrix-matrix multiplication, it discusses algorithms based on row/column broadcasting, Cannon's algorithm, and a 3D domain decomposition approach. For Gaussian elimination, it analyzes pipelined and 2D mapping implementations. The key aspects of parallelization, communication costs, computation loads, scalability, and cost efficiency are analyzed for each algorithm.

Convolutional Neural Networks (DLAI D5L1 2017 UPC Deep Learning for Artificia...

Universitat Politècnica de Catalunya

https://telecombcn-dl.github.io/2017-dlai/
Deep learning technologies are at the core of the current revolution in artificial intelligence for multimedia data analysis. The convergence of large-scale annotated datasets and affordable GPU hardware has allowed the training of neural networks for data analysis tasks which were previously addressed with hand-crafted features. Architectures such as convolutional neural networks, recurrent neural networks or Q-nets for reinforcement learning have shaped a brand new scenario in signal processing. This course will cover the basic principles of deep learning from both an algorithmic and computational perspectives.

Quantum persistent k cores for community detection

PPT overview of paper accepted for 2019 Southeastern International Conference on Combinatorics, Graph Theory & Computing. Details a persistence approach to community detection and a new quantum persistence-based algorithm based on the coloring problem.

230727_HB_JointJournalClub.pptx

1) The document proposes a graph-based method using graph convolutional networks to address challenges in sequential recommendation, such as extracting implicit preferences from long behavior sequences and adapting to changing user preferences over time.
2) It constructs an interest graph from user behaviors and designs an attentive graph convolutional network and dynamic pooling technique to aggregate implicit signals into explicit preferences.
3) Experimental results on two large-scale datasets show the proposed method significantly outperforms state-of-the-art sequential recommendation methods.

- 1. Scalable Community Detection with the Louvain Algorithm Navid Sedighpour Dr. Alireza Bagheri Advanced Algorithm Presentation July 2016
- 3. What we will see... • Introduction • Background • Parallel Louvain Algorithm • Experimental Results • Conclusions
- 5. Introduction • Community detection is an important problem that spans many research areas: health care, social networks, systems biology, and power grids. • Community detection algorithms attempt to identify modules and their hierarchical organization. • There is no rigorous mathematical definition of community structure. • Modularity is the best-known and most widely used function for quantifying community structure in a graph.
- 6. Introduction • Techniques used for modularity maximization: • Greedy: applies different approaches to merge vertices into communities for higher modularity. • Simulated annealing: adopts a probabilistic procedure for global optimization of modularity. • Extremal optimization: a heuristic search procedure. • Spectral optimization: uses the eigenvalues and eigenvectors of a special matrix for modularity optimization. • All of these methods give poor results when applied to large graphs with many communities. • Achieving strong scalability together with high-quality community detection is still an open problem. • Unfortunately, most community detection algorithms are sequential.
- 7. The Louvain Algorithm • A greedy algorithm • Works on weighted graphs • Widely used in many application domains because of: • its rapid convergence • the high modularity it achieves • its hierarchical partitioning
- 8. First Step: First-Level Partitions 1. Put every vertex of the graph in its own community, one per vertex. 2. For each vertex i, the algorithm performs two calculations: (a) compute the modularity gain (∆Q) of moving vertex i into the community of each neighbor j; (b) pick the neighbor j that yields the largest ∆Q and, if that gain is positive, move i into j's community. 3. Repeat step 2 until no movement yields a gain.
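The first-level pass on this slide can be sketched as follows. This is a minimal sequential sketch under stated assumptions, not the presenters' implementation: the graph is undirected and weighted, stored as a symmetric adjacency dict without self-loops, and the gain uses the standard Louvain expression ΔQ = w(i→c)/m − Σtot(c)·k(i)/(2m²), where m is the total edge weight. The names `local_moving`, `adj`, and `sigma_tot` are illustrative.

```python
def local_moving(adj):
    """adj: {u: {v: weight}}, symmetric, no self-loops.
    Returns {vertex: community label} after the first-level pass."""
    m = sum(w for u in adj for w in adj[u].values()) / 2.0  # total edge weight
    degree = {u: sum(adj[u].values()) for u in adj}         # weighted degree k(u)
    community = {u: u for u in adj}   # step 1: one community per vertex
    sigma_tot = dict(degree)          # summed degree of each community
    moved = True
    while moved:                      # step 3: repeat until nothing moves
        moved = False
        for u in adj:
            old = community[u]
            sigma_tot[old] -= degree[u]          # take u out of its community
            weight_to = {}                       # weight from u to each neighboring community
            for v, w in adj[u].items():
                c = community[v]
                weight_to[c] = weight_to.get(c, 0.0) + w

            def gain(c):
                return weight_to.get(c, 0.0) / m - sigma_tot[c] * degree[u] / (2 * m * m)

            # step 2: keep the current community unless a neighbor's is strictly better
            best, best_gain = old, gain(old)
            for c in weight_to:
                g = gain(c)
                if g > best_gain:
                    best, best_gain = c, g
            sigma_tot[best] += degree[u]
            community[u] = best
            if best != old:
                moved = True
    return community
```

On two triangles joined by a single edge, this pass groups each triangle into its own community.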
- 9. Other Steps: Hierarchical Structure • The partitions of the first step become super-vertices, and the algorithm reconstructs the graph. • Two super-vertices are connected if there is at least one edge between vertices of the corresponding partitions. • The weight of the edge between two super-vertices is the sum of the weights of all edges between their corresponding partitions at the lower level. • These two steps repeat until the communities become stable.
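The reconstruction step can be sketched as a small aggregation over an edge list. The sketch assumes an undirected graph with each edge listed once and comparable community labels; the function name `aggregate` and the `(c1, c2)` key layout are illustrative choices, not taken from the slides. Edges inside one community end up as a self-loop on its super-vertex.

```python
from collections import defaultdict

def aggregate(edges, community):
    """edges: iterable of (u, v, weight), each undirected edge listed once.
    community: {vertex: community label}.
    Returns {(c1, c2): weight} with c1 <= c2; (c, c) entries are self-loops."""
    super_edges = defaultdict(float)
    for u, v, w in edges:
        cu, cv = community[u], community[v]
        key = (cu, cv) if cu <= cv else (cv, cu)  # normalize the undirected pair
        super_edges[key] += w                     # sum weights between the two partitions
    return dict(super_edges)
```

Feeding the result back into the first step gives the next level of the hierarchy.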
- 10. Contributions • Parallelization of the Louvain algorithm • A novel implementation strategy to store and process dynamic graphs • Extensive experimental evaluation
- 12. Problem Definition • A weighted directed graph G = (V, E): • V: set of vertices • E: set of edges • for u, v ∈ V, an edge e(u, v) ∈ E has weight w_{u,v} • The goal of community detection (in this paper) is to partition graph G into a set C of disjoint communities c_i: • ∪ c_i = V, ∀ c_i ∈ C • c_i ∩ c_j = ∅, ∀ c_i, c_j ∈ C with i ≠ j
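The two partition conditions can be checked directly. The helper below is a hypothetical illustration (the name `is_partition` is not from the slides); it assumes communities are given as Python sets of vertex ids.

```python
def is_partition(vertices, communities):
    """True if the communities are pairwise disjoint and their union is the vertex set."""
    seen = set()
    for c in communities:
        if seen & c:          # c_i ∩ c_j must be empty
            return False
        seen |= c
    return seen == set(vertices)  # ∪ c_i must equal V
```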
- 14. Modularity Gain • Modularity optimization is an NP-complete problem, so the Louvain algorithm greedily maximizes the modularity gain instead. The gain is expressed using: • W(u): the sum of the weights of the edges incident to vertex u • w_{u→c}: the sum of the weights of the edges from vertex u to vertices in community c
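To make the two quantities concrete, the sketch below computes W(u) and w(u→c) from an edge list and plugs them into the standard Louvain gain for undirected graphs without self-loops, ΔQ = w(u→c)/m − Σtot(c)·W(u)/(2m²). This is an assumption: the exact expression used on the slide may differ, and Σtot(c) (the summed degree of community c) and m (total edge weight) are not defined on this slide. The function names are illustrative. The gain is valid for a vertex u that currently sits alone in its own community.

```python
def modularity(edges, community):
    """Q = sum over communities of inside/(2m) - (total/(2m))^2, undirected, edges listed once."""
    m = sum(w for _, _, w in edges)
    inside, total = {}, {}
    for u, v, w in edges:
        cu, cv = community[u], community[v]
        total[cu] = total.get(cu, 0.0) + w
        total[cv] = total.get(cv, 0.0) + w
        if cu == cv:
            inside[cu] = inside.get(cu, 0.0) + 2 * w  # internal edge counted in both directions
    return sum(inside.get(c, 0.0) / (2 * m) - (total[c] / (2 * m)) ** 2 for c in total)

def modularity_gain(edges, community, u, c):
    """Gain of moving u (alone in its community, not in c) into community c."""
    m = sum(w for _, _, w in edges)
    W_u = sum(w for a, b, w in edges if u in (a, b))             # W(u)
    w_u_c = sum(w for a, b, w in edges
                if (a == u and community[b] == c)
                or (b == u and community[a] == c))               # w(u -> c)
    sigma_tot = sum(w for a, b, w in edges
                    for x in (a, b) if community[x] == c)        # summed degree of c
    return w_u_c / m - sigma_tot * W_u / (2 * m * m)
```

Under these assumptions the gain equals the difference in modularity after and before the move, which is a convenient way to test an implementation.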
- 17. Hash-Based Data Organization • We use two distinct hash tables: • In_Table stores the incoming edges of each vertex. It is not changed during the inner loop and keeps track of the original graph structure. • Out_Table stores the outgoing edges and is changed during each iteration.
- 19. Hash-Based Data Organization • Both hash tables are hashed on edges. • The input key x is a function of a tuple (t1, t2), computed with the bitwise left shift (<<) and bitwise OR (|) operators. • For In_Table, the tuple contains the source vertex u and the destination vertex i of the edge being hashed: (u, i), ∀ e(u, i) ∈ E (in the slide's example, i is vertex 0 or 2). • The bucket is a triple ((u, i), w_{u,i}), which includes the weight w_{u,i} of the corresponding edge. • For Out_Table, the tuple contains the source vertex i and the community c of the destination vertex v of the corresponding edge: (i, c), ∀ v ∈ c, e(i, v) ∈ E. • The bucket in Out_Table is a triple ((i, c), w_{i→c}), where w_{i→c} is the sum of the weights of all edges from vertex i to community c, because all of these edges are hashed to the same bucket.
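A possible reading of the key construction, sketched in Python. The 32-bit shift width is an assumption (the transcript does not give it), and plain dicts stand in for the hash tables; `make_key`, `split_key`, and `build_tables` are illustrative names.

```python
SHIFT = 32  # assumed width of t2; not stated in the transcript

def make_key(t1, t2):
    """Pack the tuple (t1, t2) into one integer key with << and |."""
    return (t1 << SHIFT) | t2

def split_key(x):
    """Recover (t1, t2) from a packed key."""
    return x >> SHIFT, x & ((1 << SHIFT) - 1)

def build_tables(edges, community):
    """edges: iterable of directed (u, i, weight); community: {vertex: label}.
    In_Table keeps one bucket per edge (u, i); Out_Table keeps one bucket per
    (source vertex, destination community) and accumulates the weights."""
    in_table, out_table = {}, {}
    for u, i, w in edges:
        in_table[make_key(u, i)] = w
        k = make_key(u, community[i])
        out_table[k] = out_table.get(k, 0.0) + w  # all edges u -> c share a bucket
    return in_table, out_table
```

Because every edge from a vertex into the same community maps to one key, the Out_Table bucket holds w(i→c) directly.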
- 20. Hash-Based Data Organization • In this paper, we use Fibonacci hashing. • The hash function is defined in terms of: • M, the size of the hash table • W, which is 2^64 − 1 • φ, the golden ratio
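A common 64-bit form of Fibonacci hashing is sketched below. It is not necessarily the exact function on the slide: it assumes arithmetic modulo 2^64 (rather than 2^64 − 1) and a table size M that is a power of two, so that the top bits of the product can serve as the bucket index. The multiplier is floor(2^64 / φ).

```python
FIB_MULT = 0x9E3779B97F4A7C15  # floor(2**64 / golden ratio)
MASK64 = (1 << 64) - 1

def fib_hash(x, table_bits):
    """Map integer key x to a bucket in a table of size M = 2**table_bits."""
    product = (x * FIB_MULT) & MASK64      # multiply modulo 2**64
    return product >> (64 - table_bits)    # keep the top table_bits bits
```

Consecutive keys are spread far apart in the table, which suits packed vertex-pair keys that differ only in their low bits.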
- 21. A Novel Convergence Heuristic • LFR is a benchmark designed to generate graphs for testing community detection algorithms; through tunable parameters, it mimics the properties of real-world, large-scale complex graphs. • The number of vertices that move decreases exponentially with the number of inner-loop iterations. • Dynamic threshold ε: the fraction of vertices that need to be updated during each iteration of the inner loop. • The threshold decreases exponentially with the number of iterations.
- 22. Regression analysis for the LFR benchmark • γ: exponent of the power-law degree distribution • β: exponent of the community size distribution • μ: the fraction of edges between vertices belonging to different communities • k: average degree of the graph • X-axis: inner-loop iteration • Y-axis: ε
- 23. Modularity gain threshold • We translate ε into a modularity gain threshold ΔQ̂ by sorting ΔQ_u (u ∈ V) over all vertices and selecting the top ε fraction of vertices to update their community membership. • Once ΔQ̂ is known, the vertices can update their community information in parallel.
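The translation from ε to a gain threshold can be sketched as a sort-and-cut. The decay constants `a` and `b` in `epsilon_schedule` are hypothetical placeholders (the fitted regression values are not in this transcript), and the function names are illustrative.

```python
import math

def epsilon_schedule(iteration, a=1.0, b=0.5):
    """Hypothetical exponential decay of the update fraction; a and b are placeholders."""
    return min(1.0, a * math.exp(-b * iteration))

def gain_threshold(gains, eps):
    """gains: non-empty {vertex: best modularity gain}.
    Returns the gain of the last vertex inside the top eps fraction."""
    ranked = sorted(gains.values(), reverse=True)
    k = max(1, math.ceil(eps * len(ranked)))
    return ranked[k - 1]

def vertices_to_update(gains, eps):
    """Vertices whose gain reaches the threshold and is positive."""
    threshold = gain_threshold(gains, eps)
    return {u for u, g in gains.items() if g >= threshold and g > 0}
```

Once the threshold is fixed, each vertex can test its own gain against it independently, which is what allows the update to run in parallel.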
- 25. Parallel Louvain Algorithm • The algorithm starts by initializing all the data structures. At the very beginning, In_Table contains the in-edge information of the vertices owned by each node/process, and Out_Table is empty. • STATE-PROPAGATION() runs first and globally propagates the community state information. Once all the messages have been delivered and hashed in place, Out_Table is initialized. • REFINE() corresponds to the inner loop of the sequential algorithm and orchestrates the migration of vertices to different communities, based on the modularity gain. • GRAPH-RECONSTRUCTION() reconstructs the graph and prepares for the next round. • The algorithm halts when there is no further modularity improvement.
- 26. Community State Propagation • All processes sequentially scan their In_Table and send messages to the destination processes that own the corresponding vertices (v in the algorithm). • The processes also receive messages from others and insert or update the hashed edges in their Out_Table.
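The message flow can be imitated in a single Python process. This is one reading of the slide, not the authors' code: it assumes vertices are assigned to processes by `vertex % num_procs`, that the owner of an edge's destination sends the destination's community to the owner of the source, and it uses lists as stand-ins for message queues. All names are illustrative.

```python
from collections import defaultdict

def propagate(in_tables, community, num_procs):
    """in_tables[p]: {(u, i): weight} for in-edges of vertices i owned by process p.
    Returns out_tables[p]: {(u, c): summed weight} for vertices u owned by process p."""
    def owner(v):
        return v % num_procs  # assumed ownership rule

    # Send phase: every process scans its In_Table and posts one message per edge.
    mailboxes = [[] for _ in range(num_procs)]
    for table in in_tables:
        for (u, i), w in table.items():
            mailboxes[owner(u)].append((u, community[i], w))

    # Receive phase: every process inserts or updates buckets in its Out_Table.
    out_tables = [defaultdict(float) for _ in range(num_procs)]
    for p, box in enumerate(mailboxes):
        for u, c, w in box:
            out_tables[p][(u, c)] += w
    return [dict(t) for t in out_tables]
```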
- 27. Refine
- 28. Refine • It starts by initializing ĉ_u and m_u. • Out_Table contains all the neighbors' communities and the accumulated weights to those communities from the vertices owned by that process. • Each process scans the hashed edges (u, c'), calculates the modularity gain of moving vertex u to community c', and updates the corresponding ĉ_u and m_u if needed. After all the Out_Tables have been examined, each vertex knows the best community to join. • The procedure then updates the community information and the corresponding Σtot based on ΔQ̂, and computes the new modularity. • The procedure terminates when no further vertex movement can increase the modularity.
- 30. Graph Reconstruction • The weight of an edge between two new vertices is the sum of the weights of the edges between vertices in the corresponding two communities. • Edges between vertices of the same community become self-loops in the new graph. • Lines 4–5 of the algorithm scan Out_Table and send the aggregated edge information to the owner process of the vertex (community). • Upon receiving the messages, lines 7–11 prepare In_Table as the input graph for the next round of execution.
- 32. Modularity • X-axis: outer-loop iteration • Y-axis: modularity • The basic parallel version converges very slowly and reaches a very low modularity score. • The parallel version with the proposed heuristic is on par with the original sequential algorithm and, surprisingly, in one case (UK-2005) obtains slightly higher modularity.
- 33. Speedup • The 49.8x speedup obtained with UK-2005 on 64 nodes is 5.6 times faster than the result reported in a previous paper, which examined a randomly sampled sub-network with 31% of the edges of the original UK-2005 graph.
- 35. Scalability • The parallel algorithm achieves good scalability. The processing rate is proportional to the number of nodes.
- 36. Conclusions • We have introduced a parallel version of the Louvain algorithm. • We have introduced a novel hash-based strategy to efficiently store and process dynamic graphs. • Our parallel algorithm preserves the same convergence, modularity, and community properties as the original sequential Louvain algorithm. • We have performed an extensive performance evaluation on a wide variety of real-world social graphs and synthetic graphs.
- 37. Thank You