This document proposes a new communication paradigm called "implicit targeting" to improve the performance of distributed graph processing systems when computing PageRank on very large graphs. It allows partitions to exchange messages containing only payload data by relying on predetermined target ordering. Experiments on a web graph of 38 billion vertices and 3.1 trillion edges achieved PageRank computation times of 34.4 seconds per iteration, suggesting over an order of magnitude improvement over state-of-the-art systems.
Implementing Map Reduce Based Edmonds-Karp Algorithm to Determine Maximum Flo...paperpublications3
Abstract: The maximum-flow problem is used to find spam sites on Google, discover communities on Facebook, and solve similar problems on graphs derived from the Internet. Such graphs are now so large that they have outgrown conventional memory-resident algorithms. In this paper, we show how to effectively parallelize the maximum-flow problem based on the Edmonds-Karp Algorithm (EKA) on a cluster using the MapReduce framework. Our algorithm exploits the property that such graphs are small-world networks with low diameter, and employs optimizations to improve the effectiveness of MapReduce and increase parallelism. We are able to compute maximum flow on a subset of a large network graph with a very large number of vertices and edges using a cluster of 4 or 5 machines in reasonable time. Keywords: Algorithm, MapReduce, Hadoop.
Title: Implementing Map Reduce Based Edmonds-Karp Algorithm to Determine Maximum Flow in Large Network Graph
Author: Dhananjaya Kumar K, Mr. Manjunatha A.S
International Journal of Recent Research in Mathematics Computer Science and Information Technology
ISSN 2350-1022
Paper Publications
With the enormous growth of distributed networks such as P2P, social networks, overlay networks, and cloud computing, the world has become more closely and quickly connected. These distributed networks are represented as graphs, and the fundamental component of a distributed network is the relationship defined by linkages among units or nodes. A major concern for computer experts is how to store such enormous amounts of data, especially in the form of graphs. An efficient data structure for storing this type of data should provide a format for fast retrieval whenever it is required in such networks. Although an adjacency matrix is an effective way to represent a graph with a small or large number of nodes and edges, it cannot handle the analysis of the huge volumes of data produced by sites like Facebook or Twitter. In this paper, we study the existing applications of a special kind of data structure, the skip graph, along with its various versions, which can be used for storing such data with optimal storage, space utilization, retrieval, and concurrency.
OpenFlow is one of the most commonly used protocols for communication between the
controller and the forwarding element in a software defined network (SDN). A model based on
M/M/1 queues is proposed in [1] to capture the communication between the forwarding element
and the controller. Although the model provides useful insight, it is accurate only when
the probability of a new flow arriving is small.
Secondly, it is not straightforward to extend the model in [1] to more than one forwarding
element in the data plane. In this work we propose a model which addresses both of these
challenges. The model is based on the Jackson network assumption, with corrections tailored to the
OpenFlow-based SDN network. Performance analysis using the proposed model indicates that
the model is accurate even when the probability of a new flow is quite large. Further,
we show by a toy example that the model can be extended to more than one node in the data plane.
Presentation of Understanding and Surpassing Dropbox Globecom 2015Fajar Purnama
This is not my paper, just an assignment of the computer algorithm class I am taking to present a paper.
Title: Understanding and Surpassing Dropbox: Efficient
Incremental Synchronization in Cloud Storage Services
Authors: Shenglong Li, Quanlu Zhang, Zhi Yang, Yafei Dai
Source: http://dx.doi.org/10.1109/GLOCOM.2015.7417235
Presenter: Fajar Purnama
Video https://bit.tube/play?hash=QmSKeTyFcuKuRrTGMqLVHy43RXHrQgQkPoXhnH4MAMkf6K&channel=156033
Parallel Batch-Dynamic Graphs: Algorithms and Lower BoundsSubhajit Sahu
In this paper we study the problem of dynamically
maintaining graph properties under batches of edge
insertions and deletions in the massively parallel model
of computation. In this setting, the graph is stored
on a number of machines, each having space strongly
sublinear with respect to the number of vertices, that
is, n^ε for some constant 0 < ε < 1. Our goal is to
handle batches of updates and queries where the data
for each batch fits onto one machine in constant rounds
of parallel computation, as well as to reduce the total
communication between the machines. This objective
corresponds to the gradual buildup of databases over
time, while the goal of obtaining constant rounds of
communication for problems in the static setting has
been elusive for problems as simple as undirected graph
connectivity.
We give an algorithm for dynamic graph connectivity
in this setting with constant communication rounds and
communication cost almost linear in terms of the batch
size. Our techniques combine a new graph contraction
technique, an independent random sample extractor from
correlated samples, as well as distributed data structures
supporting parallel updates and queries in batches.
We also illustrate the power of dynamic algorithms in
the MPC model by showing that the batched version
of the adaptive connectivity problem is P-complete in
the centralized setting, but sub-linear sized batches can
be handled in a constant number of rounds. Due to
the wide applicability of our approaches, we believe
it represents a practically-motivated workaround to the
current difficulties in designing more efficient massively
parallel static graph algorithms.
Wireless data broadcast is an efficient way of disseminating data to users in mobile computing environments. From the server's point of view, how to place data items on channels is a crucial issue, with the objective of minimizing the average access time and tuning time. Similarly, how to schedule the data retrieval process for a given request at the client side, such that all the requested items can be downloaded in a short time, is also an important problem. In this paper, we investigate multi-item data retrieval scheduling in push-based multichannel broadcast environments. The most important issues in mobile computing are energy efficiency and query response efficiency; however, in data broadcast the objectives of reducing access latency and energy cost can be contradictory. Consequently, we define two new problems: the Minimum Cost Data Retrieval (MCDR) problem and the Large Number Data Retrieval (LNDR) problem. We also develop a heuristic algorithm to download a large number of items efficiently. When there is no replicated item in a broadcast cycle, we show that an optimal retrieval schedule can be obtained in polynomial time.
Map-Side Merge Joins for Scalable SPARQL BGP ProcessingAlexander Schätzle
In recent times, it has been widely recognized that, due to their inherent scalability, frameworks based on MapReduce are indispensable for so-called "Big Data" applications. However, for Semantic Web applications using SPARQL, there is still a demand for sophisticated MapReduce join techniques for processing basic graph patterns, which are at the core of SPARQL. Renowned for their stable and efficient performance, sort-merge joins have become widely used in DBMSs. In this paper, we demonstrate the adaptation of merge joins for SPARQL BGP processing with MapReduce. Our technique supports both n-way joins and sequences of join operations by applying merge joins within the map phase of MapReduce while the reduce phase is only used to fulfill the preconditions of a subsequent join iteration.
Our experiments with the LUBM benchmark show an average performance benefit between 15% and 48% compared to other MapReduce based approaches while at the same time scaling linearly with the RDF dataset size.
IMPROVING SCHEDULING OF DATA TRANSMISSION IN TDMA SYSTEMScsandit
In an era where communication plays a central role in modern societies, designing efficient
algorithms for data transmission is of the utmost importance. TDMA is a technology used in many
communication systems such as satellites and cell phones. In order to transmit data in such systems we
need to cluster the data into packets. To achieve faster transmission we are allowed to preempt the
transmission of any packet and resume it at a later time. Such preemptions, however, come with a delay
needed to set up the next transmission. In this paper we propose an algorithm, which we call MGA, that
yields improved transmission scheduling. We have proven an approximation ratio for MGA
and run experiments to establish that it works even better in practice. To conclude that MGA is
a very helpful tool for constructing an improved schedule for packet routing using preemption with a setup
cost, we compare its results to two other efficient algorithms designed by researchers in the past.
A SECURE DIGITAL SIGNATURE SCHEME WITH FAULT TOLERANCE BASED ON THE IMPROVED ...csandit
Fault tolerance and data security are two important issues in modern communication systems.
In this paper, we propose a secure and efficient digital signature scheme with fault tolerance
based on the improved RSA system. The proposed scheme for the RSA cryptosystem uses
three prime numbers and overcomes several attacks possible on RSA. By using the Chinese
Remainder Theorem (CRT), the proposed scheme achieves a speed improvement on the RSA decryption
side while also providing high security.
On modeling controller switch interaction in openflow based sdnsIJCNCJournal
With an increase in the number of software defined network (SDN) deployments, and with OpenFlow consolidating as the protocol of choice for controller-switch interactions, the need to analytically model the system for performance analysis is increasing. An attempt has previously been made in [1] to model the system considering both a controller and a switch as an M/M/1 queue. The method, although useful, lacks accuracy for higher probabilities of new flows entering the network. The approach also lacks
details on how it can be extended to more than one node in the data plane. These two shortcomings are addressed in this paper, where the controller and switch are modeled
collectively as a Jackson network, with essential tuning to suit OpenFlow-based SDN. The consequent analysis shows the resilience of the model even for higher numbers of new flow entries. An example is also used
to illustrate the case of multiple nodes in the data plane.
Using Triple Pattern Fragments To Enable Streaming of Top-k Shortest Paths vi...Laurens De Vocht
Searching for relationships between Linked Data resources is typically interpreted as a pathfinding problem: looking for chains of intermediary nodes (hops) forming the connection or bridge between these resources in a single dataset or across multiple datasets.
In many cases centralizing all the needed linked data in a certain (specialized) repository or index in order to run the algorithm is not possible, or at least not desired. To address this, we propose an approach to top-k shortest pathfinding which optimally translates a pathfinding query into sequences of triple pattern fragment requests.
Triple Pattern Fragments were recently introduced as a solution to address the availability of data on the Web and the scalability of linked data client applications, preventing data processing bottlenecks on the server.
The results are streamed to the client, thus allowing clients to do asynchronous processing of the top-k shortest paths.
We explain how this approach behaves using a training dataset, a subset of DBpedia with 10 million triples, and show the trade-offs to a SPARQL approach where all the data is gathered in a single triple store on a single machine.
Furthermore we investigate the scalability when increasing the size of the subset up to 110 million triples.
Software Package (NodeJS): npmjs.com/package/everything_is_connected_engine
M.E Computer Science Wireless Communication ProjectsVijay Karan
List of IEEE 2006 Wireless Communication projects. It contains IEEE projects in the Wireless Communication domain for M.E. Computer Science students.
SCALING DISTRIBUTED DATABASE JOINS BY DECOUPLING COMPUTATION AND COMMUNICATIONijdms
To process a large volume of data, modern data management systems use a collection of machines
connected through a network. This paper proposes frameworks and algorithms for processing
distributed joins—a compute- and communication-intensive workload in modern data-intensive systems.
By exploiting multiple processing cores within the individual machines, we implement a system to
process database joins that parallelizes computation within each node, pipelines the computation with
communication, and parallelizes the communication by allowing multiple simultaneous data transfers
(send/receive). Our experimental results show that using only four threads per node the framework
achieves a 3.5x gain in intra-node performance compared with a single-threaded counterpart.
Moreover, with the join processing workload, the cluster-wide performance (and speedup) is observed to
be dictated by the intra-node computational loads; this property brings a near-linear speedup with
increasing nodes in the system, a feature much desired in modern large-scale data processing systems.
Investigating the Performance of NoC Using Hierarchical Routing ApproachIJERA Editor
The Network-on-Chip (NoC) model has emerged as a revolutionary methodology for incorporating a large number of intellectual property (IP) blocks in a die. According to the International Technology Roadmap for Semiconductors (ITRS), device sizes must be scaled down. To this end, long interconnections should be avoided, which calls for new interconnect patterns. Three-dimensional ICs are capable of achieving superior performance, better resistance against noise, and lower interconnect power consumption compared to traditional planar ICs. In this paper, network data is routed using a hierarchical methodology. We analyze the total number of logic gates and registers, the power consumption, and the delay when different numbers of bits are transmitted, using Quartus II software.
Design and Evaluation of Scheduling Architecture for IEEE 802.16 in Mobile Ad...ijtsrd
A worldwide demand for high-speed broadband wireless systems across commercial and residential regions is emerging rapidly due to the increasing reliance on the web for information, business, entertainment, and new upcoming bandwidth-intensive or real-time applications. The IEEE 802.16 standard, which has emerged as a Broadband Wireless Access (BWA) solution, promises to meet all such requirements and is becoming the most popular way for wireless communication. The advantages of IEEE 802.16 include variable and high data rates, last-mile wireless access, point-to-multipoint communication, a large frequency range, and QoS for various types of applications. We propose a Weighted Fair Queue (WFQ) based MAC scheduling architecture for IEEE 802.16 in both the uplink and downlink directions. Our scheduling architecture accommodates parameters like traffic priority, minimum reserved bandwidth, and queue information for various applications. Compared with traditional fixed bandwidth allocation schemes, the proposed architecture incorporates a traffic scheduling mechanism based on the weights of flows and provides more fairness to the system. Meenakshi | Rashmi Raj "Design and Evaluation of Scheduling Architecture for IEEE 802.16 in Mobile Ad-Hoc Network" Published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-3 | Issue-4, June 2019, URL: https://www.ijtsrd.com/papers/ijtsrd22845.pdf
Paper URL: https://www.ijtsrd.com/engineering/electronics-and-communication-engineering/22845/design-and-evaluation-of-scheduling-architecture-for-ieee-80216-in-mobile-ad-hoc-network/meenakshi
Massive parallelism with gpus for centrality ranking in complex networksijcsit
Many problems in Computer Science can be modelled using graphs. Evaluating node centrality in complex
networks, which can be considered equivalent to undirected graphs, provides a useful metric of the
relative importance of each node inside the evaluated network. Knowing which nodes are the most central
has various applications, such as improving information spreading in diffusion networks. In this
case, the most central nodes can be considered to have higher influence rates over other nodes in the network.
The main purpose of this work is to develop a GPU-based, massively parallel application to
evaluate node centrality in complex networks using the Nvidia CUDA programming model. The main
contribution of this work is the set of strategies for developing an algorithm to evaluate node
centrality in complex networks using the Nvidia CUDA parallel programming model. We show that these
strategies improve the algorithm's speed-up by two orders of magnitude on one NVIDIA Tesla K20 GPU
cluster node, when compared to the hybrid OpenMP/MPI version running in the same cluster
with 4 nodes, each with 2 Intel(R) Xeon(R) CPU E5-2660, for radius zero.
STIC-D: algorithmic techniques for efficient parallel pagerank computation on...Subhajit Sahu
Authors:
Paritosh Garg
Kishore Kothapalli
Publication:
ICDCN '16: Proceedings of the 17th International Conference on Distributed Computing and Networking. January 2016.
Article No.: 15 Pages 1–10
https://doi.org/10.1145/2833312.2833322
Centrality-Based Network Coder Placement For Peer-To-Peer Content DistributionIJCNCJournal
Network coding has been shown to achieve optimal multicast throughput, yet at an expensive computation
cost: every node in the network has to code. Interested in minimizing resource consumption of network
coding while maintaining its performance, in this paper, we propose a practical network coder placement
algorithm which achieves comparable content distribution time as network coding, and at the same time,
substantially reduces the number of network coders compared to a full network coding solution in which all
peers have to encode, i.e. become encoders. Our algorithm is derived from two key elements. First, it is
based on the insight that coding at upstream peers eliminates information duplication to downstream peers,
which results in efficient content distribution. Second, our placement strategy exploits centrality
characteristics of the network topology to quickly determine key positions to place encoders. Performance
evaluation using various topology and algorithm parameters confirms the effectiveness of our proposed
method.
International Refereed Journal of Engineering and Science (IRJES) is a peer reviewed online journal for professionals and researchers in the field of computer science. The main aim is to resolve emerging and outstanding problems revealed by recent social and technological change. IRJES provides a platform for researchers to present and evaluate their work from both theoretical and technical aspects and to share their views.
International Refereed Journal of Engineering and Science (IRJES)irjes
International Refereed Journal of Engineering and Science (IRJES) is a leading international journal for the publication of new ideas, state-of-the-art research results, and fundamental advances in all aspects of Engineering and Science. IRJES is an open access, peer reviewed international journal whose primary objective is to provide the academic community and industry a platform for the submission of original research and applications.
About TrueTime, Spanner, Clock synchronization, CAP theorem, Two-phase lockin...Subhajit Sahu
TrueTime is a service that enables the use of globally synchronized clocks, with bounded error. It returns a time interval that is guaranteed to contain the clock’s actual time for some time during the call’s execution. If two intervals do not overlap, then we know calls were definitely ordered in real time. In general, synchronized clocks can be used to avoid communication in a distributed system.
The underlying source of time is a combination of GPS receivers and atomic clocks. As there are “time masters” in every datacenter (redundantly), it is likely that both sides of a partition would continue to enjoy accurate time. Individual nodes however need network connectivity to the masters, and without it their clocks will drift. Thus, during a partition their intervals slowly grow wider over time, based on bounds on the rate of local clock drift. Operations depending on TrueTime, such as Paxos leader election or transaction commits, thus have to wait a little longer, but the operation still completes (assuming the 2PC and quorum communication are working).
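A minimal sketch of this interval reasoning, with made-up names (TTInterval, definitelyBefore, safeToRelease); Spanner's real TrueTime API is richer than this:

```cpp
#include <cstdint>

// Hypothetical TrueTime-style interval: the clock's actual time is guaranteed
// to lie within [earliest, latest]. Names here are illustrative only.
struct TTInterval {
  int64_t earliest;
  int64_t latest;
};

// If two intervals do not overlap, the corresponding calls were definitely
// ordered in real time (a completed before b started).
bool definitelyBefore(const TTInterval& a, const TTInterval& b) {
  return a.latest < b.earliest;
}

// Commit-wait sketch: a write assigned timestamp s is only made visible once
// the current interval's earliest bound has passed s, so no later reader can
// observe an event ordered before its timestamp.
bool safeToRelease(const TTInterval& now, int64_t s) {
  return now.earliest > s;
}
```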
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...Subhajit Sahu
Abstract — Levelwise PageRank is an alternative method of PageRank computation which decomposes the input graph into a directed acyclic block-graph of strongly connected components, and processes them in topological order, one level at a time. This enables calculation of ranks in a distributed fashion without per-iteration communication, unlike the standard method where all vertices are processed in each iteration. It however comes with a precondition of the absence of dead ends in the input graph. Here, the native non-distributed performance of Levelwise PageRank was compared against Monolithic PageRank on a CPU as well as a GPU. To ensure a fair comparison, Monolithic PageRank was also performed on a graph where vertices were split by components. Results indicate that Levelwise PageRank is about as fast as Monolithic PageRank on the CPU, but quite a bit slower on the GPU. The slowdown on the GPU is likely caused by a large number of small workload submissions, and is expected to be a non-issue when the computation is performed on massive graphs.
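As a rough illustration of the levelwise idea (not the report's code), the sketch below assumes the SCC levels have already been computed in topological order and the graph has no dead ends; all interface names are made up:

```cpp
#include <algorithm>
#include <cmath>
#include <vector>
using std::vector;

// Levelwise PageRank sketch: ranks of earlier levels are final before a later
// level is processed, so each level can converge independently.
void levelwisePageRank(vector<double>& rank,
                       const vector<vector<int>>& inNbrs,   // in-neighbours per vertex
                       const vector<int>& outDeg,           // out-degree per vertex
                       const vector<vector<int>>& levels,   // vertex ids grouped by level
                       double damping = 0.85, double tol = 1e-10) {
  int N = (int)inNbrs.size();
  for (const auto& level : levels) {       // levels in topological order
    double change;
    do {                                   // iterate only this level to convergence
      change = 0;
      for (int v : level) {
        double sum = 0;
        for (int u : inNbrs[v]) sum += rank[u] / outDeg[u];
        double r = (1 - damping) / N + damping * sum;
        change = std::max(change, std::fabs(r - rank[v]));
        rank[v] = r;
      }
    } while (change > tol);
  }
}
```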
Adjusting Bitset for graph : SHORT REPORT / NOTESSubhajit Sahu
Compressed Sparse Row (CSR) is an adjacency-list based graph representation that is commonly used for efficient graph computations. Unfortunately, using CSR for dynamic graphs is impractical since addition/deletion of a single edge can require on average (N+M)/2 memory accesses, in order to update source-offsets and destination-indices. A common approach is therefore to store edge-lists/destination-indices as an array of arrays, where each edge-list is an array belonging to a vertex. While this is good enough for small graphs, it quickly becomes a bottleneck for large graphs. What causes this bottleneck depends on whether the edge-lists are sorted or unsorted. If they are sorted, checking for an edge requires about log(E) memory accesses, but adding an edge on average requires E/2 accesses, where E is the number of edges of a given vertex. Note that both addition and deletion of edges in a dynamic graph require checking for an existing edge, before adding or deleting it. If edge lists are unsorted, checking for an edge requires around E/2 memory accesses, but adding an edge requires only 1 memory access.
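The sorted edge-list trade-off can be made concrete with a small sketch (illustrative only, not the report's code):

```cpp
#include <algorithm>
#include <vector>
using std::vector;

// Each vertex owns a sorted vector of destination ids.
using EdgeList = vector<int>;

// Checking for an edge costs about log(E) accesses via binary search.
bool hasEdge(const EdgeList& nbrs, int v) {
  return std::binary_search(nbrs.begin(), nbrs.end(), v);
}

// Adding an edge keeps the list sorted, but shifts about E/2 elements on average.
void addEdge(EdgeList& nbrs, int v) {
  auto it = std::lower_bound(nbrs.begin(), nbrs.end(), v);
  if (it == nbrs.end() || *it != v) nbrs.insert(it, v);  // skip existing edge
}
```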
Techniques to optimize the pagerank algorithm usually fall in two categories. One is to try reducing the work per iteration, and the other is to try reducing the number of iterations. These goals are often at odds with one another. Skipping computation on vertices which have already converged has the potential to save iteration time. Skipping in-identical vertices, with the same in-links, helps reduce duplicate computations and thus could help reduce iteration time. Road networks often have chains which can be short-circuited before pagerank computation to improve performance. Final ranks of chain nodes can be easily calculated. This could reduce both the iteration time, and the number of iterations. If a graph has no dangling nodes, pagerank of each strongly connected component can be computed in topological order. This could help reduce the iteration time, no. of iterations, and also enable multi-iteration concurrency in pagerank computation. The combination of all of the above methods is the STICD algorithm. [sticd] For dynamic graphs, unchanged components whose ranks are unaffected can be skipped altogether.
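As an illustration of one of these optimizations, the sketch below skips vertices whose ranks have already converged; it is not the STICD code, it assumes a graph without dangling nodes, and the interface names are made up:

```cpp
#include <cmath>
#include <vector>
using std::vector;

// PageRank sketch that stops recomputing a vertex once its rank change falls
// below the tolerance.
void pageRankSkipConverged(vector<double>& rank,
                           const vector<vector<int>>& inNbrs,
                           const vector<int>& outDeg,
                           double damping = 0.85, double tol = 1e-10,
                           int maxIter = 100) {
  int N = (int)inNbrs.size();
  vector<char> converged(N, 0);
  for (int it = 0; it < maxIter; ++it) {
    bool active = false;
    for (int v = 0; v < N; ++v) {
      if (converged[v]) continue;          // skip already-converged vertices
      double sum = 0;
      for (int u : inNbrs[v]) sum += rank[u] / outDeg[u];
      double r = (1 - damping) / N + damping * sum;
      if (std::fabs(r - rank[v]) < tol) converged[v] = 1;
      else active = true;
      rank[v] = r;
    }
    if (!active) break;                    // every vertex has converged
  }
}
```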
Adjusting primitives for graph : SHORT REPORT / NOTESSubhajit Sahu
Graph algorithms, like PageRank, typically operate on Compressed Sparse Row (CSR), an adjacency-list based graph representation. The experiments here adjust the primitives such algorithms rely on (a small sketch of the multiply primitive follows the list):
Multiply with different modes (map)
1. Performance of sequential execution based vs OpenMP based vector multiply.
2. Comparing various launch configs for CUDA based vector multiply.
Sum with different storage types (reduce)
1. Performance of vector element sum using float vs bfloat16 as the storage type.
Sum with different modes (reduce)
1. Performance of sequential execution based vs OpenMP based vector element sum.
2. Performance of memcpy vs in-place based CUDA based vector element sum.
3. Comparing various launch configs for CUDA based vector element sum (memcpy).
4. Comparing various launch configs for CUDA based vector element sum (in-place).
Sum with in-place strategies of CUDA mode (reduce)
1. Comparing various launch configs for CUDA based vector element sum (in-place).
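A small sketch of the multiply (map) primitive in its sequential and OpenMP modes (hypothetical names; compile with -fopenmp for the parallel version):

```cpp
#include <vector>
using std::vector;

// Element-wise vector multiply, sequential mode.
void multiplySeq(vector<float>& a, const vector<float>& x, const vector<float>& y) {
  for (size_t i = 0; i < a.size(); ++i) a[i] = x[i] * y[i];
}

// Element-wise vector multiply, OpenMP mode (one parallel-for over the elements).
void multiplyOmp(vector<float>& a, const vector<float>& x, const vector<float>& y) {
  #pragma omp parallel for schedule(static)
  for (long i = 0; i < (long)a.size(); ++i) a[i] = x[i] * y[i];
}
```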
Experiments with Primitive operations : SHORT REPORT / NOTESSubhajit Sahu
This includes:
- Multiply with different modes (map)
1. Performance of sequential execution based vs OpenMP based vector multiply.
2. Comparing various launch configs for CUDA based vector multiply.
- Sum with different storage types (reduce)
1. Performance of vector element sum using float vs bfloat16 as the storage type.
- Sum with different modes (reduce)
1. Performance of sequential execution based vs OpenMP based vector element sum.
2. Performance of memcpy vs in-place based CUDA based vector element sum.
3. Comparing various launch configs for CUDA based vector element sum (memcpy).
4. Comparing various launch configs for CUDA based vector element sum (in-place).
- Sum with in-place strategies of CUDA mode (reduce)
1. Comparing various launch configs for CUDA based vector element sum (in-place).
Adjusting OpenMP PageRank : SHORT REPORT / NOTESSubhajit Sahu
For massive graphs that fit in RAM, but not in GPU memory, it is possible to take
advantage of a shared memory system with multiple CPUs, each with multiple cores, to
accelerate pagerank computation. If the NUMA architecture of the system is properly taken
into account with good vertex partitioning, the speedup can be significant. To take steps in
this direction, experiments are conducted to implement pagerank in OpenMP using two
different approaches, uniform and hybrid. The uniform approach runs all primitives required
for pagerank in OpenMP mode (with multiple threads). On the other hand, the hybrid
approach runs certain primitives in sequential mode (i.e., sumAt, multiply).
word2vec, node2vec, graph2vec, X2vec: Towards a Theory of Vector Embeddings o...Subhajit Sahu
Below are the important points I note from the 2020 paper by Martin Grohe:
- 1-WL distinguishes almost all graphs, in a probabilistic sense
- Classical WL is two dimensional Weisfeiler-Leman
- DeepWL is an unlimited version of WL graph that runs in polynomial time.
- Knowledge graphs are essentially graphs with vertex/edge attributes
ABSTRACT:
Vector representations of graphs and relational structures, whether handcrafted feature vectors or learned representations, enable us to apply standard data analysis and machine learning techniques to the structures. A wide range of methods for generating such embeddings have been studied in the machine learning and knowledge representation literature. However, vector embeddings have received relatively little attention from a theoretical point of view.
Starting with a survey of embedding techniques that have been used in practice, in this paper we propose two theoretical approaches that we see as central for understanding the foundations of vector embeddings. We draw connections between the various approaches and suggest directions for future research.
DyGraph: A Dynamic Graph Generator and Benchmark Suite : NOTESSubhajit Sahu
https://gist.github.com/wolfram77/54c4a14d9ea547183c6c7b3518bf9cd1
There exist a number of dynamic graph generators. The Barabási-Albert model iteratively attaches new vertices to pre-existing vertices in the graph using preferential attachment (edges to high-degree vertices are more likely: the rich get richer, as in the Pareto principle). However, the graph size increases monotonically, and the density of the graph keeps increasing (sparsity decreasing).
Gorke's model uses a defined clustering to uniformly add vertices and edges. Purohit's model uses motifs (e.g. triangles) to mimic properties of existing dynamic graphs, such as growth rate, structure, and degree distribution. Kronecker graph generators are used to increase the size of a given graph, with a power-law distribution.
To generate dynamic graphs, we must choose a metric to compare two graphs. Common metrics include diameter, clustering coefficient (modularity?), triangle counting (triangle density?), and degree distribution.
In this paper, the authors propose Dygraph, a dynamic graph generator that uses degree distribution as the only metric. The authors observe that many real-world graphs differ from the power-law distribution at the tail end. To address this issue, they propose binning, where the vertices beyond a certain degree (minDeg = min(deg) s.t. |V(deg)| < H, where H~10 is the number of vertices with a given degree below which are binned) are grouped into bins of degree-width binWidth, max-degree localMax, and number of degrees in bin with at least one vertex binSize (to keep track of sparsity). This helps the authors to generate graphs with a more realistic degree distribution.
The process of generating a dynamic graph is as follows. First the difference between the desired and the current degree distribution is calculated. The authors then create an edge-addition set where each vertex is present as many times as the number of additional incident edges it must receive. Edges are then created by connecting two vertices picked randomly from this set, and removing both from the set once connected. Currently, the authors reject self-loops and duplicate edges. Removal of edges is done in a similar fashion.
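A rough sketch of this edge-addition step, with made-up names (DyGraph's actual implementation differs, e.g. in how rejections are handled):

```cpp
#include <random>
#include <set>
#include <utility>
#include <vector>
using std::vector;

// need[v] = number of additional incident edges vertex v must receive.
// Returns the set of new edges formed by randomly pairing entries of the
// edge-addition set, rejecting self-loops and duplicate edges.
vector<std::pair<int,int>> addEdgesFromNeed(const vector<int>& need, unsigned seed = 1) {
  vector<int> pool;                               // the "edge-addition set"
  for (int v = 0; v < (int)need.size(); ++v)
    pool.insert(pool.end(), need[v], v);          // v appears once per missing edge
  std::mt19937 rng(seed);
  std::set<std::pair<int,int>> edges;
  int rejects = 0;
  while (pool.size() >= 2 && rejects < 1000) {    // bounded retries keep the sketch finite
    std::uniform_int_distribution<size_t> pick(0, pool.size() - 1);
    size_t i = pick(rng), j = pick(rng);
    if (i == j) { ++rejects; continue; }
    int u = pool[i], v = pool[j];
    if (u > v) std::swap(u, v);
    if (u == v || !edges.insert({u, v}).second) { ++rejects; continue; }
    if (i < j) std::swap(i, j);                   // erase the larger index first
    pool.erase(pool.begin() + i);
    pool.erase(pool.begin() + j);
    rejects = 0;
  }
  return {edges.begin(), edges.end()};
}
```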
Authors observe that adding edges with power-law properties dominates the execution time, and consider parallelizing DyGraph as part of future work.
My notes on shared memory parallelism.
Shared memory is memory that may be simultaneously accessed by multiple programs with an intent to provide communication among them or avoid redundant copies. Shared memory is an efficient means of passing data between programs. Using memory for communication inside a single program, e.g. among its multiple threads, is also referred to as shared memory [REF].
A Dynamic Algorithm for Local Community Detection in Graphs : NOTESSubhajit Sahu
**Community detection methods** can be *global* or *local*. **Global community detection methods** divide the entire graph into groups. Existing global algorithms include:
- Random walk methods
- Spectral partitioning
- Label propagation
- Greedy agglomerative and divisive algorithms
- Clique percolation
https://gist.github.com/wolfram77/b4316609265b5b9f88027bbc491f80b6
There is a growing body of work in *detecting overlapping communities*. **Seed set expansion** is a **local community detection method** where relevant *seed vertices* of interest are picked and *expanded to form communities* surrounding them. The quality of each community is measured using a *fitness function*.
**Modularity** is a *fitness function* which compares the number of intra-community edges to the expected number in a random null model. **Conductance** is another popular fitness score that measures the community cut, or inter-community edges. Many *overlapping community detection* methods **use a modified ratio** of intra-community edges to all edges with at least one endpoint in the community.
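For concreteness, here is a small sketch of two such fitness scores in simplified local forms (made-up names, not the paper's code):

```cpp
#include <unordered_set>
#include <vector>
using std::vector;

// adj is an adjacency list; C is the candidate community (set of vertex ids).

// Conductance-style score: fraction of edge endpoints in C that leave C
// (a simplified local form of cut(C)/vol(C); lower is better).
double conductanceLocal(const vector<vector<int>>& adj, const std::unordered_set<int>& C) {
  double cut = 0, vol = 0;
  for (int u : C)
    for (int v : adj[u]) { vol += 1; if (!C.count(v)) cut += 1; }
  return vol > 0 ? cut / vol : 1.0;
}

// Ratio of intra-community edge endpoints to all endpoints touching C
// (the modified-ratio style of score mentioned above; higher is better).
double intraRatio(const vector<vector<int>>& adj, const std::unordered_set<int>& C) {
  double in = 0, all = 0;
  for (int u : C)
    for (int v : adj[u]) { all += 1; if (C.count(v)) in += 1; }
  return all > 0 ? in / all : 0.0;
}
```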
Andersen et al. use a **Spectral PageRank-Nibble method** which minimizes conductance and is formed by adding vertices in order of decreasing PageRank values. Andersen and Lang develop a **random walk approach** in which some vertices in the seed set may not be placed in the final community. Clauset gives a **greedy method** that *starts from a single vertex* and then iteratively adds neighboring vertices *maximizing the local modularity score*. Riedy et al. **expand multiple vertices** via maximizing modularity.
Several algorithms for **detecting global, overlapping communities** use a *greedy*, *agglomerative approach* and run *multiple separate seed set expansions*. Lancichinetti et al. run **greedy seed set expansions**, each with a *single seed vertex*. Overlapping communities are produced by sequentially running expansions from nodes not yet in a community. Lee et al. use **maximal cliques as seed sets**. Havemann et al. **greedily expand cliques**.
The authors of this paper discuss a dynamic approach for **community detection using seed set expansion**. Simply marking the neighbours of changed vertices is a **naive approach**, and has *severe shortcomings*. This is because *communities can split apart*. The simple updating method *may fail even when it outputs a valid community* in the graph.
Scalable Static and Dynamic Community Detection Using Grappolo : NOTESSubhajit Sahu
A **community** (in a network) is a subset of nodes which are _strongly connected among themselves_, but _weakly connected to others_. Neither the number of output communities nor their size distribution is known a priori. Community detection methods can be divisive or agglomerative. **Divisive methods** use _betweenness centrality_ to **identify and remove bridges** between communities. **Agglomerative methods** greedily **merge two communities** that provide the maximum gain in _modularity_. Newman and Girvan introduced the **modularity metric**. The problem of community detection is then reduced to the problem of modularity maximization, which is **NP-complete**. The **Louvain method** is a variant of the _agglomerative strategy_, in that it is a _multi-level heuristic_.
https://gist.github.com/wolfram77/917a1a4a429e89a0f2a1911cea56314d
In this paper, the authors discuss **four heuristics** for Community detection using the _Louvain algorithm_ implemented upon recently developed **Grappolo**, which is a parallel variant of the Louvain algorithm. They are:
- Vertex following and Minimum label
- Data caching
- Graph coloring
- Threshold scaling
With the **Vertex following** heuristic, the _input is preprocessed_ and all single-degree vertices are merged with their corresponding neighbours. This helps reduce the number of vertices considered in each iteration, and also helps initial seeds of communities to form. With the **Minimum label** heuristic, when a vertex is deciding which community to move to and multiple communities provide the same modularity gain, the community with the smallest id is chosen. This helps _minimize or prevent community swaps_. With the **Data caching** heuristic, community information is stored in a vector instead of a map, and is reused in each iteration, at some additional cost. With the **Vertex ordering via Graph coloring** heuristic, _distance-k coloring_ of the graph is performed in order to group vertices by color. Each set of vertices (by color) is then processed _concurrently_, with synchronization performed afterwards. This enables us to mimic the behaviour of the serial algorithm. Finally, with the **Threshold scaling** heuristic, _successively smaller values of the modularity threshold_ are used as the algorithm progresses. This allows the algorithm to converge faster, and it has been observed to attain a good modularity score as well.
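A tiny sketch of the Minimum label tie-break (hypothetical names, not Grappolo's code): among candidate communities with the best positive gain, the smallest id wins.

```cpp
#include <unordered_map>

// gainByCommunity maps each neighbouring community id to the modularity gain
// of moving the vertex there; the vertex stays put unless some gain is positive.
int chooseCommunity(const std::unordered_map<int,double>& gainByCommunity,
                    int currentCommunity) {
  double bestGain = 0;
  int best = currentCommunity;
  for (const auto& [comm, gain] : gainByCommunity) {
    bool better = gain > bestGain;
    bool tieSmaller = gain == bestGain && gain > 0 && comm < best;
    if (better || tieSmaller) { bestGain = gain; best = comm; }
  }
  return best;
}
```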
From the results, it appears that _graph coloring_ and _threshold scaling_ heuristics do not always provide a speedup and this depends upon the nature of the graph. It would be interesting to compare the heuristics against baseline approaches. Future work can include _distributed memory implementations_, and _community detection on streaming graphs_.
Application Areas of Community Detection: A Review : NOTESSubhajit Sahu
This is a short review of community detection methods (on graphs) and their applications. A **community** is a subset of a network whose members are *highly connected* to each other, but *loosely connected* to others outside their community. Different community detection methods *can return differing communities*, since these algorithms are **heuristic-based**. **Dynamic community detection** involves tracking the *evolution of community structure* over time.
https://gist.github.com/wolfram77/09e64d6ba3ef080db5558feb2d32fdc0
Communities can be of the following **types**:
- Disjoint
- Overlapping
- Hierarchical
- Local.
The following **static** community detection **methods** exist:
- Spectral-based
- Statistical inference
- Optimization
- Dynamics-based
The following **dynamic** community detection **methods** exist:
- Independent community detection and matching
- Dependent community detection (evolutionary)
- Simultaneous community detection on all snapshots
- Dynamic community detection on temporal networks
**Applications** of community detection include:
- Criminal identification
- Fraud detection
- Criminal activities detection
- Bot detection
- Dynamics of epidemic spreading (dynamic)
- Cancer/tumor detection
- Tissue/organ detection
- Evolution of influence (dynamic)
- Astroturfing
- Customer segmentation
- Recommendation systems
- Social network analysis (both)
- Network summarization
- Privacy, group segmentation
- Link prediction (both)
- Community evolution prediction (dynamic, hot field)
## References
- [Application Areas of Community Detection: A Review : PAPER](https://ieeexplore.ieee.org/document/8625349)
This paper discusses a GPU implementation of the Louvain community detection algorithm. The Louvain algorithm obtains hierarchical communities as a dendrogram through modularity optimization. Given an undirected weighted graph, all vertices are first considered to be their own communities. In the first phase, each vertex greedily decides to move to the community of one of its neighbours which gives the greatest increase in modularity. If moving to no neighbour's community leads to an increase in modularity, the vertex chooses to stay in its own community. This is done sequentially for all the vertices. If the total change in modularity is more than a certain threshold, this phase is repeated. Once this local moving phase is complete, all vertices have formed their first hierarchy of communities. The next phase is called the aggregation phase, where all the vertices belonging to a community are collapsed into a single super-vertex, such that edges between communities are represented as edges between the respective super-vertices (edge weights are combined), and edges within each community are represented as self-loops on the respective super-vertices (again, edge weights are combined). Together, the local moving and the aggregation phases constitute a stage. This super-vertex graph is then used as input for the next stage. This process continues until the increase in modularity falls below a certain threshold. As a result, from each stage we obtain a hierarchy of community memberships for each vertex as a dendrogram.
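A structural sketch (in Python, for illustration; not the paper's GPU code) of one Louvain stage as described above, where `modularity_gain` and `aggregate_graph` are hypothetical helpers:

```python
def louvain_stage(graph, modularity_gain, aggregate_graph):
    """One Louvain stage: local moving followed by aggregation (sketch).

    graph           : adjacency as {vertex: {neighbour: edge_weight}}
    modularity_gain : hypothetical helper(graph, community, v, target_community) -> float
    aggregate_graph : hypothetical helper collapsing communities into super-vertices
    """
    community = {v: v for v in graph}       # every vertex starts in its own community
    improved = True
    while improved:                         # local moving phase
        improved = False
        for v in graph:                     # sequential pass over all vertices
            best, best_gain = community[v], 0.0
            for u in graph[v]:              # consider each neighbour's community
                gain = modularity_gain(graph, community, v, community[u])
                if gain > best_gain:
                    best, best_gain = community[u], gain
            if best != community[v]:
                community[v] = best
                improved = True
    # aggregation phase: build the super-vertex graph used as input for the next stage
    return community, aggregate_graph(graph, community)
```

The sketch repeats the local-moving pass while any vertex still moves; the paper's stopping rule is a threshold on the total modularity change, which the hypothetical helpers would have to track.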
Approaches to perform the Louvain algorithm can be divided into coarse-grained and fine-grained. Coarse-grained approaches process a set of vertices in parallel, while fine-grained approaches process all vertices in parallel. A coarse-grained hybrid algorithm using multiple GPUs has been implemented by Cheong et al., which grabbed my attention. In addition, their algorithm does not use hashing for the local moving phase, but instead sorts each neighbour list based on the community id of each vertex.
https://gist.github.com/wolfram77/7e72c9b8c18c18ab908ae76262099329
Survey for extra-child-process package : NOTESSubhajit Sahu
Useful additions to inbuilt child_process module.
📦 Node.js, 📜 Files, 📰 Docs.
Please see attached PDF for literature survey.
https://gist.github.com/wolfram77/d936da570d7bf73f95d1513d4368573e
Dynamic Batch Parallel Algorithms for Updating PageRank : POSTERSubhajit Sahu
For the PhD forum an abstract submission is required by 10th May, and poster by 15th May. The event is on 30th May.
https://gist.github.com/wolfram77/692d263f463fd49be6eb5aa65dd4d0f9
Abstract for IPDPS 2022 PhD Forum on Dynamic Batch Parallel Algorithms for Up...Subhajit Sahu
For the PhD forum an abstract submission is required by 10th May, and poster by 15th May. The event is on 30th May.
https://gist.github.com/wolfram77/1c1f730d20b51e0d2c6d477fd3713024
Fast Incremental Community Detection on Dynamic Graphs : NOTESSubhajit Sahu
In this paper, the authors describe two approaches for dynamic community detection using the CNM algorithm. CNM is a hierarchical, agglomerative algorithm that greedily maximizes modularity. They define two approaches: BasicDyn and FastDyn. BasicDyn backtracks merges of communities until each marked (changed) vertex is its own singleton community. FastDyn undoes a merge only if the quality of merge, as measured by the induced change in modularity, has significantly decreased compared to when the merge initially took place. FastDyn also allows more than two vertices to contract together if in the previous time step these vertices eventually ended up contracted in the same community. In the static case, merging several vertices together in one contraction phase could lead to deteriorating results. FastDyn is able to do this, however, because it uses information from the merges of the previous time step. Intuitively, merges that previously occurred are more likely to be acceptable later.
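The FastDyn undo test described above might be sketched as follows (illustrative; the drop threshold is a hypothetical parameter, as the paper only says the gain has "significantly decreased"):

```python
def should_undo_merge(gain_at_merge_time, gain_now, drop_threshold=0.5):
    """FastDyn-style test (sketch): undo a previous community merge only if its
    modularity gain, re-evaluated on the updated graph, has dropped significantly
    compared to the gain observed when the merge was originally made.
    drop_threshold is hypothetical; the paper does not give a concrete value."""
    return gain_now < drop_threshold * gain_at_merge_time
```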
https://gist.github.com/wolfram77/1856b108334cc822cdddfdfa7334792a
Enterprise Resource Planning Systems include various modules that reduce any business's workload. Additionally, they organize workflows, which helps enhance productivity. Here is a detailed explanation of the ERP modules. Going through the points will help you understand how the software is changing work dynamics.
To know more details here: https://blogs.nyggs.com/nyggs/enterprise-resource-planning-erp-system-modules/
SOCRadar Research Team: Latest Activities of IntelBrokerSOCRadar
The European Union Agency for Law Enforcement Cooperation (Europol) has suffered an alleged data breach after a notorious threat actor claimed to have exfiltrated data from its systems. Infamous data leaker IntelBroker posted on the even more infamous BreachForums hacking forum, saying that Europol suffered a data breach this month.
The alleged breach affected Europol agencies CCSE, EC3, Europol Platform for Experts, Law Enforcement Forum, and SIRIUS. Infiltration of these entities can disrupt ongoing investigations and compromise sensitive intelligence shared among international law enforcement agencies.
However, this is neither the first nor the last activity of IntelBroker. We have compiled what has happened over the last few days. To track such hacker activities on dark web sources like hacker forums, private Telegram channels, and other hidden platforms where cyber threats often originate, you can check SOCRadar’s Dark Web News.
Stay Informed on Threat Actors’ Activity on the Dark Web with SOCRadar!
Providing Globus Services to Users of JASMIN for Environmental Data AnalysisGlobus
JASMIN is the UK’s high-performance data analysis platform for environmental science, operated by STFC on behalf of the UK Natural Environment Research Council (NERC). In addition to its role in hosting the CEDA Archive (NERC’s long-term repository for climate, atmospheric science & Earth observation data in the UK), JASMIN provides a collaborative platform to a community of around 2,000 scientists in the UK and beyond, providing nearly 400 environmental science projects with working space, compute resources and tools to facilitate their work. High-performance data transfer into and out of JASMIN has always been a key feature, with many scientists bringing model outputs from supercomputers elsewhere in the UK, to analyse against observational or other model data in the CEDA Archive. A growing number of JASMIN users are now realising the benefits of using the Globus service to provide reliable and efficient data movement and other tasks in this and other contexts. Further use cases involve long-distance (intercontinental) transfers to and from JASMIN, and collecting results from a mobile atmospheric radar system, pushing data to JASMIN via a lightweight Globus deployment. We provide details of how Globus fits into our current infrastructure, our experience of the recent migration to GCSv5.4, and of our interest in developing use of the wider ecosystem of Globus services for the benefit of our user community.
May Marketo Masterclass, London MUG May 22 2024.pdfAdele Miller
Can't make Adobe Summit in Vegas? No sweat because the EMEA Marketo Engage Champions are coming to London to share their Summit sessions, insights and more!
This is a MUG with a twist you don't want to miss.
Understanding Globus Data Transfers with NetSageGlobus
NetSage is an open privacy-aware network measurement, analysis, and visualization service designed to help end-users visualize and reason about large data transfers. NetSage traditionally has used a combination of passive measurements, including SNMP and flow data, as well as active measurements, mainly perfSONAR, to provide longitudinal network performance data visualization. It has been deployed by dozens of networks world wide, and is supported domestically by the Engagement and Performance Operations Center (EPOC), NSF #2328479. We have recently expanded the NetSage data sources to include logs for Globus data transfers, following the same privacy-preserving approach as for Flow data. Using the logs for the Texas Advanced Computing Center (TACC) as an example, this talk will walk through several different example use cases that NetSage can answer, including: Who is using Globus to share data with my institution, and what kind of performance are they able to achieve? How many transfers has Globus supported for us? Which sites are we sharing the most data with, and how is that changing over time? How is my site using Globus to move data internally, and what kind of performance do we see for those transfers? What percentage of data transfers at my institution used Globus, and how did the overall data transfer performance compare to the Globus users?
Exploring Innovations in Data Repository Solutions - Insights from the U.S. G...Globus
The U.S. Geological Survey (USGS) has made substantial investments in meeting evolving scientific, technical, and policy driven demands on storing, managing, and delivering data. As these demands continue to grow in complexity and scale, the USGS must continue to explore innovative solutions to improve its management, curation, sharing, delivering, and preservation approaches for large-scale research data. Supporting these needs, the USGS has partnered with the University of Chicago-Globus to research and develop advanced repository components and workflows leveraging its current investment in Globus. The primary outcome of this partnership includes the development of a prototype enterprise repository, driven by USGS Data Release requirements, through exploration and implementation of the entire suite of the Globus platform offerings, including Globus Flow, Globus Auth, Globus Transfer, and Globus Search. This presentation will provide insights into this research partnership, introduce the unique requirements and challenges being addressed and provide relevant project progress.
Accelerate Enterprise Software Engineering with PlatformlessWSO2
Key takeaways:
Challenges of building platforms and the benefits of platformless.
Key principles of platformless, including API-first, cloud-native middleware, platform engineering, and developer experience.
How Choreo enables the platformless experience.
How key concepts like application architecture, domain-driven design, zero trust, and cell-based architecture are inherently a part of Choreo.
Demo of an end-to-end app built and deployed on Choreo.
Quarkus Hidden and Forbidden ExtensionsMax Andersen
Quarkus has a vast extension ecosystem and is known for its subsonic and subatomic feature set. Some of these features are not as well known, and some extensions are less talked about, but that does not make them less interesting - quite the opposite.
Come join this talk to see some tips and tricks for using Quarkus and some of the lesser known features, extensions and development techniques.
Experience our free, in-depth three-part Tendenci Platform Corporate Membership Management workshop series! In Session 1 on May 14th, 2024, we began with an Introduction and Setup, mastering the configuration of your Corporate Membership Module settings to establish membership types, applications, and more. Then, on May 16th, 2024, in Session 2, we focused on binding individual members to a Corporate Membership and Corporate Reps, teaching you how to add individual members and assign Corporate Representatives to manage dues, renewals, and associated members. Finally, on May 28th, 2024, in Session 3, we covered questions and concerns, addressing any queries or issues you may have.
For more Tendenci AMS events, check out www.tendenci.com/events
Code reviews are vital for ensuring good code quality. They serve as one of our last lines of defense against bugs and subpar code reaching production.
Yet, they often turn into annoying tasks riddled with frustration, hostility, unclear feedback and lack of standards. How can we improve this crucial process?
In this session we will cover:
- The Art of Effective Code Reviews
- Streamlining the Review Process
- Elevating Reviews with Automated Tools
By the end of this presentation, you'll have the knowledge on how to organize and improve your code review process.
Prosigns: Transforming Business with Tailored Technology SolutionsProsigns
Unlocking Business Potential: Tailored Technology Solutions by Prosigns
Discover how Prosigns, a leading technology solutions provider, partners with businesses to drive innovation and success. Our presentation showcases our comprehensive range of services, including custom software development, web and mobile app development, AI & ML solutions, blockchain integration, DevOps services, and Microsoft Dynamics 365 support.
Custom Software Development: Prosigns specializes in creating bespoke software solutions that cater to your unique business needs. Our team of experts works closely with you to understand your requirements and deliver tailor-made software that enhances efficiency and drives growth.
Web and Mobile App Development: From responsive websites to intuitive mobile applications, Prosigns develops cutting-edge solutions that engage users and deliver seamless experiences across devices.
AI & ML Solutions: Harnessing the power of Artificial Intelligence and Machine Learning, Prosigns provides smart solutions that automate processes, provide valuable insights, and drive informed decision-making.
Blockchain Integration: Prosigns offers comprehensive blockchain solutions, including development, integration, and consulting services, enabling businesses to leverage blockchain technology for enhanced security, transparency, and efficiency.
DevOps Services: Prosigns' DevOps services streamline development and operations processes, ensuring faster and more reliable software delivery through automation and continuous integration.
Microsoft Dynamics 365 Support: Prosigns provides comprehensive support and maintenance services for Microsoft Dynamics 365, ensuring your system is always up-to-date, secure, and running smoothly.
Learn how our collaborative approach and dedication to excellence help businesses achieve their goals and stay ahead in today's digital landscape. From concept to deployment, Prosigns is your trusted partner for transforming ideas into reality and unlocking the full potential of your business.
Join us on a journey of innovation and growth. Let's partner for success with Prosigns.
Enhancing Project Management Efficiency_ Leveraging AI Tools like ChatGPT.pdfJay Das
With the advent of artificial intelligence or AI tools, project management processes are undergoing a transformative shift. By using tools like ChatGPT, and Bard organizations can empower their leaders and managers to plan, execute, and monitor projects more effectively.
Navigating the Metaverse: A Journey into Virtual Evolution"Donna Lenk
Join us for an exploration of the Metaverse's evolution, where innovation meets imagination. Discover new dimensions of virtual events, engage with thought-provoking discussions, and witness the transformative power of digital realms."
Software Engineering, Software Consulting, Tech Lead.
Spring Boot, Spring Cloud, Spring Core, Spring JDBC, Spring Security,
Spring Transaction, Spring MVC,
Log4j, REST/SOAP WEB-SERVICES.
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...Juraj Vysvader
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I didn't get rich from it but it did have 63K downloads (powered possible tens of thousands of websites).
How Recreation Management Software Can Streamline Your Operations.pptxwottaspaceseo
Recreation management software streamlines operations by automating key tasks such as scheduling, registration, and payment processing, reducing manual workload and errors. It provides centralized management of facilities, classes, and events, ensuring efficient resource allocation and facility usage. The software offers user-friendly online portals for easy access to bookings and program information, enhancing customer experience. Real-time reporting and data analytics deliver insights into attendance and preferences, aiding in strategic decision-making. Additionally, effective communication tools keep participants and staff informed with timely updates. Overall, recreation management software enhances efficiency, improves service delivery, and boosts customer satisfaction.
We describe the deployment and use of Globus Compute for remote computation. This content is aimed at researchers who wish to compute on remote resources using a unified programming interface, as well as system administrators who will deploy and operate Globus Compute services on their research computing infrastructure.
OpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoamtakuyayamamoto1800
In this slide, we show the simulation example and the way to compile this solver.
In this solver, the Helmholtz equation can be solved by helmholtzFoam. Also, the Helmholtz equation with uniformly dispersed bubbles can be simulated by helmholtzBubbleFoam.
Figure 1: Implicit Targeting. Edge target identifiers are labeled as $K_i$. Edge payloads are labeled as $V_i$. Partitions P1 and P2 are sending messages to partitions P3 and P4. (i) Each partition iterates over the edges assigned to it and buffers messages per target partition; (ii) buffered messages are sent to their respective target partitions; (iii) packets arrive at target partitions; (iv) incoming messages containing payloads are paired with pre-computed target vertex identifiers (labeled as $L_j$) and delivered to the application. The $L_j$ are local vertex identifiers, only known to their partitions, corresponding to the global $K_j$ identifiers.
Figure 2: Source-side Aggregation. Partitions P1, P2 and P4 contain 5, 3 and 6 edges respectively towards vertex v in partition P3. (a) Aggregation is not performed; (b) source-side aggregation is performed with frequency threshold equal to 4. Messages from partitions P1 and P4 are aggregated, while messages from partition P2 are sent directly to v as they are below the frequency threshold.
a small subset of target vertices in each partition for which we perform source-side payload aggregation. Specifically, we deduce a set of target ids per partition whose elements are the target of at least $f_{TH}$ edges in the partition, where $f_{TH}$ is a system-generated local frequency threshold (see Figure 2). This set includes all large in-degree vertices w.h.p., is small compared to the number of unique target ids, and is obtained via a lightweight streaming algorithm that requires memory only proportional to the size of the set.
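For illustration, the per-edge routing decision implied by Figure 2 can be sketched in Python as follows; this is not the paper's implementation, and the names (`emit_messages`, `send`, `rank_contribution`) are assumptions:

```python
def emit_messages(edges, rank_contribution, frequent_targets, send):
    """Route per-edge payloads, aggregating only for frequent targets (sketch).

    edges            : iterable of (source_vertex, target_id) pairs in this partition
    rank_contribution: function(source_vertex) -> payload value for one edge
    frequent_targets : set of target ids whose local frequency is at least f_TH
    send             : function(target_id, value) performing the actual transfer
    """
    aggregated = {}                                  # one accumulator per frequent target
    for u, target in edges:
        value = rank_contribution(u)
        if target in frequent_targets:
            aggregated[target] = aggregated.get(target, 0.0) + value
        else:
            send(target, value)                      # infrequent targets are sent directly
    for target, value in aggregated.items():
        send(target, value)                          # one combined message per frequent target
```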
We implemented our new communication paradigm, aptly named implicit targeting (as target ids are implicit in messages), on Hronos, a Yahoo! graph processing framework that has previously been used for ML [31] [18] and other [30] distributed workloads. PageRank [26], the classic web graph algorithm that simulates the surfing behavior of a random user [22], has been extensively used for evaluating the performance of graph processing frameworks. We evaluated PageRank on graphs with up to 96.3 billion vertices. Specifically, for a graph of ∼38 billion vertices and ∼3.1 trillion edges, we obtained execution times of 34.4 seconds per iteration. We also compared our system against Giraph on smaller public graphs and demonstrated a 30X mean speedup.
The rest of the paper is organized as follows. In Section 2, we
briefly describe PageRank and the online algorithm used to detect
frequent elements. In Section 3 we describe our implicit targeting
communication pattern. In Section 4 we analyze the performance
of our source-side message aggregation algorithm. In Section 5 we
report experimental results.
2 PRELIMINARIES
2.1 PageRank
Let C be the set of pages in a web corpus, d(v) be the number of outlinks of page v and W be a column stochastic matrix that represents the connections between pages, such that W[u,v] = 1/d(v) if there exists a link from page v to page u and 0 otherwise. PageRank imposes a total order on C, assuming ties are broken arbitrarily. It denotes the probability that a user (called the random surfer) would end up at a particular page if they started from an arbitrary page and either visited one of the outlinks from that page, or with a small probability α jumped to a pre-specified set of pages (named the teleportation set T). For pages without any outlinks the user would jump directly to a page in T.

For each graph vertex v we maintain the current iteration PageRank (rank(v)) and the next iteration PageRank (next_rank(v)) values. rank(v) is initialized to either 1.0 or to a value computed at a previous sync point. In the main computation phase, before processing any edges, all local next_rank values are copied into rank, just before they are initialized to 0. Each process iterates over all edges present in its graph partition. For each edge (u,v) it sends a ∆PR message to target vertex v, where ∆PR = (1 − α) · rank(u)/d(u). It also adds α · rank(u) to a local teleportation variable teleportation. If u has no outlinks, it adds the complete rank(u) to teleportation.
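For concreteness, the send side just described can be sketched in a few lines of Python; this is an illustrative sketch, not Hronos code, and the helper `send_delta` and the dictionary layout are assumptions:

```python
ALPHA = 0.15  # teleportation probability alpha (illustrative value, not from the paper)

def send_phase(edges, rank, out_degree, send_delta):
    """Send side of one PageRank superstep, following the scheme described above.

    edges      : iterable of (u, v) edges stored in this partition
    rank       : dict of current-iteration ranks for local vertices
    out_degree : dict of out-degrees d(u) for local vertices
    send_delta : function(v, delta_pr) delivering a message to target vertex v
    Returns this partition's local teleportation contribution.
    """
    for u, v in edges:
        send_delta(v, (1.0 - ALPHA) * rank[u] / out_degree[u])  # delta_PR message per edge
    teleportation = 0.0
    for u in rank:
        if out_degree[u] == 0:
            teleportation += rank[u]          # dangling page: its whole rank teleports
        else:
            teleportation += ALPHA * rank[u]  # every other page contributes alpha * rank(u)
    return teleportation
```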
Concurrently, upon reception of a message the process adds the corresponding ∆PR to the next_rank(v) of the graph vertex v associated with the message. If v is not one of the vertices that the process is maintaining, then it redirects the incoming ∆PR to teleportation. This can happen if edge (u,v) points to an uncrawled page. At the end of the iteration, the master process collects all local teleportation values from its peers, aggregates them, and distributes the sum to its peers and itself.

2.2 Frequent Items

In the following, we present an algorithm that detects all target ids of frequency f > n/(H + 1), of which there can be at most H, where n is the number of all target ids in the partition. Let us first describe the algorithm via an example for the base case of H = 1 [8]. Given an array A of n elements, we would like to detect the majority element if one exists. That is, we would like to detect an element that appears more than n/2 times in the array. The algorithm works in a streaming mode: it does not know n in advance and only examines each element once. It maintains a counter and a candidate variable. Initially, counter = 1 and candidate = A[1]. It examines each of the array elements A[2], . . . , A[n]. If the current element is the same as the one in candidate, then counter is incremented. Otherwise, counter is decremented. If counter becomes negative, then candidate is set to the current element and counter is reset to 1. In the end, if a majority element exists, it will be present in candidate. Intuitively, this holds because the number of all non-majority elements in A is smaller than the number of times the majority element appears in A, and therefore they are not enough to displace it from candidate after all elements have been processed. The algorithm is generalized in a straightforward manner for arbitrary H [13]. It maintains a map (items) between each candidate and its corresponding counter. It has O(n) amortized time complexity and O(H) space complexity. It is depicted in Algorithm 1.

Algorithm 1 FrequentItems<H>
function Process(value)
    if value ∈ items.keys() then
        items[value] ← items[value] + 1
    else if items.size() < H then
        items[value] ← 1
    else
        for item ∈ items.keys() do
            items[item] ← items[item] − 1
            if items[item] = 0 then
                delete items[item]
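A direct Python transcription of Algorithm 1, as a sketch for illustration rather than the framework's actual code:

```python
class FrequentItems:
    """Streaming detection of items with frequency > n/(H+1), mirroring Algorithm 1."""

    def __init__(self, H):
        self.H = H
        self.items = {}                       # candidate -> counter

    def process(self, value):
        if value in self.items:
            self.items[value] += 1
        elif len(self.items) < self.H:
            self.items[value] = 1
        else:
            for item in list(self.items):     # decrement every counter
                self.items[item] -= 1
                if self.items[item] == 0:
                    del self.items[item]

# Example: id 42 appears in more than half of the stream, so it must survive.
fi = FrequentItems(H=1)
for x in [42, 7, 42, 9, 42]:
    fi.process(x)
assert 42 in fi.items
```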
3 IMPLICIT TARGETING
Implicit targeting communication assumes that the target partition understands how to deliver messages to a target vertex without actually receiving the graph vertex identifier along with the message it corresponds to. The underlying idea is the following: during each superstep, a target partition j receives sets of messages from its peers without any specific order of arrival. However, sets of messages arriving from a particular source partition i will be delivered in-order to j. We guarantee this by adopting unidirectional message queues for point-to-point communication between i and j. So long as the source partition i iterates over its edges in the same order, the subset of the edges corresponding to target partition j will also be processed in the same order and therefore the corresponding messages will be sent out, and be received by partition j, in the same order.
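As an illustrative sketch (not Hronos code) of the delivery step, assume each target partition keeps one FIFO queue of payloads per source partition, plus the order vectors built during the target-learning phase (Section 3.2):

```python
from collections import deque

def deliver(incoming, orders, apply_message):
    """Pair payload-only messages with pre-computed local target ids (sketch).

    incoming      : dict mapping source partition id -> deque of payloads, in the
                    order they were received (per-source unidirectional queues)
    orders        : dict mapping source partition id -> list of local vertex ids,
                    built once during the target-learning phase
    apply_message : function(local_vertex_id, payload)
    """
    cursor = {src: 0 for src in orders}          # position in each order vector
    for src, payloads in incoming.items():
        while payloads:
            payload = payloads.popleft()
            local_id = orders[src][cursor[src]]  # the k-th payload from src targets
            cursor[src] += 1                     # the k-th vertex in src's order vector
            apply_message(local_id, payload)
```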
3.1 Vertex Identifiers
For large graphs, we can see that identifiers that are smaller than 128 bits are impractical via a birthday paradox argument: Let H be the number of unique values of a vertex id and q(n, H) = 1 − p(n, H) be the probability that no two values are the same in a set of n values randomly drawn from [1..H] with repetition. Then

$$q(n, H) = 1 \cdot \left(1 - \frac{1}{H}\right) \cdot \left(1 - \frac{2}{H}\right) \cdots \left(1 - \frac{n-1}{H}\right) \approx e^{-1/H} \cdot e^{-2/H} \cdots e^{-(n-1)/H} \approx e^{-n^2/(2H)}$$
Solving for n, we obtain the size of graph n(p, H) for which there will be collisions between its vertex identifiers with probability at least p:

$$n(p, H) \approx \sqrt{2H \ln \frac{1}{1 - p}}$$
Assuming 64-bit vertex identifiers and $p = 10^{-6}$, there will be collisions between ids for graphs of as few as 6.1 million vertices. Arguably, such graphs are not the target of a large-scale graph framework. Moreover, for public graphs such as the Common Crawl Graph (see Table 1), the collision probability is 26%. On the other hand, 128-bit ids allow for graphs with 820 billion vertices before collisions for $p = 10^{-15}$.
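These figures can be reproduced with a small Python check based on the approximation above (illustrative only):

```python
import math

def collision_size(p, bits):
    """Graph size at which vertex-id collisions occur with probability >= p."""
    H = 2.0 ** bits
    return math.sqrt(2.0 * H * math.log(1.0 / (1.0 - p)))

def collision_probability(n, bits):
    """Probability that n ids drawn uniformly from 2^bits values collide."""
    H = 2.0 ** bits
    return 1.0 - math.exp(-n * n / (2.0 * H))

print(collision_size(1e-6, 64))                   # ~6.1 million vertices
print(collision_probability(3_315_452_339, 64))   # ~0.26 for the Common Crawl graph (Table 1)
print(collision_size(1e-15, 128))                 # ~8.2e11, i.e. roughly 820 billion vertices
```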
3.2 Target Learning Phase
In order for each partition to learn the graph’s vertex identifiers,
a target-learning phase is executed during graph loading: After a
partition has loaded its subset of the graph, it knows the partition
out-degrees for each of its peer partitions, i.e. the number of edges
that point to graph vertices in each of the other partitions. In the first
part of the target-learning phase, partitions also collect partition
in-degrees, i.e. the number of edges they should be expecting from
each peer partition during each superstep. In the second part of the
target-learning phase, each partition scans its subset of the graph
and sends out messages that contain the edges’ primary target
identifiers. The partition can quickly compute the target for each
edge as it knows the exact range of identifiers each partition is
responsible for. At the same time, each partition receives sets of
messages from its peers. For each message, it obtains the mapping
from the primary vertex identifier to the local vertex identifier,
which it then places at the end of the incoming order vector for
the corresponding source partition. After sending out all messages,
each partition sends termination messages to each peer.
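A minimal sketch of the receiving side of this phase, with hypothetical names (the paper does not prescribe concrete data structures):

```python
def learn_orders(primary_messages, primary_to_local):
    """Build per-source order vectors during the target-learning phase (sketch).

    primary_messages : iterable of (source_partition_id, primary_target_id) pairs,
                       in the order the messages arrive from each source partition
    primary_to_local : dict mapping this partition's primary ids to local vertex ids
    Returns dict: source partition id -> list of local vertex ids, which later allows
    payload-only messages from that source to be delivered in order.
    """
    orders = {}
    for src, primary_id in primary_messages:
        local_id = primary_to_local[primary_id]
        orders.setdefault(src, []).append(local_id)
    return orders
```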
3.3 Partition Learning Phase
Being able to send payloads without target ids significantly reduces communication requirements and surfaces a new opportunity for optimization. First, we note that it is no longer necessary for each source partition to know the primary vertex identifiers of its edges' targets. This information would be necessary for a traditional communication pattern, as the target partition needs to associate the primary id to a local id in order to process each payload appropriately. However, in the implicit targeting communication pattern, it is sufficient for the source partition to know just the target partition id, and for the target partition to know the originating partition for each set of incoming payloads. Therefore, each partition iterates over its edges, deduces their target partitions, and stores only target partition identifiers for each of the edges. This typically reduces the footprint for each partition's edges eight-fold (since 128-bit identifiers for vertices are replaced with 16-bit identifiers for the partitions they belong to) and speeds up iterating over a partition's edges. The partition learning phase occurs concurrently with the target learning phase. We note that the time required to establish the implicit targeting communication pattern is always less than the execution time of a single superstep that iterates over all graph edges in a traditional communication pattern, and is therefore very efficient.
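A sketch of this target-to-partition compression, assuming each partition owns a contiguous range of primary identifiers as described in Section 3.2 (names and layout are illustrative):

```python
import bisect

def partition_of(primary_id, range_starts):
    """Map a primary vertex id to the id of the partition that owns it, given the
    sorted list of each partition's first primary id (sketch)."""
    return bisect.bisect_right(range_starts, primary_id) - 1

def compress_edge_targets(edge_targets, range_starts):
    """Replace each edge's primary target id with its target partition id, shrinking
    the per-edge footprint roughly eight-fold (128-bit ids -> 16-bit partition ids)."""
    return [partition_of(t, range_starts) for t in edge_targets]

# Example: three partitions whose id ranges start at 0, 1000 and 2000.
assert compress_edge_targets([5, 1500, 2047], [0, 1000, 2000]) == [0, 1, 2]
```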
4 MESSAGE AGGREGATION
Maintaining in-order vectors per partition introduces another challenge to the system: as each target partition may access its K − 1 incoming-message orders non-sequentially, it would be inefficient to store them on secondary storage, so they are maintained in memory, which is a limited resource. While graph partitions are typically well-balanced, inlinks per partition are not. In fact, we have observed inlink count imbalances of five-fold or more. One way to address this challenge would be to rebalance the graph as a preprocessing step. However, graph partitioning algorithms are expensive to execute on large-scale graphs, so any benefit is typically not enough to offset the delay they introduce to the overall execution time, except for very long-running computations. An alternative to graph rebalancing would be to combine messages with the same target primary identifier on the source side. This approach would indeed address the inlinks imbalance. However, in order to support source-side aggregation, one needs to maintain a structure that is $O(p_u)$, where $p_u$ is the number of unique edge targets in the partition. Hash-based graph partitioning implies that $p_u = O(p_s)$, where $p_s$ is the number of edge targets in the partition. Therefore, this approach simply moves the memory bottleneck to a different component of the system.
A natural way to refine this approach would be to selectively perform source-side aggregation only for the very frequent target primary ids (target ids with high inlink counts within the partition). This optimization indeed removes the memory overhead of source-side aggregation. Nevertheless, it introduces another challenge: it is not possible to deterministically compute the frequency of the elements in an array of size n unless Ω(n) memory is used [23]. Although this would be a one-time operation, it is still inefficient for very large graphs. We observe the following: we do not really need to know the exact frequency of all target ids in each partition. We only need to identify those that cross a given frequency threshold. This greatly simplifies the time and space complexity of the problem and allows us to adopt the algorithm in Section 2.2. In this context, n is equal to $p_s$.
The algorithm is executed while loading the graph partition. Each
primary target identifier is examined via Process() (Algorithm 1).
It is essentially executed at no cost, as graph loading is heavily
I/O bound and the processing of each vertex is pipelined with the
loading of the next vertex. We note that the elements in the set FI
of the resulting H target identifiers are not guaranteed to appear
more than n/(H + 1) times. Rather, it is only guaranteed that the
target identifiers whose frequency is larger than n/(H + 1) will be
part of the resulting set of H ids.
Let us now examine the performance of a system that adopts source-side aggregation. Let v be a vertex with large in-degree $d_v$, K be the number of partitions and $\mu = d_v/K$. Conditioned on v, the sources of all edges $(s_i, v)$ are evenly distributed across all K partitions. Let us consider applying aggregation on a specific partition with frequency threshold $f = \mu$. Then the probability that v is in FI is 50%. As a result, we expect $d_v/2 + K$ messages to be sent across all partitions corresponding to the $(s_i, v)$ edges. This approach is very inefficient, as it only discards half of the messages towards vertex v. Fortunately, we will subsequently show that almost no edge escapes source-side aggregation when $f = \mu/2$.
Lemma 4.1. The probability that a specific partition P has at most k edges towards vertex v with in-degree $d_v$ is:

$$\sum_{i=0}^{k} \binom{\mu K}{i} \left(\frac{1}{K}\right)^{i} \left(1 - \frac{1}{K}\right)^{\mu K - i} \qquad (1)$$

Proof. Let $X_i$ be the indicator variable that denotes whether the i-th edge has been placed in P. Then $X_i$ is a Bernoulli RV with success probability $1/K$. Therefore $\sum_i X_i$ follows the binomial distribution characterized by $d_v$ trials and $1/K$ success probability. □
We will use Hoeffding’s inequality to bound Eq. (1).
Theorem 4.2 (Hoeffding [17]). Let $X_i$ be $d_v$ IID Bernoulli RVs with success probability $1/K$. Then it holds that:

$$P\left(\sum_i X_i \le \mu - \epsilon d_v\right) \le e^{-2\epsilon^2 d_v} \qquad (2)$$
We can now bound the probability that a specific partition does
not apply source-side aggregation to the edges it is responsible for
that are targeting vertex v.
Theorem 4.3. The probability that a specific partition P has at most $\mu/2$ edges towards vertex v is bounded by $e^{-\mu/(2K)}$.

Proof. Let $\epsilon = 1/(2K)$. Then from Eq. (2) we obtain:

$$P\left(\sum_i X_i \le \mu/2\right) \le e^{-2\epsilon^2 d_v} = e^{-\mu/(2K)}. \qquad \square$$
We can now compute the expected number of messages that will
be transmitted towards vertex v from all partitions K.
Theorem 4.4. The expected number of messages that are targeting vertex v is at most:

$$K + \frac{\mu K}{2}\, e^{-\mu/(2K)} \qquad (3)$$
Table 1: Public Datasets

| Dataset | Vertices | Edges |
| --- | --- | --- |
| Common Crawl [1, 3] | 3,315,452,339 | 128,736,952,320 |
| WebUK 2006–2007 [7] | 133,633,040 | 5,507,679,822 |
| Twitter [2] | 52,579,682 | 1,963,263,821 |
Table 2: Web Graph Properties

| Property | Value |
| --- | --- |
| URLs | 37.73 Billion |
| Total Outlinks | 3.076 Trillion |
| Graph Storage Footprint | 99.2 TB |
Proof. There are at most K partitions on which source-side aggregation can be applied, therefore at most K messages will be emitted as a result of such aggregations. On the partitions where source-side aggregation is not applied, at most $\mu/2$ edges will be targeting vertex v. By linearity of expectation, at most $\frac{\mu K}{2}\, e^{-\mu/(2K)}$ non-aggregated edges will reach vertex v. □
By Eq. (3) we see that this simple, coordination-free scheme applies an exponentially small dampening factor to $d_v$. In Section 5, we will see in practice that source-side aggregation almost completely eliminates inlink imbalance even for small values of H.
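As an illustration of Eq. (3), with numbers chosen for exposition rather than taken from the paper, consider a vertex with in-degree $d_v = 10^9$ on $K = 5000$ partitions, so that $\mu = d_v/K = 2 \times 10^5$:

$$K + \frac{\mu K}{2}\, e^{-\mu/(2K)} = 5000 + \frac{2\times 10^5 \cdot 5000}{2}\, e^{-20} \approx 5000 + 5\times 10^8 \cdot 2.1\times 10^{-9} \approx 5001,$$

so essentially one aggregated message per partition reaches v, instead of $10^9$ individual messages.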
5 EVALUATION
We explore the performance of PageRank on Hronos and compare
it with Giraph [12]. Giraph is a production-hardened framework
that has been shown to scale to 1 trillion edge graphs [12]. A recent
comparison between Pregel+, GPS, Giraph and GraphLab on the
WebUK[7] and Twitter[2] datasets for PageRank can be found in
[34]. We provide results for publicly available graphs (Table 1) and
large proprietary web graphs (Table 2). We collected results on a
3,000-node cluster of the following node configuration: 64GB RAM,
2x Intel Xeon E5-2620, 10Gbps Ethernet.
5.1 Public Datasets
We first provide results for Hronos and Giraph on the public datasets (see Table 3). We observe mean speedups of more than 15X for 64 threads, and more than 30X for 5,000 threads. We observe that Giraph appears to favor fewer workers, each of which uses multiple threads, rather than single-thread workers. For each experiment, we show the best Workers×Threads combination that we could find via a parameter sweep. Better results could not be obtained for any of the experiments even when we allowed for up to 40,000 threads. For all Giraph experiments, workers were given the full memory of the nodes.
5.2 Web Graphs
We provide results on a web graph of 37.7 billion vertices and 3.076 trillion edges (see Table 2). Loading and pre-processing of the graph partitions consumes approximately 9 minutes for the slowest mapper. Service discovery requires negligible time using a single-node ZooKeeper. We first examine how well balanced the graph splits are.
Figure 3: Absolute execution times for a single PageRank iteration. Results are shown for a web graph of 37.7 billion vertices and 3.076 trillion edges, for the explicit pattern (graph on disk), explicit pattern (graph in memory), explicit pattern (graph in memory; H = 30000), implicit pattern (graph on disk; H = 10000), and implicit pattern (graph in memory; H = 10000).
Figure 4: Maximum / mean memory required vs Frequent
Items H. Memory imbalance is 19.68% for H = 30000.
Figure 5: Relative execution times for the implicit pattern for Frequent Items H. For H = 30000, execution time is 96.1% compared to the case where H = 0. Overall messages sent are 97% of the total when using source-side aggregation with H = 30000.
Table 3: Hronos Performance for a single PageRank Iteration on Public Datasets

Hronos:

| Dataset | P=64 | P=384 | P=5000 |
| --- | --- | --- | --- |
| Common Crawl 2012 | 128.19 | 7.63 | 2.45 |
| WebUK 2006 | 3.64 | 0.53 | 0.1 |
| Twitter | 1.06 | 0.21 | 0.1 |

Giraph:

| Dataset | P=64 | P=384 | P=5000 |
| --- | --- | --- | --- |
| Common Crawl 2012 | – | – | 67.51 |
| WebUK 2006 | 51.6 | 9.54 | 9.50 |
| Twitter | 17.57 | 6.05 | 9.21 |

Hronos Speedup over Giraph:

| Dataset | P=64 | P=384 | P=5000 |
| --- | --- | --- | --- |
| Common Crawl 2012 | – | – | 27.55X |
| WebUK 2006 | 14.17X | 18.00X | 95.00X+ |
| Twitter | 16.58X | 28.81X | 92.10X+ |
Figure 6: Relative execution times for one PageRank iteration for the implicit pattern (H = 1k) and the explicit pattern (graph in memory) for 1000, 2000, 3000, 4000 and 5000 graph splits.
As expected, partitions become more imbalanced as the split count increases. Nevertheless, in the worst case, the largest partition in the 5,000-way graph split is only 12.24% larger than average.
In Figure 3 we show absolute execution times for the various
communication patterns. In the implicit targeting communication
pattern, the system is able to complete one PageRank iteration in
34.4 seconds. We observe a speedup of 7.2X of the implicit pattern
over the explicit pattern. The overhead of scanning the graph from
disk rather than from memory is negligible in the explicit pattern,
as the bottleneck in this case is communication. The opposite is
true for the implicit pattern, where scanning the graph from disk
is ∼30% slower. In addition to our main web graph, we collected performance results on the largest web graph we were granted access to, with 96.3 billion vertices and 5.874 trillion edges. For this we used the implicit pattern with H = 10000 and 64-bit payloads, and measured 134-second iterations.
Figure 7: Relative execution times for a single PageRank iteration for the implicit pattern (H = 10000) and the explicit pattern (graph in memory) for 5000 graph splits and 20%, 40%, 60%, 80% graph sub-sampling.
We then examine the number of inlinks to a partition when source-side aggregation is used for different values of H. First, we observe that inlink imbalance is indeed a significant problem. Specifically, the largest inlink count is more than five times larger than average (Figure 4). Inlink count dictates the size of the in-order vectors for the implicit pattern and therefore has a direct impact on memory requirements. Source-side aggregation, through the detection of high-frequency inlinks, resolves the imbalance even for relatively small values of H. For instance, for H = 30000, the worst-case imbalance is 19.68%. We also measure the performance of the algorithm for various values of H and observe there is no overhead when using aggregation. In fact, we observe a performance increase in accordance with the reduction in overall messages sent over the network. For instance, when H = 30000, the messages sent over the network are 97% of all messages, while execution time improves to 96% compared to the baseline case (see Figure 5).
Finally, Hronos exhibits strong and weak scalability properties: (1) we observe that performance improves almost linearly with the number of graph partitions for both the implicit and explicit patterns (Figure 6); and (2) we sub-sample the web graph in order to inspect performance as the graph size increases while keeping a fixed split count, and observe that the system scales linearly with graph size (Figure 7).
6 CONCLUSION
In this work, we introduced a novel communication paradigm for graph processing frameworks that is optimized for PageRank-like workloads. We evaluated its performance on graphs of up to 96 billion vertices and 5.9 trillion edges and demonstrated more than an order of magnitude improvement over the state-of-the-art.
ACKNOWLEDGMENTS
We would like to thank Dipen Rughwani for collecting experimental results on Giraph and Kostas Tsioutsiouliklis for insightful discussions and proof-reading earlier versions of this work.
REFERENCES
[1] 2016. Common Crawl. http://commoncrawl.org/.
[2] 2016. Twitter User Graph. http://konect.uni-koblenz.de/networks/twitter_mpi.
[3] 2016. Web Data Commons. http://webdatacommons.org/.
[4] 2017. Apache Giraph. http://giraph.apache.org/.
[5] 2020. Facebook Company Info. http://newsroom.fb.com/company-info/.
[6] 2020. How Search organizes information. https://www.google.com/search/howsearchworks/crawling-indexing/.
[7] Paolo Boldi, Massimo Santini, and Sebastiano Vigna. 2008. A Large Time-Aware Graph. SIGIR Forum 42, 2 (2008), 33–38.
[8] Robert S. Boyer and J. Strother Moore. 1991. MJRTY - A Fast Majority Vote Algorithm. In Automated Reasoning, Robert S. Boyer (Ed.). Automated Reasoning Series, Vol. 1. Springer Netherlands, 105–117.
[9] Yingyi Bu, Bill Howe, Magdalena Balazinska, and Michael D. Ernst. 2010. HaLoop: Efficient iterative data processing on large clusters. Proceedings of the VLDB Endowment 3, 1-2 (2010), 285–296.
[10] Aydın Buluç and John R. Gilbert. 2011. The Combinatorial BLAS: Design, implementation, and applications. The International Journal of High Performance Computing Applications 25, 4 (2011), 496–509.
[11] Rong Chen, Jiaxin Shi, Yanzhe Chen, and Haibo Chen. 2015. Powerlyra: Differentiated graph computation and partitioning on skewed graphs. In Proceedings of the Tenth European Conference on Computer Systems. ACM, 1.
[12] Avery Ching, Sergey Edunov, Maja Kabiljo, Dionysios Logothetis, and Sambavi Muthukrishnan. 2015. One Trillion Edges: Graph Processing at Facebook-Scale. Proceedings of the VLDB Endowment 8, 12 (2015).
[13] Graham Cormode and Marios Hadjieleftheriou. 2008. Finding Frequent Items in Data Streams. Proceedings of the VLDB Endowment 1, 2 (2008), 1530–1541.
[14] J. E. Gonzalez, Y. Low, H. Gu, D. Bickson, and C. Guestrin. 2012. PowerGraph: Distributed Graph-Parallel Computation on Natural Graphs. OSDI (2012).
[15] J. E. Gonzalez, R. S. Xin, and A. Dave. [n.d.]. GraphX: Graph Processing in a Distributed Dataflow Framework. In OSDI'14 Proceedings of the 11th USENIX Conference on Operating Systems Design and Implementation.
[16] Minyang Han, Khuzaima Daudjee, Khaled Ammar, M. Tamer Özsu, Xingfang Wang, and Tianqi Jin. 2014. An experimental comparison of Pregel-like graph processing systems. Proceedings of the VLDB Endowment 7, 12 (Aug. 2014), 1047–1058. https://doi.org/10.14778/2732977.2732980
[17] Wassily Hoeffding. 1963. Probability Inequalities for Sums of Bounded Random Variables. Journal of the American Statistical Association 58, 301 (1963), 13–30.
[18] Shan Jiang, Yuening Hu, Changsung Kang, Tim Daly Jr, Dawei Yin, Yi Chang, and Chengxiang Zhai. 2016. Learning Query and Document Relevance from a Web-scale Click Graph. In Proceedings of the 39th International ACM SIGIR Conference on Research and Development in Information Retrieval. ACM, 185–194.
[19] U Kang, Charalampos E. Tsourakakis, and Christos Faloutsos. 2009. Pegasus: A peta-scale graph mining system implementation and observations. In Data Mining, 2009. ICDM'09. Ninth IEEE International Conference on. IEEE, 229–238.
[20] Zuhair Khayyat, Karim Awara, Amani Alonazi, Hani Jamjoom, Dan Williams, and Panos Kalnis. 2013. Mizan: a system for dynamic load balancing in large-scale graph processing. In Proceedings of the 8th ACM European Conference on Computer Systems. ACM, 169–182.
[21] Aapo Kyrola, Guy E. Blelloch, Carlos Guestrin, et al. 2012. GraphChi: Large-Scale Graph Computation on Just a PC. In OSDI, Vol. 12. 31–46.
[22] A. N. Langville and C. D. Meyer. 2011. Google's PageRank and Beyond: The Science of Search Engine Rankings.
[23] Yibei Ling and Wei Sun. 1992. A Supplement to Sampling-based Methods for Query Size Estimation in a Database System. SIGMOD Record 21, 4 (Dec. 1992), 12–15.
[24] G. Malewicz, M. H. Austern, and A. J. C. Bik. 2010. Pregel: a system for large-scale graph processing. In Proceedings of the 2010 ACM SIGMOD International Conference on Management of Data.
[25] Derek G. Murray, Frank McSherry, Rebecca Isaacs, Michael Isard, Paul Barham, and Martín Abadi. 2013. Naiad: a timely dataflow system. In Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles. ACM, 439–455.
[26] Lawrence Page, Sergey Brin, Rajeev Motwani, and Terry Winograd. 1999. The PageRank Citation Ranking: Bringing Order to the Web. Stanford InfoLab (1999).
[27] Amitabha Roy, Laurent Bindschaedler, Jasmina Malicevic, and Willy Zwaenepoel. 2015. Chaos: Scale-out graph processing from secondary storage. In Proceedings of the 25th Symposium on Operating Systems Principles. ACM, 410–424.
[28] Amitabha Roy, Ivo Mihailovic, and Willy Zwaenepoel. 2013. X-Stream: edge-centric graph processing using streaming partitions. In Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles. ACM, 472–488.
[29] Semih Salihoglu and Jennifer Widom. 2013. GPS: a graph processing system. In Proceedings of the 25th International Conference on Scientific and Statistical Database Management - SSDBM. ACM Press, 1. https://doi.org/10.1145/2484838.2484843
[30] Stergios Stergiou, Dipen Rughwani, and Kostas Tsioutsiouliklis. 2018. Shortcutting label propagation for distributed connected components. In Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining. ACM, 540–546.
[31] Stergios Stergiou, Zygimantas Straznickas, Rolina Wu, and Kostas Tsioutsiouliklis. [n.d.]. Distributed Negative Sampling For Word Embeddings. In AAAI'17 Thirty-First AAAI Conference on Artificial Intelligence.
[32] Yuanyuan Tian, Andrey Balmin, Severin Andreas Corsten, Shirish Tatikonda, and John McPherson. 2013. From think like a vertex to think like a graph. Proceedings of the VLDB Endowment 7, 3 (2013), 193–204.
[33] Da Yan, James Cheng, Yi Lu, and Wilfred Ng. 2014. Blogel: A block-centric framework for distributed computation on real-world graphs. Proceedings of the VLDB Endowment 7, 14 (2014), 1981–1992.
[34] Da Yan, James Cheng, Yi Lu, and Wilfred Ng. 2015. Effective techniques for message reduction and load balancing in distributed graph computation. In Proceedings of the 24th International Conference on World Wide Web. 1307–1317.
[35] H. Zhao and J. Canny. 2014. Kylix: A Sparse Allreduce for Commodity Clusters. In 43rd International Conference on Parallel Processing (ICPP).