Distributed Coloring with Õ(√log n) Bits
K Kothapalli, M Onus, C Scheideler, C Schindelhauer
Proc. of IEEE International Parallel and Distributed Processing Symposium …
We consider the well-known vertex coloring problem: given a graph G, find a coloring of its vertices so that no two neighbors in G have the same color. It is trivial to see that every graph of maximum degree ∆ can be colored with ∆+1 colors, and distributed algorithms that find a (∆+1)-coloring in a logarithmic number of communication rounds, with high probability, have been known for more than a decade. This is in general the best possible if only a constant number of bits can be sent along every edge in each round. In fact, we show that for the n-node cycle the bit complexity of the coloring problem is Ω(log n). More precisely, if only one bit can be sent along each edge in a round, then every distributed coloring algorithm (i.e., an algorithm in which every node has the same initial state and initially only knows its own edges) needs at least Ω(log n) rounds, with high probability, to color the n-node cycle, for any finite number of colors.
But what if the edges have orientations, i.e., the endpoints of an edge agree on its orientation (while bits may still flow in both directions)? Edge orientations naturally occur in dynamic networks where new nodes establish connections to old nodes. Does this allow one to provide faster coloring algorithms?
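To make the (∆+1)-coloring setting concrete, here is a minimal sequential simulation of a simple randomized round-based scheme. It is a sketch under stated assumptions, not the bit-efficient algorithm of the paper: each node proposes a whole color per round (more than one bit), and the function name `randomized_coloring` and its parameters are illustrative.

```python
import random

def randomized_coloring(adjacency, seed=0):
    """Simulate synchronous rounds of a simple randomized (max_degree+1)-coloring.

    adjacency: dict mapping each node to a list of its neighbors (undirected).
    Returns (color, rounds). Illustrative only; not the paper's algorithm.
    """
    rng = random.Random(seed)
    max_degree = max((len(nbrs) for nbrs in adjacency.values()), default=0)
    palette = range(max_degree + 1)
    color = {}
    rounds = 0
    while len(color) < len(adjacency):
        rounds += 1
        # Every uncolored node proposes a color not used by a colored neighbor.
        proposal = {}
        for v, nbrs in adjacency.items():
            if v in color:
                continue
            taken = {color[u] for u in nbrs if u in color}
            free = [c for c in palette if c not in taken]  # never empty
            proposal[v] = rng.choice(free)
        # A node keeps its proposal only if no neighbor proposed the same color.
        for v, c in proposal.items():
            if all(proposal.get(u) != c for u in adjacency[v]):
                color[v] = c
    return color, rounds

# Example: the 10-node cycle, which has maximum degree 2 (so 3 colors).
cycle = {i: [(i - 1) % 10, (i + 1) % 10] for i in range(10)}
coloring, rounds_used = randomized_coloring(cycle)
```

The palette always has a free color because a node has at most ∆ neighbors, so at most ∆ of the ∆+1 colors can be taken.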
In this presentation we show that there exist graphs on which greedy routing takes a long time. This shows that the small world is a social phenomenon and not a mathematical one.
SATISFIABILITY METHODS FOR COLOURING GRAPHS - cscpconf
The graph colouring problem can be solved using methods based on Satisfiability (SAT). An instance of the problem is defined by a Boolean expression written using Boolean variables and the logical connectives AND, OR and NOT. It has to be determined whether there is an assignment of TRUE and FALSE values to the variables that makes the entire expression true. A SAT problem is syntactically and semantically quite simple. Many Constraint Satisfaction Problems (CSPs) in AI and OR can be formulated in SAT. These make use of two kinds of search algorithms: deterministic and randomized. It has been found that deterministic methods, when run on hard CSP instances, are frequently very slow in execution. A deterministic method always outputs a solution in the end, but it can take an enormous amount of time to do so. This has led to the development of randomized search algorithms like GSAT, which are typically based on local (i.e., neighbourhood) search. Such methods have been applied very successfully to find good solutions to hard decision problems.
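One common way to express k-colouring as SAT can be sketched as follows. The variable numbering and the function names `coloring_to_cnf` and `brute_force_sat` are assumptions made for illustration; the exhaustive solver merely stands in for a deterministic method and is only usable on tiny instances, where a local-search solver such as GSAT would normally be plugged in instead.

```python
from itertools import product

def coloring_to_cnf(num_vertices, edges, k):
    """Encode k-colouring as CNF clauses over variables numbered from 1.

    Variable var(v, c) is true when vertex v receives colour c.
    A positive integer is a variable, a negative integer its negation.
    """
    def var(v, c):
        return v * k + c + 1

    clauses = []
    for v in range(num_vertices):
        # Each vertex gets at least one colour ...
        clauses.append([var(v, c) for c in range(k)])
        # ... and at most one colour.
        for c1 in range(k):
            for c2 in range(c1 + 1, k):
                clauses.append([-var(v, c1), -var(v, c2)])
    for u, v in edges:
        # Adjacent vertices never share a colour.
        for c in range(k):
            clauses.append([-var(u, c), -var(v, c)])
    return clauses

def brute_force_sat(clauses, num_vars):
    """Try every assignment; return a satisfying tuple of booleans or None."""
    for bits in product([False, True], repeat=num_vars):
        if all(any(bits[abs(lit) - 1] == (lit > 0) for lit in clause)
               for clause in clauses):
            return bits
    return None

triangle = [(0, 1), (1, 2), (0, 2)]
two_colours = brute_force_sat(coloring_to_cnf(3, triangle, 2), 3 * 2)
three_colours = brute_force_sat(coloring_to_cnf(3, triangle, 3), 3 * 3)
```

For the triangle, the 2-colour formula has no satisfying assignment, while the 3-colour formula does.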
Graph coloring is an important concept in graph theory. It is a special kind of problem in which we have to assign colors to certain elements of the graph subject to certain constraints. Suppose we are given K colors and have to color the vertices in such a way that no two adjacent vertices of the graph have the same color; this is known as vertex coloring. Similarly, we have edge coloring and face coloring. The coloring problem has a huge number of applications in modern computer science, such as timetable scheduling, Sudoku, bipartite graphs, map coloring, data mining, and networking. In this paper we focus on certain applications: final exam timetabling, aircraft scheduling, and guarding an art gallery.
A Simple Method to Build a Paper-Based Color Check Print of Colored Fabrics b... - CSCJournals
An open loop color management system is implemented to reproduce the analog colors of a set of colored fabrics on a digital inkjet printer. A tetrahedral interpolation technique is designed for mapping between device-dependent (RGB) and device-independent (CIELAB) color spaces. A set of 3164 color patches is used as the training set in a 3-D LookUp Table (LUT) to characterize the color printer. Then, the designed color management system is examined by the colorimetric reproduction of a set of 30 colored fabrics using the conventional inkjet printer. The performance of the system is numerically evaluated by measuring the color difference values between the original and the reproduced samples. The results showed that the color reproduction system works appropriately both for samples located inside the color gamut of the output device, i.e. the printer, and for out-of-gamut samples, while the latter logically lead to greater errors.
Chromatic Number of a Graph (Graph Colouring) - Adwait Hegde
A graph coloring is an assignment of labels, called colors, to the vertices of a graph such that no two adjacent vertices share the same color. The chromatic number χ(G) of a graph G is the minimal number of colors for which such an assignment is possible.
Given a graph G=(V,E) and a set T of non-negative integers containing 0, a T-coloring of G is an integer function f of the vertices of G such that |f(u)-f(v)|∉T whenever uv∈E. The edge-span of a T-coloring is the maximum value of |f(u)-f(v)| over all edges uv, and the T-edge-span of a graph G is the minimum value of the edge-span among all possible T-colorings of G. This paper discusses the T-edge-span of the folded hypercube network of dimension n for the k-multiple-of-s set, T={0, s, 2s, …, ks}∪S, where s, k≥1 and S⊆{s+1, s+2, …, ks-1}.
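The definitions above translate almost directly into code. The following is a small sketch for tiny graphs only; the function names and the `max_label` search bound are assumptions for illustration, and the exhaustive search returns the true T-edge-span only when `max_label` is large enough to contain an optimal T-coloring.

```python
from itertools import product

def is_t_coloring(edges, f, T):
    """True if |f(u) - f(v)| is not in T for every edge uv."""
    return all(abs(f[u] - f[v]) not in T for u, v in edges)

def edge_span(edges, f):
    """Maximum of |f(u) - f(v)| over all edges uv."""
    return max(abs(f[u] - f[v]) for u, v in edges)

def t_edge_span(vertices, edges, T, max_label):
    """Minimum edge-span over all T-colorings with labels in 0..max_label.

    Exhaustive search; returns None if no T-coloring exists in that range.
    """
    best = None
    for labels in product(range(max_label + 1), repeat=len(vertices)):
        f = dict(zip(vertices, labels))
        if is_t_coloring(edges, f, T):
            span = edge_span(edges, f)
            if best is None or span < best:
                best = span
    return best

triangle_vertices = ["a", "b", "c"]
triangle_edges = [("a", "b"), ("b", "c"), ("a", "c")]
span_triangle = t_edge_span(triangle_vertices, triangle_edges, {0, 1}, max_label=6)
```

With T={0,1} the three labels of a triangle must be pairwise at least 2 apart, so the smallest achievable edge-span is 4 (for example with labels 0, 2, 4).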
We provide a review of the recent literature on statistical risk bounds for deep neural networks. We also discuss some theoretical results that compare the performance of deep ReLU networks to other methods such as wavelets and spline-type methods. The talk will moreover highlight some open problems and sketch possible new directions.
A linear algorithm for the grundy number of a tree - ijcsit
A coloring of a graph G = (V, E) is a partition {V1, V2, ..., Vk} of V into independent sets or color classes. A vertex v ∈ Vi is a Grundy vertex if it is adjacent to at least one vertex in each color class Vj for every j < i. A coloring is a Grundy coloring if every color class contains at least one Grundy vertex, and the Grundy number of a graph is the maximum number of colors in a Grundy coloring. We derive a natural upper bound on this parameter and show that graphs with sufficiently large girth achieve equality in the bound. In particular, this gives a linear-time algorithm to determine the Grundy number of a tree.
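The definition as stated in this abstract can be checked exhaustively on very small graphs. The sketch below does exactly that and nothing more: it is exponential, it is not the linear-time tree algorithm the abstract refers to, and the function names are illustrative assumptions.

```python
from itertools import product

def every_class_has_grundy_vertex(adjacency, color, k):
    """Check that each class i in 0..k-1 contains a vertex adjacent to
    at least one vertex of every earlier class j < i."""
    for i in range(k):
        members = [v for v in adjacency if color[v] == i]
        if not any(
            all(any(color[u] == j for u in adjacency[v]) for j in range(i))
            for v in members
        ):
            return False  # also covers the case of an empty class
    return True

def grundy_number_bruteforce(adjacency):
    """Largest k admitting a proper coloring that satisfies the definition above."""
    vertices = list(adjacency)
    for k in range(len(vertices), 0, -1):
        for labels in product(range(k), repeat=len(vertices)):
            color = dict(zip(vertices, labels))
            proper = all(color[u] != color[v]
                         for v in vertices for u in adjacency[v])
            if proper and every_class_has_grundy_vertex(adjacency, color, k):
                return k
    return 0

path4 = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
star3 = {"x": ["p", "q", "r"], "p": ["x"], "q": ["x"], "r": ["x"]}
```

On the path with four vertices the coloring a=0, b=1, c=2, d=0 qualifies, so the result is 3; on the star with three leaves no coloring with three classes qualifies, so the result is 2.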
A Visual Exploration of Distance, Documents, and Distributions - Rebecca Bilbro
Machine learning often requires us to think spatially and make choices about what it means for two instances to be close or far apart. So which is best - Euclidean? Manhattan? Cosine? It all depends! In this talk, we'll explore open source tools and visual diagnostic strategies for picking good distance metrics when doing machine learning on text.
Developing visual material can aid recall and is also a quick way to show lots of information. Visualization helps us remember (like when we try to picture where we’ve parked our car, or what's in our cupboards when writing a shopping list). We can create diagrams and visual aids depicting module materials and put them up around the house so that we are constantly reminded of our learning.
Chapter summary and solutions to end-of-chapter exercises for "Data Visualization: Principles and Practice" book by Alexandru C. Telea
In this chapter the author discusses a number of popular visualization methods for vector datasets: vector glyphs, vector color-coding, displacement plots, stream objects, texture-based vector visualization, and the simplified representation of vector fields.
Section 6.5 presents stream objects, which use integral techniques to construct paths in vector fields. Section 6.7 discusses a number of strategies for simplified representation of vector datasets. Section 6.8 presents a number of illustrative visualization techniques for vector fields, which offer an alternative mechanism for simplified representation to the techniques discussed in Section 6.7. The chapter also presents feature detection methods, an algorithm for computing separatrices of a field's topology, and top-down and bottom-up field decomposition methods.
ON ALGORITHMIC PROBLEMS CONCERNING GRAPHS OF HIGHER DEGREE OF SYMMETRY - Fransiskeran
Since the ancient determination of the five platonic solids, the study of symmetry and regularity has always
been one of the most fascinating aspects of mathematics. One intriguing phenomenon of studies in graph
theory is the fact that quite often arithmetic regularity properties of a graph imply the existence of many
symmetries, i.e. a large automorphism group G. In some important special situations a higher degree of
regularity means that G is an automorphism group of a finite geometry. For example, a glance through the
list of distance regular graphs of diameter d < 3 reveals the fact that most of them are connected with
classical Lie geometry. The theory of distance regular graphs is an important part of algebraic combinatorics
and its applications such as coding theory, communication networks, and block design. An important tool
for the investigation of such graphs is their spectrum, which is the set of eigenvalues of the adjacency matrix of a
graph. Let G be a finite simple group of Lie type and X be the set of homogeneous elements of the associated
geometry.
An analysis between different algorithms for the graph vertex coloring problem - IJECEIAES
This research focuses on an analysis of different algorithms for the graph vertex coloring problem. Some approaches to solving the problem are discussed. Moreover, some studies for the problem and several methods for its solution are analyzed as well. An exact algorithm (using the backtracking method) is presented. The complexity analysis of the algorithm is discussed. Determining the average execution time of the exact algorithm is consistent with the multitasking mode of the operating system. This algorithm generates optimal solutions for all studied graphs. In addition, two heuristic algorithms for solving the graph vertex coloring problem are used as well. The results show that the exact algorithm can be used to solve the graph vertex coloring problem for small graphs with 30-35 vertices. For half of the graphs, all three algorithms have found the optimal solutions. The suboptimal solutions generated by the approximate algorithms are identical in terms of the number of colors needed to color the corresponding graphs. The results show that the linear increase in the number of vertices and edges of the analyzed graphs causes a linear increase in the number of colors needed to color these graphs.
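An exact backtracking approach of the kind mentioned above can be sketched as follows. This is a generic textbook-style formulation, assumed here for illustration and not taken from the study itself; the names `can_color` and `chromatic_number` are hypothetical.

```python
def can_color(adjacency, order, k, color, index=0):
    """Try to extend a partial coloring to all vertices using k colors."""
    if index == len(order):
        return True
    v = order[index]
    for c in range(k):
        # Color c is allowed if no already-colored neighbor uses it.
        if all(color.get(u) != c for u in adjacency[v]):
            color[v] = c
            if can_color(adjacency, order, k, color, index + 1):
                return True
            del color[v]  # backtrack
    return False

def chromatic_number(adjacency):
    """Return (k, coloring) with the smallest k that admits a proper coloring."""
    # Visiting high-degree vertices first tends to prune the search earlier.
    order = sorted(adjacency, key=lambda v: -len(adjacency[v]))
    for k in range(1, len(adjacency) + 1):
        color = {}
        if can_color(adjacency, order, k, color):
            return k, color
    return 0, {}

cycle5 = {i: [(i - 1) % 5, (i + 1) % 5] for i in range(5)}
complete4 = {i: [j for j in range(4) if j != i] for i in range(4)}
```

The running time grows exponentially in the worst case, which is consistent with the observation that the exact method is practical only for small graphs.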
A Quest for Subexponential Time Parameterized Algorithms for Planar-k-Path: F... - cseiitgn
The field of designing subexponential time parameterized algorithms has gained a lot of momentum lately. While a subexponential time algorithm for Planar-k-Path (finding a path of length at least k on planar graphs) has been known for the last 15 years, no such algorithm was known for directed planar graphs. In this talk, I will survey the journey of designing subexponential time parameterized algorithms for finding a path of length at least k, from planar undirected graphs to planar directed graphs, highlighting the new tools and techniques that were developed on the way.
Graphs in data structures: this gives an overview of graph applications, how to represent a graph, graph traversal, and the many terms related to graphs.
About TrueTime, Spanner, Clock synchronization, CAP theorem, Two-phase lockin... - Subhajit Sahu
TrueTime is a service that enables the use of globally synchronized clocks, with bounded error. It returns a time interval that is guaranteed to contain the clock’s actual time for some time during the call’s execution. If two intervals do not overlap, then we know calls were definitely ordered in real time. In general, synchronized clocks can be used to avoid communication in a distributed system.
The underlying source of time is a combination of GPS receivers and atomic clocks. As there are “time masters” in every datacenter (redundantly), it is likely that both sides of a partition would continue to enjoy accurate time. Individual nodes however need network connectivity to the masters, and without it their clocks will drift. Thus, during a partition their intervals slowly grow wider over time, based on bounds on the rate of local clock drift. Operations depending on TrueTime, such as Paxos leader election or transaction commits, thus have to wait a little longer, but the operation still completes (assuming the 2PC and quorum communication are working).
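The interval reasoning described above can be illustrated with a toy model. This is a sketch only: the class and function names are hypothetical and do not reproduce the real TrueTime API, and the numeric values in the example are illustrative rather than measured.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TimeInterval:
    earliest: float  # lower bound on the true time, in seconds
    latest: float    # upper bound on the true time, in seconds

def now_interval(local_clock, error_bound):
    """Interval assumed to contain the true time, given a bounded clock error."""
    return TimeInterval(local_clock - error_bound, local_clock + error_bound)

def definitely_before(a, b):
    """True only if the intervals do not overlap and a lies entirely before b."""
    return a.latest < b.earliest

def widened_error(base_error, drift_rate, seconds_since_sync):
    """Error bound after losing contact with the time masters:
    it grows linearly with the assumed bound on local clock drift."""
    return base_error + drift_rate * seconds_since_sync

a = now_interval(100.000, 0.004)
b = now_interval(100.010, 0.004)   # does not overlap a
c = now_interval(100.005, 0.004)   # overlaps a
```

Here `a` and `b` can be ordered, while `a` and `c` cannot, which is why an operation that needs a definite order has to wait until the uncertainty has passed.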
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ... - Subhajit Sahu
Abstract — Levelwise PageRank is an alternative method of PageRank computation which decomposes the input graph into a directed acyclic block-graph of strongly connected components, and processes them in topological order, one level at a time. This enables calculation of ranks in a distributed fashion without per-iteration communication, unlike the standard method where all vertices are processed in each iteration. It however comes with a precondition of the absence of dead ends in the input graph. Here, the native non-distributed performance of Levelwise PageRank was compared against Monolithic PageRank on a CPU as well as a GPU. To ensure a fair comparison, Monolithic PageRank was also performed on a graph where vertices were split by components. Results indicate that Levelwise PageRank is about as fast as Monolithic PageRank on the CPU, but quite a bit slower on the GPU. The slowdown on the GPU is likely caused by the submission of a large number of small workloads, and is expected to be a non-issue when the computation is performed on massive graphs.
Adjusting Bitset for graph : SHORT REPORT / NOTES - Subhajit Sahu
Compressed Sparse Row (CSR) is an adjacency-list based graph representation that is commonly used for efficient graph computations. Unfortunately, using CSR for dynamic graphs is impractical since addition/deletion of a single edge can require on average (N+M)/2 memory accesses, in order to update source-offsets and destination-indices. A common approach is therefore to store edge-lists/destination-indices as an array of arrays, where each edge-list is an array belonging to a vertex. While this is good enough for small graphs, it quickly becomes a bottleneck for large graphs. What causes this bottleneck depends on whether the edge-lists are sorted or unsorted. If they are sorted, checking for an edge requires about log(E) memory accesses, but adding an edge on average requires E/2 accesses, where E is the number of edges of a given vertex. Note that both addition and deletion of edges in a dynamic graph require checking for an existing edge, before adding or deleting it. If edge lists are unsorted, checking for an edge requires around E/2 memory accesses, but adding an edge requires only 1 memory access.
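The trade-off described above can be seen in a small sketch of CSR with sorted edge-lists. The function names are assumptions for illustration, and Python lists stand in for the flat arrays a real implementation would use.

```python
from bisect import bisect_left

def build_csr(num_vertices, edges):
    """Build CSR arrays (offsets, targets) from a list of directed (u, v) edges.

    The edges of vertex u are targets[offsets[u]:offsets[u + 1]], kept sorted.
    """
    adjacency = [[] for _ in range(num_vertices)]
    for u, v in edges:
        adjacency[u].append(v)
    offsets, targets = [0], []
    for neighbours in adjacency:
        targets.extend(sorted(neighbours))
        offsets.append(len(targets))
    return offsets, targets

def has_edge(offsets, targets, u, v):
    """Binary search in the sorted edge-list of u: about log(E) accesses."""
    lo, hi = offsets[u], offsets[u + 1]
    i = bisect_left(targets, v, lo, hi)
    return i < hi and targets[i] == v

def add_edge(offsets, targets, u, v):
    """Insert (u, v) if absent. Shifts later targets and updates later
    offsets, which is what makes CSR costly for dynamic graphs."""
    if has_edge(offsets, targets, u, v):
        return False
    i = bisect_left(targets, v, offsets[u], offsets[u + 1])
    targets.insert(i, v)
    for w in range(u + 1, len(offsets)):
        offsets[w] += 1
    return True

offsets, targets = build_csr(4, [(0, 2), (0, 1), (2, 3)])
```

Adding a single edge touches every offset after the source vertex and shifts every destination index after the insertion point, which matches the (N+M)/2 average cost mentioned above.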
Techniques to optimize the pagerank algorithm usually fall in two categories. One is to try reducing the work per iteration, and the other is to try reducing the number of iterations. These goals are often at odds with one another. Skipping computation on vertices which have already converged has the potential to save iteration time. Skipping in-identical vertices, with the same in-links, helps reduce duplicate computations and thus could help reduce iteration time. Road networks often have chains which can be short-circuited before pagerank computation to improve performance. Final ranks of chain nodes can be easily calculated. This could reduce both the iteration time, and the number of iterations. If a graph has no dangling nodes, pagerank of each strongly connected component can be computed in topological order. This could help reduce the iteration time, no. of iterations, and also enable multi-iteration concurrency in pagerank computation. The combination of all of the above methods is the STICD algorithm. [sticd] For dynamic graphs, unchanged components whose ranks are unaffected can be skipped altogether.
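For reference, the baseline that these optimizations target is the plain power iteration, sketched below. It assumes a graph with no dangling nodes (every vertex has at least one out-link) and implements none of the optimizations listed above; the function name and default parameters are illustrative.

```python
def pagerank(out_links, damping=0.85, tol=1e-10, max_iter=500):
    """Plain power-iteration PageRank.

    out_links: dict mapping each vertex to a non-empty list of out-neighbors.
    Returns (rank, iterations). Total cost is roughly
    (work per iteration) x (number of iterations).
    """
    vertices = list(out_links)
    n = len(vertices)
    in_links = {v: [] for v in vertices}
    for u, targets in out_links.items():
        for v in targets:
            in_links[v].append(u)
    rank = {v: 1.0 / n for v in vertices}
    iterations = 0
    for iterations in range(1, max_iter + 1):
        new_rank = {}
        for v in vertices:  # every vertex is processed in every iteration
            contribution = sum(rank[u] / len(out_links[u]) for u in in_links[v])
            new_rank[v] = (1 - damping) / n + damping * contribution
        error = sum(abs(new_rank[v] - rank[v]) for v in vertices)
        rank = new_rank
        if error < tol:
            break
    return rank, iterations

graph = {0: [1, 2], 1: [2], 2: [0], 3: [2]}
ranks, iterations_used = pagerank(graph)
```

The inner loop is the per-iteration work and the outer loop is the iteration count, the two quantities the techniques above try to reduce.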
Adjusting primitives for graph : SHORT REPORT / NOTES - Subhajit Sahu
Graph algorithms, like PageRank Compressed Sparse Row (CSR) is an adjacency-list based graph representation that is
Multiply with different modes (map)
1. Performance of sequential execution based vs OpenMP based vector multiply.
2. Comparing various launch configs for CUDA based vector multiply.
Sum with different storage types (reduce)
1. Performance of vector element sum using float vs bfloat16 as the storage type.
Sum with different modes (reduce)
1. Performance of sequential execution based vs OpenMP based vector element sum.
2. Performance of memcpy vs in-place based CUDA based vector element sum.
3. Comparing various launch configs for CUDA based vector element sum (memcpy).
4. Comparing various launch configs for CUDA based vector element sum (in-place).
Sum with in-place strategies of CUDA mode (reduce)
1. Comparing various launch configs for CUDA based vector element sum (in-place).
Experiments with Primitive operations : SHORT REPORT / NOTES - Subhajit Sahu
This includes:
- Multiply with different modes (map)
1. Performance of sequential execution based vs OpenMP based vector multiply.
2. Comparing various launch configs for CUDA based vector multiply.
- Sum with different storage types (reduce)
1. Performance of vector element sum using float vs bfloat16 as the storage type.
- Sum with different modes (reduce)
1. Performance of sequential execution based vs OpenMP based vector element sum.
2. Performance of memcpy vs in-place based CUDA based vector element sum.
3. Comparing various launch configs for CUDA based vector element sum (memcpy).
4. Comparing various launch configs for CUDA based vector element sum (in-place).
- Sum with in-place strategies of CUDA mode (reduce)
1. Comparing various launch configs for CUDA based vector element sum (in-place).
Adjusting OpenMP PageRank : SHORT REPORT / NOTES - Subhajit Sahu
For massive graphs that fit in RAM, but not in GPU memory, it is possible to take
advantage of a shared memory system with multiple CPUs, each with multiple cores, to
accelerate pagerank computation. If the NUMA architecture of the system is properly taken
into account with good vertex partitioning, the speedup can be significant. To take steps in
this direction, experiments are conducted to implement pagerank in OpenMP using two
different approaches, uniform and hybrid. The uniform approach runs all primitives required
for pagerank in OpenMP mode (with multiple threads). On the other hand, the hybrid
approach runs certain primitives in sequential mode (i.e., sumAt, multiply).
word2vec, node2vec, graph2vec, X2vec: Towards a Theory of Vector Embeddings o... - Subhajit Sahu
Below are the important points I note from the 2020 paper by Martin Grohe:
- 1-WL distinguishes almost all graphs, in a probabilistic sense
- Classical WL is two dimensional Weisfeiler-Leman
- DeepWL is an unlimited version of WL graph that runs in polynomial time.
- Knowledge graphs are essentially graphs with vertex/edge attributes
ABSTRACT:
Vector representations of graphs and relational structures, whether handcrafted feature vectors or learned representations, enable us to apply standard data analysis and machine learning techniques to the structures. A wide range of methods for generating such embeddings have been studied in the machine learning and knowledge representation literature. However, vector embeddings have received relatively little attention from a theoretical point of view.
Starting with a survey of embedding techniques that have been used in practice, in this paper we propose two theoretical approaches that we see as central for understanding the foundations of vector embeddings. We draw connections between the various approaches and suggest directions for future research.
DyGraph: A Dynamic Graph Generator and Benchmark Suite : NOTES - Subhajit Sahu
https://gist.github.com/wolfram77/54c4a14d9ea547183c6c7b3518bf9cd1
There exist a number of dynamic graph generators. The Barabási-Albert model iteratively attaches new vertices to pre-existing vertices in the graph using preferential attachment (edges to high degree vertices are more likely - rich get richer - Pareto principle). However, graph size increases monotonically, and density of the graph keeps increasing (sparsity decreasing).
Gorke's model uses a defined clustering to uniformly add vertices and edges. Purohit's model uses motifs (e.g. triangles) to mimic properties of existing dynamic graphs, such as growth rate, structure, and degree distribution. Kronecker graph generators are used to increase the size of a given graph, with power-law distribution.
To generate dynamic graphs, we must choose a metric to compare two graphs. Common metrics include diameter, clustering coefficient (modularity?), triangle counting (triangle density?), and degree distribution.
In this paper, the authors propose DyGraph, a dynamic graph generator that uses degree distribution as the only metric. The authors observe that many real-world graphs differ from the power-law distribution at the tail end. To address this issue, they propose binning, where the vertices beyond a certain degree (minDeg = min(deg) s.t. |V(deg)| < H, where H~10 is the number of vertices with a given degree below which are binned) are grouped into bins of degree-width binWidth, max-degree localMax, and number of degrees in bin with at least one vertex binSize (to keep track of sparsity). This helps the authors to generate graphs with a more realistic degree distribution.
The process of generating a dynamic graph is as follows. First the difference between the desired and the current degree distribution is calculated. The authors then create an edge-addition set where each vertex is present as many times as the number of additional incident edges it must receive. Edges are then created by connecting two vertices randomly from this set, and removing both from the set once connected. Currently, the authors reject self-loops and duplicate edges. Removal of edges is done in a similar fashion.
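The edge-addition step described above can be sketched roughly as follows. This is my reading of the notes rather than the authors' code: the function name, the `extra_degree` input, and the `max_attempts` cut-off (added so the loop always ends) are assumptions.

```python
import random

def add_edges(existing_edges, extra_degree, seed=0, max_attempts=10_000):
    """Randomly pair up vertices from an edge-addition set.

    existing_edges: iterable of undirected (u, v) pairs already in the graph.
    extra_degree:   dict vertex -> number of additional incident edges wanted.
    Returns the list of newly added (u, v) edges.
    """
    rng = random.Random(seed)
    # Each vertex appears once per additional incident edge it must receive.
    pool = [v for v, count in extra_degree.items() for _ in range(count)]
    edges = {frozenset(e) for e in existing_edges}
    added = []
    attempts = 0
    while len(pool) >= 2 and attempts < max_attempts:
        attempts += 1
        i, j = rng.sample(range(len(pool)), 2)
        u, v = pool[i], pool[j]
        # Reject self-loops and duplicate edges.
        if u == v or frozenset((u, v)) in edges:
            continue
        edges.add(frozenset((u, v)))
        added.append((u, v))
        # Remove both endpoints from the set once connected.
        for index in sorted((i, j), reverse=True):
            pool.pop(index)
    return added

wanted = {0: 2, 1: 2, 2: 1, 3: 1}
new_edges = add_edges([(0, 1)], wanted, seed=1)
```

Because rejected pairs are simply retried, some requested degrees may remain unmet when only self-loops or duplicates are left in the pool.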
Authors observe that adding edges with power-law properties dominates the execution time, and consider parallelizing DyGraph as part of future work.
My notes on shared memory parallelism.
Shared memory is memory that may be simultaneously accessed by multiple programs with an intent to provide communication among them or avoid redundant copies. Shared memory is an efficient means of passing data between programs. Using memory for communication inside a single program, e.g. among its multiple threads, is also referred to as shared memory [REF].
A Dynamic Algorithm for Local Community Detection in Graphs : NOTES - Subhajit Sahu
**Community detection methods** can be *global* or *local*. **Global community detection methods** divide the entire graph into groups. Existing global algorithms include:
- Random walk methods
- Spectral partitioning
- Label propagation
- Greedy agglomerative and divisive algorithms
- Clique percolation
https://gist.github.com/wolfram77/b4316609265b5b9f88027bbc491f80b6
There is a growing body of work in *detecting overlapping communities*. **Seed set expansion** is a **local community detection method** where relevant *seed vertices* of interest are picked and *expanded to form communities* surrounding them. The quality of each community is measured using a *fitness function*.
**Modularity** is a *fitness function* which compares the number of intra-community edges to the expected number in a random-null model. **Conductance** is another popular fitness score that measures the community cut or inter-community edges. Many *overlapping community detection* methods **use a modified ratio** of intra-community edges to all edges with at least one endpoint in the community.
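Both fitness scores can be computed in a few lines for an undirected, unweighted graph without self-loops. The sketch below uses the standard textbook formulas, which I am assuming match the ones intended in these notes; the function names are illustrative.

```python
def conductance(edges, community):
    """Cut edges divided by the smaller of the two volumes (degree sums)."""
    community = set(community)
    cut = sum(1 for u, v in edges if (u in community) != (v in community))
    volume_in = sum((u in community) + (v in community) for u, v in edges)
    volume_out = 2 * len(edges) - volume_in
    smaller = min(volume_in, volume_out)
    return cut / smaller if smaller else 0.0

def modularity(edges, communities):
    """Sum over communities of (intra-edge fraction) minus
    (expected fraction under a degree-preserving random null model)."""
    m = len(edges)
    label = {v: i for i, members in enumerate(communities) for v in members}
    intra = [0] * len(communities)
    degree = [0] * len(communities)
    for u, v in edges:
        degree[label[u]] += 1
        degree[label[v]] += 1
        if label[u] == label[v]:
            intra[label[u]] += 1
    return sum(intra[i] / m - (degree[i] / (2 * m)) ** 2
               for i in range(len(communities)))

# Two triangles joined by a single bridge edge (2, 3).
two_triangles = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
phi = conductance(two_triangles, {0, 1, 2})
q = modularity(two_triangles, [{0, 1, 2}, {3, 4, 5}])
```

For this graph one triangle has one cut edge and volume 7, so its conductance is 1/7, and splitting the graph into the two triangles gives a modularity of 5/14.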
Andersen et al. use a **Spectral PageRank-Nibble method** which minimizes conductance and is formed by adding vertices in order of decreasing PageRank values. Andersen and Lang develop a **random walk approach** in which some vertices in the seed set may not be placed in the final community. Clauset gives a **greedy method** that *starts from a single vertex* and then iteratively adds neighboring vertices *maximizing the local modularity score*. Riedy et al. **expand multiple vertices** via maximizing modularity.
Several algorithms for **detecting global, overlapping communities** use a *greedy*, *agglomerative approach* and run *multiple separate seed set expansions*. Lancichinetti et al. run **greedy seed set expansions**, each with a *single seed vertex*. Overlapping communities are produced by sequentially running expansions from a node not yet in a community. Lee et al. use **maximal cliques as seed sets**. Havemann et al. **greedily expand cliques**.
The authors of this paper discuss a dynamic approach for **community detection using seed set expansion**. Simply marking the neighbours of changed vertices is a **naive approach**, and has *severe shortcomings*. This is because *communities can split apart*. The simple updating method *may fail even when it outputs a valid community* in the graph.
Scalable Static and Dynamic Community Detection Using Grappolo : NOTESSubhajit Sahu
A **community** (in a network) is a subset of nodes which are _strongly connected among themselves_, but _weakly connected to others_. Neither the number of output communities nor their size distribution is known a priori. Community detection methods can be divisive or agglomerative. **Divisive methods** use _betweenness centrality_ to **identify and remove bridges** between communities. **Agglomerative methods** greedily **merge two communities** that provide maximum gain in _modularity_. Newman and Girvan have introduced the **modularity metric**. The problem of community detection is then reduced to the problem of modularity maximization which is **NP-complete**. **Louvain method** is a variant of the _agglomerative strategy_, in that it is a _multi-level heuristic_.
https://gist.github.com/wolfram77/917a1a4a429e89a0f2a1911cea56314d
In this paper, the authors discuss **four heuristics** for Community detection using the _Louvain algorithm_ implemented upon recently developed **Grappolo**, which is a parallel variant of the Louvain algorithm. They are:
- Vertex following and Minimum label
- Data caching
- Graph coloring
- Threshold scaling
With the **Vertex following** heuristic, the _input is preprocessed_ and all single-degree vertices are merged with their corresponding neighbours. This helps reduce the number of vertices considered in each iteration, and also helps initial seeds of communities to be formed. With the **Minimum label heuristic**, when a vertex is making the decision to move to a community and multiple communities provide the same modularity gain, the community with the smallest id is chosen. This helps _minimize or prevent community swaps_. With the **Data caching** heuristic, community information is stored in a vector instead of a map, and is reused in each iteration, but with some additional cost. With the **Vertex ordering via Graph coloring** heuristic, _distance-k coloring_ of graphs is performed in order to group vertices into colors. Then, each set of vertices (by color) is processed _concurrently_, and synchronization is performed after that. This enables us to mimic the behaviour of the serial algorithm. Finally, with the **Threshold scaling** heuristic, _successively smaller values of modularity threshold_ are used as the algorithm progresses. This allows the algorithm to converge faster, and it has been observed to give a good modularity score as well.
From the results, it appears that _graph coloring_ and _threshold scaling_ heuristics do not always provide a speedup and this depends upon the nature of the graph. It would be interesting to compare the heuristics against baseline approaches. Future work can include _distributed memory implementations_, and _community detection on streaming graphs_.
Application Areas of Community Detection: A Review : NOTESSubhajit Sahu
This is a short review of Community detection methods (on graphs), and their applications. A **community** is a subset of a network whose members are *highly connected*, but *loosely connected* to others outside their community. Different community detection methods *can return differing communities* because these algorithms are **heuristic-based**. **Dynamic community detection** involves tracking the *evolution of community structure* over time.
https://gist.github.com/wolfram77/09e64d6ba3ef080db5558feb2d32fdc0
Communities can be of the following **types**:
- Disjoint
- Overlapping
- Hierarchical
- Local.
The following **static** community detection **methods** exist:
- Spectral-based
- Statistical inference
- Optimization
- Dynamics-based
The following **dynamic** community detection **methods** exist:
- Independent community detection and matching
- Dependent community detection (evolutionary)
- Simultaneous community detection on all snapshots
- Dynamic community detection on temporal networks
**Applications** of community detection include:
- Criminal identification
- Fraud detection
- Criminal activities detection
- Bot detection
- Dynamics of epidemic spreading (dynamic)
- Cancer/tumor detection
- Tissue/organ detection
- Evolution of influence (dynamic)
- Astroturfing
- Customer segmentation
- Recommendation systems
- Social network analysis (both)
- Network summarization
- Privacy, group segmentation
- Link prediction (both)
- Community evolution prediction (dynamic, hot field)
<br>
<br>
## References
- [Application Areas of Community Detection: A Review : PAPER](https://ieeexplore.ieee.org/document/8625349)
This paper discusses a GPU implementation of the Louvain community detection algorithm. Louvain algorithm obtains hierarchical communities as a dendrogram through modularity optimization. Given an undirected weighted graph, all vertices are first considered to be their own communities. In the first phase, each vertex greedily decides to move to the community of one of its neighbours which gives greatest increase in modularity. If moving to no neighbour's community leads to an increase in modularity, the vertex chooses to stay with its own community. This is done sequentially for all the vertices. If the total change in modularity is more than a certain threshold, this phase is repeated. Once this local moving phase is complete, all vertices have formed their first hierarchy of communities. The next phase is called the aggregation phase, where all the vertices belonging to a community are collapsed into a single super-vertex, such that edges between communities are represented as edges between respective super-vertices (edge weights are combined), and edges within each community are represented as self-loops in respective super-vertices (again, edge weights are combined). Together, the local moving and the aggregation phases constitute a stage. This super-vertex graph is then used as input for the next stage. This process continues until the increase in modularity is below a certain threshold. As a result from each stage, we have a hierarchy of community memberships for each vertex as a dendrogram.
Approaches to perform the Louvain algorithm can be divided into coarse-grained and fine-grained. Coarse-grained approaches process a set of vertices in parallel, while fine-grained approaches process all vertices in parallel. A coarse-grained hybrid-GPU algorithm using multiple GPUs has been implemented by Cheong et al. which grabbed my attention. In addition, their algorithm does not use hashing for the local moving phase, but instead sorts each neighbour list based on the community id of each vertex.
https://gist.github.com/wolfram77/7e72c9b8c18c18ab908ae76262099329
Survey for extra-child-process package : NOTESSubhajit Sahu
Useful additions to inbuilt child_process module.
📦 Node.js, 📜 Files, 📰 Docs.
Please see attached PDF for literature survey.
https://gist.github.com/wolfram77/d936da570d7bf73f95d1513d4368573e
Dynamic Batch Parallel Algorithms for Updating PageRank : POSTERSubhajit Sahu
For the PhD forum an abstract submission is required by 10th May, and poster by 15th May. The event is on 30th May.
https://gist.github.com/wolfram77/692d263f463fd49be6eb5aa65dd4d0f9
Abstract for IPDPS 2022 PhD Forum on Dynamic Batch Parallel Algorithms for Up...Subhajit Sahu
For the PhD forum an abstract submission is required by 10th May, and poster by 15th May. The event is on 30th May.
https://gist.github.com/wolfram77/1c1f730d20b51e0d2c6d477fd3713024
Fast Incremental Community Detection on Dynamic Graphs : NOTESSubhajit Sahu
In this paper, the authors describe two approaches for dynamic community detection using the CNM algorithm. CNM is a hierarchical, agglomerative algorithm that greedily maximizes modularity. They define two approaches: BasicDyn and FastDyn. BasicDyn backtracks merges of communities until each marked (changed) vertex is its own singleton community. FastDyn undoes a merge only if the quality of merge, as measured by the induced change in modularity, has significantly decreased compared to when the merge initially took place. FastDyn also allows more than two vertices to contract together if in the previous time step these vertices eventually ended up contracted in the same community. In the static case, merging several vertices together in one contraction phase could lead to deteriorating results. FastDyn is able to do this, however, because it uses information from the merges of the previous time step. Intuitively, merges that previously occurred are more likely to be acceptable later.
https://gist.github.com/wolfram77/1856b108334cc822cdddfdfa7334792a
Introduction:
RNA interference (RNAi) or Post-Transcriptional Gene Silencing (PTGS) is an important biological process for modulating eukaryotic gene expression.
It is a highly conserved process of post-transcriptional gene silencing by which double-stranded RNA (dsRNA) causes sequence-specific degradation of mRNA sequences.
dsRNA-induced gene silencing (RNAi) is reported in a wide range of eukaryotes, including worms, insects, mammals and plants.
This process mediates resistance to both endogenous parasitic and exogenous pathogenic nucleic acids, and regulates the expression of protein-coding genes.
What are small ncRNAs?
micro RNA (miRNA)
short interfering RNA (siRNA)
Properties of small non-coding RNA:
Involved in silencing mRNA transcripts.
Called “small” because they are usually only about 21-24 nucleotides long.
Synthesized by first cutting up longer precursor sequences (like the 61nt one that Lee discovered).
Silence an mRNA by base pairing with some sequence on the mRNA.
Discovery of siRNA?
The first small RNA:
In 1993 Rosalind Lee (Victor Ambros lab) was studying a non-coding gene in C. elegans, lin-4, that was involved in silencing of another gene, lin-14, at the appropriate time in the development of the worm.
Two small transcripts of lin-4 (22nt and 61nt) were found to be complementary to a sequence in the 3' UTR of lin-14.
Because lin-4 encoded no protein, she deduced that it must be these transcripts that are causing the silencing by RNA-RNA interactions.
Types of RNAi (non-coding RNA)
miRNA
Length: 23-25 nt
Trans-acting
Binds to target mRNA with mismatches
Translation inhibition
siRNA
Length: 21 nt
Cis-acting
Binds to target mRNA with perfect complementarity
Piwi-RNA
Length: 25 to 36 nt
Expressed in germ cells
Regulates transposon activity
MECHANISM OF RNAI:
First the double-stranded RNA teams up with a protein complex named Dicer, which cuts the long RNA into short pieces.
Then another protein complex called RISC (RNA-induced silencing complex) discards one of the two RNA strands.
The RISC-docked, single-stranded RNA then pairs with the homologous mRNA and destroys it.
THE RISC COMPLEX:
RISC is large(>500kD) RNA multi- protein Binding complex which triggers MRNA degradation in response to MRNA
Unwinding of double stranded Si RNA by ATP independent Helicase
Active component of RISC is Ago proteins( ENDONUCLEASE) which cleave target MRNA.
DICER: endonuclease (RNase Family III)
Argonaute: Central Component of the RNA-Induced Silencing Complex (RISC)
One strand of the dsRNA produced by Dicer is retained in the RISC complex in association with Argonaute
ARGONAUTE PROTEIN :
1. PAZ (PIWI/Argonaute/Zwille): recognition of target mRNA.
2. PIWI (P-element induced wimpy testis): breaks the phosphodiester bond of mRNA (RNase H activity).
MiRNA:
The Double-stranded RNAs are naturally produced in eukaryotic cells during development, and they have a key role in regulating gene expression .
Multi-source connectivity as the driver of solar wind variability in the heli...Sérgio Sacani
The ambient solar wind that fills the heliosphere originates from multiple
sources in the solar corona and is highly structured. It is often described
as high-speed, relatively homogeneous, plasma streams from coronal
holes and slow-speed, highly variable, streams whose source regions are
under debate. A key goal of ESA/NASA’s Solar Orbiter mission is to identify
solar wind sources and understand what drives the complexity seen in the
heliosphere. By combining magnetic field modelling and spectroscopic
techniques with high-resolution observations and measurements, we show
that the solar wind variability detected in situ by Solar Orbiter in March
2022 is driven by spatio-temporal changes in the magnetic connectivity to
multiple sources in the solar atmosphere. The magnetic field footpoints
connected to the spacecraft moved from the boundaries of a coronal hole
to one active region (12961) and then across to another region (12957). This
is reflected in the in situ measurements, which show the transition from fast
to highly Alfvénic then to slow solar wind that is disrupted by the arrival of
a coronal mass ejection. Our results describe solar wind variability at 0.5 au
but are applicable to near-Earth observatories.
Richard's entangled aventures in wonderlandRichard Gill
Since the loophole-free Bell experiments of 2020 and the Nobel prizes in physics of 2022, critics of Bell's work have retreated to the fortress of super-determinism. Now, super-determinism is a derogatory word - it just means "determinism". Palmer, Hance and Hossenfelder argue that quantum mechanics and determinism are not incompatible, using a sophisticated mathematical construction based on a subtle thinning of allowed states and measurements in quantum mechanics, such that what is left appears to make Bell's argument fail, without altering the empirical predictions of quantum mechanics. I think however that it is a smoke screen, and the slogan "lost in math" comes to my mind. I will discuss some other recent disproofs of Bell's theorem using the language of causality based on causal graphs. Causal thinking is also central to law and justice. I will mention surprising connections to my work on serial killer nurse cases, in particular the Dutch case of Lucia de Berk and the current UK case of Lucy Letby.
(May 29th, 2024) Advancements in Intravital Microscopy- Insights for Preclini...Scintica Instrumentation
Intravital microscopy (IVM) is a powerful tool utilized to study cellular behavior over time and space in vivo. Much of our understanding of cell biology has been accomplished using various in vitro and ex vivo methods; however, these studies do not necessarily reflect the natural dynamics of biological processes. Unlike traditional cell culture or fixed tissue imaging, IVM allows for ultra-fast, high-resolution imaging of cellular processes over time and space as they occur in their natural environment. Real-time visualization of biological processes in the context of an intact organism helps maintain physiological relevance and provides insights into the progression of disease, response to treatments or developmental processes.
In this webinar we give an overview of advanced applications of the IVM system in preclinical research. IVIM technology is a provider of all-in-one intravital microscopy systems and solutions optimized for in vivo imaging of live animal models at sub-micron resolution. The system’s unique features and user-friendly software enable researchers to probe fast dynamic biological processes such as immune cell tracking, cell-cell interaction as well as vascularization and tumor metastasis with exceptional detail. This webinar will also give an overview of IVM being utilized in drug development, offering a view into the intricate interaction between drugs/nanoparticles and tissues in vivo and allowing for the evaluation of therapeutic intervention in a variety of tissues and organs. This interdisciplinary collaboration continues to drive the advancements of novel therapeutic strategies.
Distributed Coloring with Õ(√log n) Bits ∗
Kishore Kothapalli
Department of Computer Science
Johns Hopkins University
3400 N. Charles Street
Baltimore, MD 21218, USA
kishore@cs.jhu.edu
Christian Scheideler †
Department of Computer Science
Johns Hopkins University
3400 N. Charles Street
Baltimore, MD 21218, USA
scheideler@cs.jhu.edu
Melih Onus
Department of Computer Science
Arizona State University
Tempe, AZ 85287, USA
melih@asu.edu
Christian Schindelhauer
Heinz Nixdorf Institute and
Computer Science Department
University of Paderborn
33102 Paderborn, Germany
schindel@uni-paderborn.de
Abstract
We consider the well-known vertex coloring problem: given a graph G, find a coloring of its vertices
so that no two neighbors in G have the same color. It is trivial to see that every graph of maximum
degree ∆ can be colored with ∆ + 1 colors, and distributed algorithms that find a (∆ + 1)-coloring
in a logarithmic number of communication rounds, with high probability, have been known for more than a
decade. This is in general the best possible if only a constant number of bits can be sent along every edge
in each round. In fact, we show that for the n-node cycle the bit complexity of the coloring problem is
Ω(log n). More precisely, if only one bit can be sent along each edge in a round, then every distributed
coloring algorithm (i.e., algorithms in which every node has the same initial state and initially only
knows its own edges) needs at least Ω(log n) rounds, with high probability, to color the n–node cycle,
for any finite number of colors. But what if the edges have orientations, i.e., the endpoints of an edge
agree on its orientation (while bits may still flow in both directions)? Edge orientations naturally occur in
dynamic networks where new nodes establish connections to old nodes. Does this allow one to provide
faster coloring algorithms?
Interestingly, for the n–node cycle in which all edges have the same orientation, we show that a
simple randomized algorithm can achieve a 3-coloring with only O(√log n) rounds of bit transmissions,
with high probability (w.h.p.). This result is tight because we also show that the bit complexity of
coloring an n–node oriented cycle is Ω(√log n), with high probability, no matter how many colors are
allowed. The 3-coloring algorithm can be easily extended to provide a (∆ + 1)-coloring for all graphs
of maximum degree ∆ in O(√log n) rounds of bit transmissions, w.h.p., if ∆ is a constant, the edges
are oriented, and the graph does not contain an oriented cycle of length less than √log n. Using more
complex algorithms, we show how to obtain an O(∆)-coloring for arbitrary oriented graphs of maximum
degree ∆ using essentially O(log ∆ + √log n) rounds of bit transmissions, w.h.p., provided that the graph
does not contain an oriented cycle of length less than √log n.
∗ A preliminary version of the results contained in this paper appeared in the IEEE International Parallel and Distributed Processing Symposium (IPDPS), 2006 [18].
† Supported by NSF grants CCR-0311121 and CCR-0311795.
1 Introduction
A fundamental problem in distributed systems is to compute a proper vertex coloring. The importance of
vertex coloring can be seen by observing that many distributed algorithms use such a coloring as a subroutine
in higher-order communication and computation tasks. Examples include scheduling [20], resource
allocation [3], and synchronization. Vertex coloring also has applications in wireless networks to determine
cluster heads (see, for example, [17] and the references therein), routing in wireless networks [20], and in
many parallel algorithms [15, 16]. Thus, it is not surprising that this problem has been heavily studied not
only in the distributed setting but also in the PRAM model of computation starting with Karp and Wigderson
[16] and Luby [22].
We consider distributed systems that can be modeled as a graph G = (V, E) with nodes representing the
processors and the edges representing the communication links. Given a graph G = (V, E) with maximum
degree ∆, the vertex coloring problem is to find a color assignment for the vertices of G so that no two
adjacent vertices are given the same color. The minimum number of colors required to properly color a
graph is called its chromatic number, and is denoted by χ(G). While it is easy to see that a graph with
maximum degree ∆ can be colored using at most ∆ + 1 colors, computing the chromatic number of a
graph is NP–hard [9]. Further, χ(G) cannot be approximated to any reasonable bound in general [6]. Thus,
efficient algorithms that color using ∆ + 1 colors are of interest.
In the distributed model of computing, communication is an expensive resource and distributed algorithms
therefore aim at using as little communication as possible. Distributed algorithms for vertex coloring
take the approach of minimizing the number of communication rounds assuming that in each round a reasonable
number of bits can be communicated. Deterministic distributed algorithms for (∆ + 1)-coloring
that run in a polylogarithmic number of rounds are not known. The best known deterministic algorithm [27]
requires n^{O(1/√log n)} rounds where n is the number of vertices. However, randomization can improve the
runtime exponentially and in some special cases, such as highly dense graphs, even double exponentially
[12]. Randomized distributed algorithms that compute a (∆ + 1)–coloring in O(log n) rounds, with high
probability¹, have been known for more than a decade [23, 14]. In this work we show that, interestingly, if the
underlying graph G is provided with an orientation on its edges such that the orientation does not induce
oriented cycles of length at most √log n, then a vertex coloring with (1 + ε)∆ colors, for a constant ε > 0,
can be obtained by exchanging essentially O(log ∆ + √log n) bits, with high probability. Thus, we show
that having orientations on the edges significantly improves the performance of distributed vertex coloring
algorithms. We refer the reader to Section 1.3 for precise statements regarding our results.
We also note that providing an orientation is not cumbersome. If the nodes have unique labels that are
taken from a set with a total order then the labels induce a natural acyclic orientation on the edges. Edge
e = (u, v) is oriented from u → v if label(u) > label(v) and vice-versa. Another natural orientation can be
provided as follows, for example in sensor networks. Information gathering is an important communication
primitive for sensor networks where all the packets have to be forwarded to a single common destination
called the observer [13, 19]. Many protocols for information gathering in sensor networks [13, 19] assume
that the direction to the observer is available. In such a scenario, an orientation for the edges can be provided
according to the distance of the endpoints to the observer. Ties between nodes with equal distance to the
observer can be broken arbitrarily and the resulting orientation will be free of cycles.
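The label-based orientation described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function names `orient_by_label` and `is_acyclic` and the dictionary-based graph representation are hypothetical and do not come from the paper. The second function checks, via a standard topological-order argument, that the resulting orientation has no directed cycle.

```python
from typing import Dict, Hashable, List, Tuple

Edge = Tuple[Hashable, Hashable]


def orient_by_label(edges: List[Edge], label: Dict[Hashable, int]) -> List[Edge]:
    """Orient each undirected edge {u, v} as u -> v when label(u) > label(v)."""
    oriented = []
    for u, v in edges:
        if label[u] == label[v]:
            raise ValueError("labels must be unique")
        oriented.append((u, v) if label[u] > label[v] else (v, u))
    return oriented


def is_acyclic(oriented: List[Edge]) -> bool:
    """Return True if the oriented edges contain no directed cycle (Kahn's algorithm)."""
    nodes = {x for e in oriented for x in e}
    indeg = {x: 0 for x in nodes}
    out = {x: [] for x in nodes}
    for u, v in oriented:
        out[u].append(v)
        indeg[v] += 1
    ready = [x for x in nodes if indeg[x] == 0]
    removed = 0
    while ready:
        u = ready.pop()
        removed += 1
        for v in out[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                ready.append(v)
    # Every node is removed exactly when no directed cycle exists.
    return removed == len(nodes)


triangle = [(0, 1), (1, 2), (2, 0)]
oriented_triangle = orient_by_label(triangle, {0: 10, 1: 20, 2: 30})
```

Because every edge points from the larger label to the smaller one, labels strictly decrease along any directed path, which is why no directed cycle can appear.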
1.1 Model and Definitions
We model the distributed system as a graph G = (V, E) with V representing the set of computing entities,
or processors, and E ⊆ V × V representing all the available communication links. We assume that all the
communication links are undirected and hence bidirectional. All the processors start at the same time and
time proceeds in synchronized rounds. We let n = |V|. The degree of node u is denoted d_u and by ∆ we
denote the maximum degree of G, i.e., ∆ = max_{u∈V} d_u. When there is no confusion, d_u will also be used
to refer to the number of uncolored neighbors of node u. By N_u we denote the set of neighbors of node u
and when there is no confusion, we use N_u to refer to the set of uncolored neighbors of u. We do not require
that the nodes in V have unique labels of any kind. For our algorithms to work, it is enough that each node
knows a constant factor estimate of the logarithm of the size of the network apart from its own degree and
neighbors. When we consider graphs of constant degree, no global knowledge is required for our algorithm
and it suffices that each node knows its own degree.
¹ With high probability (w.h.p.) means a probability that is at least 1 − 1/n^c for c ≥ 1.
[Figure 1: three panels (a), (b), (c), each showing two adjacent nodes v and w.]
Figure 1: Orientation helps in symmetry breaking. In Figure (a) both v and w choose the same color. In
(b), for existing algorithms both remain uncolored whereas in (c), when using orientation, node v may get
colored.
Given a graph G = (V, E), a vertex coloring is a mapping c : V → [C] such that if {u, v} ∈ E then
c(u) ≠ c(v), i.e., no two adjacent vertices receive the same color. Here C denotes the number of colors used
in the coloring. We say that a coloring is a local coloring if every node u with degree du has a color in [ du]
when the coloring uses ∆ colors. The interest in local coloring arises from the fact that a local coloring has
nice implications when using the coloring in scheduling and routing problems [20].
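The definition of a proper vertex coloring can be made concrete with a small checker. The names `is_proper_coloring` and `colors_used` below are hypothetical helpers introduced only for illustration; the coloring is assumed to be a dictionary from nodes to integer colors.

```python
from typing import Dict, Hashable, List, Tuple


def is_proper_coloring(edges: List[Tuple[Hashable, Hashable]],
                       coloring: Dict[Hashable, int]) -> bool:
    """Return True if no edge {u, v} has c(u) == c(v)."""
    return all(coloring[u] != coloring[v] for u, v in edges)


def colors_used(coloring: Dict[Hashable, int]) -> int:
    """Number of distinct colors appearing in the coloring."""
    return len(set(coloring.values()))


triangle = [(0, 1), (1, 2), (2, 0)]
good = {0: 0, 1: 1, 2: 2}   # three distinct colors on a triangle
bad = {0: 0, 1: 1, 2: 0}    # nodes 0 and 2 are adjacent and share color 0
```

A triangle has maximum degree 2, so the proper coloring above uses exactly ∆ + 1 = 3 colors.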
In our model, the measure of efficiency is the number of bits exchanged. We also refer to this as the
bit complexity. We view each round of the algorithm as consisting of 1 or more bit rounds. In each bit
round each node can send/receive at most 1 bit from each of its neighbors. We assume that the rounds
of the algorithm are synchronized. The bit complexity of algorithm A is then defined as the number of bit
rounds required by algorithm A. We note that, since the nodes are synchronized, each round of the algorithm
requires as many bit rounds as the maximum number of bit rounds needed by any node in this round. In our
model, we do not count local computation performed by the nodes. This is reasonable as in our algorithms
nodes perform only simple local computation.
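As a small illustration of this accounting, the sketch below sums, over the rounds of an algorithm, the largest number of bit rounds any node needs in that round. The function name `bit_complexity` and the input layout (one dictionary per round, mapping each node to its bit-round count) are assumptions made for this example only.

```python
from typing import Dict, Hashable, List


def bit_complexity(bit_rounds_per_round: List[Dict[Hashable, int]]) -> int:
    """Total bit rounds: each round costs the maximum over all nodes."""
    return sum(max(per_node.values()) for per_node in bit_rounds_per_round)


# Two rounds on nodes a and b: round 1 costs max(1, 3) = 3, round 2 costs max(2, 2) = 2.
example = [{"a": 1, "b": 3}, {"a": 2, "b": 2}]
total = bit_complexity(example)
```

Since nodes are synchronized, a node that finishes its bit exchange early still waits for the slowest node in the same round, which is what the maximum models.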
In our model, we assume that the edges in E have an orientation associated with them. That is, for any
two neighbors v, w exactly one of the following holds for the edge {v, w}: it is oriented either as v → w
or as w → v. In the former case we also call v superior to w, and vice versa in the latter. Having orientation
on the edges is a property that has not been studied in the context of vertex coloring though it is a natural
property since networks usually evolve and for every connection there is usually a node that initiated it. We
show that algorithms for symmetry breaking can be greatly improved provided that the underlying graph is
oriented. The exact way in which orientation is used for symmetry breaking is explained in Figure 1. As
shown, if nodes v and w choose the same color during any round of the algorithm, in the existing algorithms,
both nodes remain uncolored as in Figure 1(b) and have to try in a later round. With orientation, if the edge
{v, w} is oriented as v → w as shown in Figure 1(c), then node v can retain its choice provided that there is
no edge {u, v} oriented u → v and u also chooses the same color.
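The tie-breaking rule of Figure 1(c) can be sketched as a single trial round. This is a minimal illustration, not the algorithm analyzed in the paper: the function name `trial_round`, the uniform color choice, and the data layout are all assumptions. An uncolored node keeps its proposed color unless an already colored neighbor holds that color or a superior neighbor proposed the same color in this round.

```python
import random
from typing import Dict, Hashable, List, Optional, Tuple


def trial_round(nodes: List[Hashable],
                oriented_edges: List[Tuple[Hashable, Hashable]],
                colors: Dict[Hashable, Optional[int]],
                palette_size: int,
                rng: random.Random) -> Dict[Hashable, Optional[int]]:
    """One color-trial round; an edge (u, v) means u -> v, i.e. u is superior to v."""
    superiors = {v: [] for v in nodes}
    neighbours = {v: [] for v in nodes}
    for u, v in oriented_edges:
        superiors[v].append(u)
        neighbours[u].append(v)
        neighbours[v].append(u)
    # Every uncolored node proposes a color uniformly at random (an assumption).
    proposal = {v: rng.randrange(palette_size) for v in nodes if colors[v] is None}
    new_colors = dict(colors)
    for v, c in proposal.items():
        taken = any(colors[w] == c for w in neighbours[v])
        beaten = any(proposal.get(u) == c for u in superiors[v])
        if not taken and not beaten:
            new_colors[v] = c
    return new_colors


# Usage on a 6-node cycle whose edges all point the same way, with 3 colors.
nodes = list(range(6))
cycle = [(i, (i + 1) % 6) for i in nodes]
rng = random.Random(1)
colors = {v: None for v in nodes}
for _ in range(200):
    colors = trial_round(nodes, cycle, colors, 3, rng)
```

If two adjacent nodes propose the same color, the inferior endpoint always backs off, so the partial coloring stays proper after every round. The superior endpoint keeps its choice only if none of its own superiors conflicts with it.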
Even though the graph is equipped with an orientation on its edges, we still allow bits to flow in both
directions on any edge. Thus we consider undirected graphs with edge orientations.
One parameter that will be important for our investigations is the length of the shortest cycle in the
orientation. We formalize this notion in the following definition.
Definition 1.1 (ℓ–acyclic Orientation) An orientation of the edges of a graph is said to be ℓ–acyclic if the
minimum length of any directed cycle induced by the orientation is at least ℓ. Note that this is not the girth
of the given graph.
We always assume that the input graph is provided with a √log n–acyclic orientation.
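To make the acyclicity condition concrete, the following sketch computes the length of the shortest directed cycle induced by an orientation and compares it against a threshold. The function names and the breadth-first-search approach are illustrative assumptions, not part of the paper; the threshold would be √log n in the setting considered here.

```python
from collections import deque
from typing import Hashable, List, Optional, Tuple


def shortest_directed_cycle(nodes: List[Hashable],
                            oriented_edges: List[Tuple[Hashable, Hashable]]) -> Optional[int]:
    """Length of the shortest directed cycle, or None if the orientation is acyclic."""
    out = {v: [] for v in nodes}
    for u, v in oriented_edges:
        out[u].append(v)
    best = None
    for s in nodes:
        # BFS from s; the first edge returning to s closes a shortest cycle through s.
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in out[u]:
                if v == s:
                    length = dist[u] + 1
                    if best is None or length < best:
                        best = length
                elif v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
    return best


def satisfies_min_cycle_length(nodes, oriented_edges, threshold: float) -> bool:
    """True if every directed cycle has length at least `threshold`."""
    shortest = shortest_directed_cycle(nodes, oriented_edges)
    return shortest is None or shortest >= threshold


ring = [(i, (i + 1) % 5) for i in range(5)]   # directed 5-cycle
dag = [(0, 1), (1, 2), (0, 2)]                # triangle with an acyclic orientation
```

The second example shows why the condition differs from girth: the underlying undirected triangle has girth 3, yet its orientation contains no directed cycle at all.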
1.2 Related Work
The problem of vertex coloring in distributed systems has a long and rich history. It is an open problem
whether deterministic poly-logarithmic time distributed algorithms exist for the problem of (∆ + 1)-vertex
coloring [27]. The best known deterministic algorithm to date is presented in [27] and requires n^{O(1/√log n)}
rounds. Following considerations known from the radio broadcasting model [1] the problem cannot be
solved at all in a deterministic round model without the use of unique identification numbers. Hence, most
of the algorithms presented are randomized algorithms.
Karp and Wigderson [16] have shown that an MIS can be found in O(log³ n) rounds w.h.p. and Luby
[22] presents algorithms to find an MIS in arbitrary graphs in O(log n) rounds with high probability. Luby [23]
and Johansson [14] present parallel algorithms that can be interpreted as distributed algorithms that provide
a (∆ + 1)–coloring of a graph G in O(log n) rounds, with high probability. In the algorithm presented in
[23], in every round each node that is not yet colored has a probability of choosing a color which is set to
1/2. Luby’s algorithm requires only pairwise independence and a derandomization was also shown in [23]
for the PRAM (Parallel Random Access Machine) model of computation. Without such wakeup Johansson
[14] presents an algorithm for ∆ + 1 distributed coloring. Recent empirical studies [7] have shown that the
constant factors involved are small and also that a wakeup probability of 1 as in the algorithm of [14] reduces
the number of rounds required. However, the analytical reason for this behavior is not known. Algorithms
for vertex coloring are also presented in [10, 15] in the PRAM model of computation.
All of the above cited algorithms can be implemented as distributed algorithms in the message passing
model and run in poly-logarithmic rounds with bit complexity O(log n log ∆), with high probability. Cole
and Vishkin [4] and Goldberg et al. [10] have shown that a (∆ + 1)–coloring of the cycle graph on
n nodes can be achieved in O(log* n) communication rounds. This was shown to be optimal in Linial
[21] by establishing that 3-coloring an n–node cycle graph cannot be achieved in less than (log* n − 1)/2
rounds. When unlimited local computation is available Linial [21] shows how to obtain an O(∆²) coloring
in O(log* n) rounds. This was later improved by De Marco and Pelc [24] to show that an O(∆) coloring
can be achieved in O(log*(n/∆)) rounds.
In a related work, Grable and Panconesi [12] present a distributed algorithm in the message passing
model for edge coloring that runs in O((1 + α^{−1}) log log n) rounds provided that the degree of any node in
the graph is Ω(n^{α/log log n}) for any α > 0. Our analysis of Phase I for arbitrary graphs follows the analysis
of Phase II in [12].
Distributed algorithms with the underlying graph equipped with sense of direction have been studied in
[28, 8]. Sense of direction is a similar notion to that of orientation on edges. Singh [28] shows that leader
election in an n-node complete graph equipped with sense of direction can be performed in a distributed
setting via exchange of O(n) messages. In [8], the authors show that having sense of direction reduces the
communication complexity of several distributed graph algorithms such as leader election, spanning tree
construction, and depth-first traversal.
1.3 Our Results
We start by investigating the bit complexity of distributed vertex coloring algorithms. We first show that
the bit complexity of the coloring problem is Ω(log n) for a non-oriented n-node cycle graph. That is, any
distributed algorithm in which all the nodes start in the same state and know only about n and ∆ apart from
their neighbors needs Ω(log n) rounds with high probability to arrive at a proper coloring using any finite
number of colors. We then show that when the edges in the cycle graph are provided with an orientation,
then the bit complexity of distributed vertex coloring algorithms is Ω(
√
log n), with high probability, when
3
5. using any finite number of colors. This leads us to the question whether matching upper bounds can be
shown for coloring oriented graphs.
We start with the case of constant degree √(log n)–acyclic oriented graphs and present an algorithm to obtain a (∆ + 1)–coloring with a bit complexity of O(√(log n)), with high probability. Thus, we show the following theorem.
Theorem 1.2 Given a √(log n)–acyclic oriented graph G = (V, E) of maximum degree ∆, if ∆ is a constant, a (∆ + 1)–vertex coloring of G can be obtained in O(√(log n)) bit rounds, with high probability.
The above theorem directly implies that oriented cycle graphs can be 3–colored in O(√(log n)) bit rounds. Additionally, for the case of constant degree graphs we can also arrive at a local coloring where the color of every node u is in [du + 1]. In our algorithm for constant degree oriented graphs, it suffices that nodes know only their local degree.
We then extend our algorithm and analysis to the case of arbitrary √(log n)–acyclic oriented graphs with maximum degree ∆. Our main result is a distributed (1 + ε)∆–coloring algorithm for arbitrary √(log n)–acyclic oriented graphs of maximum degree ∆. Our algorithm has a bit complexity of O(log ∆) + Õ(√(log n)). By g(n) = Õ(f(n)) we mean g(n) = O(f(n) polylog(f(n))). Specifically, we prove the following theorem.
Theorem 1.3 Given a √(log n)–acyclic oriented graph G = (V, E) of maximum degree ∆, a (1 + ε)∆–vertex coloring of G, for any constant ε > 0, can be obtained in O(log ∆) + Õ(√(log n)) bit rounds, with high probability.
By further tightening the analysis, we show that the bit complexity can be reduced to O(log ∆ + √(log n log log n)), with high probability, for √(log n)–acyclic oriented graphs with ∆ ≥ log n.
For the case of arbitrary √(log n)–acyclic oriented graphs, our algorithm and analysis can be modified easily to get a local coloring such that every node u gets a color in [(1 + ε)du].
1.4 Summary of our approach
We now provide a brief summary of our basic approach. Our approach has the same flavor as existing distributed vertex coloring algorithms [23, 14]. Given any √(log n)–acyclic oriented graph G = (V, E) of constant degree ∆, the algorithm for (∆ + 1)–coloring proceeds as follows. Communication proceeds in rounds, and in each round each yet uncolored node v chooses a color cv among the available colors in [∆ + 1] independently and uniformly at random. Node v then communicates this color choice to all of its uncolored neighbors. If a node chooses a color that is in conflict with any of the choices of its neighbors, the conflict resolution rule specifies the course of action. In the algorithm of Luby [23], Johansson [14], and most other works, the conflict resolution rule is that uncolored nodes in conflict remain uncolored and have to try again in subsequent rounds. The conflict resolution rule we use is based on the orientation on the edges as explained in Section 1.1. Our algorithm is thus similar to the existing distributed vertex coloring algorithms [23, 14] except for the conflict resolution rule.
In our analysis, after O(√(log n)) rounds we arrive at the situation where connected components of uncolored nodes only have simple oriented paths of length less than √(log n), with high probability. Coupled with the √(log n)–acyclic orientation, it can be shown that the nodes in each such connected component can be organized into less than √(log n) layers. The layering has the property that all the oriented edges are from a node in a lower-numbered layer to a node in a higher-numbered layer. This property of the layering guarantees a successful coloring of all remaining uncolored nodes in less than √(log n) rounds. This gives us the result for constant degree oriented graphs (Theorem 1.2). To arrive at the bit complexity for arbitrary graphs (Theorem 1.3), we need a few additional tricks, as our analysis shows.
1.5 Organization of the Paper
The rest of the paper is organized as follows. In Section 2 we establish the lower bound results. In Section 3, we present and analyze our algorithm for (∆ + 1)–coloring constant degree oriented graphs. This will serve as a base for the (1 + ε)∆–coloring algorithm for arbitrary oriented graphs of maximum degree ∆, for any constant ε > 0, in Section 4.
2 Lower Bounds
In this section we establish lower bounds on the bit complexity of finding a proper vertex coloring. Recall
that a Las Vegas algorithm is a randomized algorithm that always produces a correct result, with the only
variation being its runtime. First, we prove a lower bound for non-oriented graphs, and then we prove a
lower bound for oriented graphs. Notice that both bounds hold for any finite number of colors.
Theorem 2.1 For every Las Vegas algorithm A there is an infinite family of non-oriented graphs G s.t. A
has a bit complexity of at least Ω(log n) on G, with high probability, to compute a proper vertex coloring.
Proof. Consider the cycle of n nodes, and let S = (u_ℓ, . . . , u_1, v_1, . . . , v_ℓ) be the set of nodes along a path of length 2ℓ of the cycle. Initially, every node in S is in the same state s_0, with the only difference that for every i ∈ {1, . . . , ℓ − 1}, u_i considers its left connection to go to u_{i+1} whereas v_i considers its left connection to go to v_{i+1}. (Notice that the cycle is non-oriented, so we can choose any orientation we want for the individual nodes.) Associated with s_0 is a fixed probability distribution P_ε = (p^ε_x)_{x∈{−,0,1}} for sending bit x along the right edge, where “−” represents the case that no bit is sent and ε represents the empty history. Since P_ε has only three probability values, there must be an x_0 with p^ε_{x_0} ≥ 1/3. Let E_1 be the event that nodes u_1 and v_1 choose that option. Then u_1 and v_1 receive the same information from their right neighbor. Let P_y = (p^y_x)_{x∈{−,0,1}} be the probability distribution for sending bit x along the right edge in the second round given that bit y was received from the right edge in the first round. Then P_{x_0} applies to u_1 and v_1. Since P_{x_0} has only three probability values, there must be an x_1 with p^{x_0}_{x_1} ≥ 1/3. Let E_2 be the event that nodes u_1 and v_1 choose that option. Then u_1 and v_1 again receive the same information from their right neighbor.
Continuing with this argument, it follows that there are events E_1, . . . , E_ℓ, with E_i having a probability of at least 1/3 for all i, so that u_1 and v_1 have received the same information from their right neighbors. Algorithm A cannot terminate in this case because the same probability distribution for choosing a color applies to u_1 and v_1, and hence the probability that u_1 and v_1 choose the same color is non-zero.
When choosing ℓ = log_3(n/(2 log² n)), the probability for the events E_1, . . . , E_ℓ to occur is at least
$$\left(\frac{1}{3}\right)^{\log_3(n/(2\log^2 n))} = \frac{2\log^2 n}{n}.$$
Moreover, notice that E_1, . . . , E_ℓ only depend on the nodes in S because information can only travel a distance of ℓ edges in ℓ rounds. Hence, we can partition the n-node cycle into n/(2ℓ) many sequences S where each sequence has a probability of at least (2 log² n)/n of running into the events E_1, . . . , E_ℓ that is independent of the other sequences. Thus, the probability that all node sequences can avoid the event sequence E_1, . . . , E_ℓ, which is necessary for A to terminate, is at most
$$\left(1 - \frac{2\log^2 n}{n}\right)^{n/(2\ell)} \le \frac{1}{n},$$
which implies that A needs Ω(log n) bit-rounds, with high probability, to finish.
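The final inequality of this proof can be checked numerically. The sketch below is only an illustration: the function name `avoid_probability_log` is hypothetical, and it assumes that "log" without a base means the base-2 logarithm, which the text does not state.

```python
import math


def avoid_probability_log(n: int) -> float:
    """Natural log of (1 - p)^(n / (2*ell)), the bound on the probability
    that every segment of the cycle avoids the events E_1, ..., E_ell.

    Assumption: an unspecified 'log' is read as log base 2.
    """
    log_sq = math.log2(n) ** 2
    p = 2 * log_sq / n                     # chance that one segment hits E_1..E_ell
    ell = math.log(n / (2 * log_sq), 3)    # ell = log_3(n / (2 log^2 n))
    segments = n / (2 * ell)               # disjoint segments of 2*ell nodes each
    return segments * math.log1p(-p)       # log of (1 - p)^segments


if __name__ == "__main__":
    for exponent in (10, 16, 20, 30):
        n = 2 ** exponent
        # The proof claims the bound is at most 1/n, i.e. its log is <= -ln n.
        print(exponent, avoid_probability_log(n) <= -math.log(n))
```

For very small n the quantity 2 log² n / n exceeds 1 and the expression is not meaningful, which matches the asymptotic nature of the statement.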
Theorem 2.2 For every Las Vegas algorithm A there is an infinite family of oriented graphs G s.t. A has a bit complexity of at least Ω(√(log n)) on G, with high probability, to compute a proper vertex coloring.
Proof. Consider the cycle of n nodes in which all the edges are oriented in the same direction. Let S_ℓ = (u_ℓ, . . . , u_1, v_1, . . . , v_ℓ) be the set of nodes along a path of length 2ℓ of the cycle. Initially, every node in S_ℓ is in the same state s_0. Associated with s_0 is a fixed probability distribution P_0 = (p^0_{x,y})_{x,y∈{−,0,1}} for sending bit x along the left edge and bit y along the right edge, where “−” represents the case that no bit is sent. Since P_0 has only nine probability values, there must be an x_0 and y_0 with p^0_{x_0,y_0} ≥ 1/9. Let E_1 be the event that all nodes in S_ℓ choose that option. Then all nodes in S_{ℓ−1} = (u_{ℓ−1}, . . . , u_1, v_1, . . . , v_{ℓ−1}) receive the same information and must therefore be in the same state s_1. Associated with s_1 is a fixed probability distribution P_1 = (p^1_{x,y})_{x,y∈{−,0,1}} for sending bit x along the left edge and bit y along the right edge. Since P_1 has only nine probability values, there must be an x_1 and y_1 with p^1_{x_1,y_1} ≥ 1/9. Let E_2 be the event that all nodes in S_{ℓ−1} choose that option. Then all nodes in S_{ℓ−2} receive the same information and must therefore be in the same state s_2.
Continuing with this argument, it follows that there are events E_1, . . . , E_ℓ, with E_i having a probability of at least (1/9)^(2(ℓ−i+1)) for all i, so that all nodes in S_{ℓ−i} are in the same state s_i. Since these nodes are neighbors, algorithm A cannot terminate within ℓ bit exchanges if E_1, . . . , E_ℓ are true, because whatever probability distribution A chooses on the colors, the probability that two neighboring nodes choose the same color is non-zero, which would violate the assumption that A is a Las Vegas algorithm.
The probability that E_1, . . . , E_ℓ are true is at least
$$\left(\frac{1}{9}\right)^{\sum_{i=1}^{\ell} 2(\ell-i+1)} \ge \left(\frac{1}{9}\right)^{\ell^2/2},$$
and when choosing ℓ = √(2 log_9(n/(2 log² n))), this results in a probability of at least (2 log² n)/n. Moreover, notice that E_1, . . . , E_ℓ only depend on the nodes in S_ℓ because information can only travel a distance of ℓ edges in ℓ rounds. Hence, we can partition the n-node cycle into n/(2ℓ) many sequences S_ℓ where each sequence has a probability of at least (2 log² n)/n of running into the events E_1, . . . , E_ℓ that is independent of the other sequences. Hence, the probability that all node sequences can avoid the event sequence E_1, . . . , E_ℓ, which is necessary for A to terminate, is at most
$$\left(1 - \frac{2\log^2 n}{n}\right)^{n/(2\ell)} \le \frac{1}{n},$$
which implies that A needs Ω(√(log n)) bit-rounds, with high probability, to finish.
Thus, oriented graphs appear to be easier to color than non-oriented graphs. In the next section we show
that this is indeed the case by providing a matching upper bound for constant-degree graphs.
3 Upper Bound for Constant Degree Oriented Graphs
In this section we present and analyze the algorithm for (∆ + 1)–coloring constant degree oriented graphs.
This demonstrates the efficacy of using orientation in vertex coloring algorithms. We defer the case of
arbitrary oriented graphs to Section 4 as it requires more complicated arguments than for constant degree
graphs.
The algorithm for vertex coloring constant degree oriented graphs is given in Figure 2. In the algorithm,
the parameter Cu refers to the number of colors used in the coloring by node u. Each node executes the
algorithm Color-Random until it gets colored.
We analyze algorithm Color-Random for constant degree oriented graphs with a √(log n)–acyclic orientation and show that algorithm Color-Random can be used to obtain a (∆ + 1)–coloring with a bit complexity of O(√(log n)). The reduction in the bit complexity from Ω(log n) (due to Theorem 2.1) to O(√(log n)) comes from the fact that once every simple oriented path of length √(log n) has at least one colored node, the √(log n)–acyclic orientation guarantees us connected components of uncolored nodes where each such component only has simple oriented paths of length less than √(log n). The √(log n)–acyclicity of the orientation allows us to finish in a further √(log n) rounds.
Theorem 3.1 Given a √(log n)–acyclic oriented graph G = (V, E) of maximum degree ∆, if ∆ is a constant, a (∆ + 1)–vertex coloring of G can be obtained in O(√(log n)) bit rounds, with high probability.
Proof. The analysis below cuts the time into two phases. Phase I ends once every simple oriented path of length ℓ = √(log n) has at least one colored node, and Phase II ends once all nodes are colored. We show that Phase I takes at most r = 4√(log n) rounds, with high probability. For Phase II, the proof uses the √(log n)–acyclic orientation to argue that a further √(log n) rounds suffice to color all nodes. For simplicity, we set Cu = 2∆ for every node u, but the analysis works, with minor modifications, for Cu = ∆ + 1, as long as ∆ is a constant.
Algorithm Color-Random(Cu)
While u is not colored do
1. Node u chooses a color cu from the available colors in [Cu] uniformly at random.
2. Node u communicates its choice cu, from step 1, to all of its uncolored neighbors that have a lower priority than u, i.e., to nodes v such that u → v.
3. If node u does not receive a message from any of its neighbors w with w → u and cw = cu, then node u gets colored with color cu. Otherwise node u remains uncolored.
4. If u is colored during step 3 of the current round, then u informs all of its uncolored neighbors about the color of u.
5. Node u updates the list of available colors according to colors taken up by u’s neighbors.
Figure 2: Coloring constant degree oriented graphs by random choices.
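The steps of Color-Random in Figure 2 can be sketched as a centralized, round-by-round simulation. This is only an illustration under stated assumptions, not a distributed implementation: the names `color_random`, `succ`, and `palette_size` are hypothetical, every node uses the same palette size, the palette is assumed to be larger than the maximum degree, and the simulation counts rounds rather than bits.

```python
import random


def color_random(succ, palette_size, rng, max_rounds=10_000):
    """Simulate Color-Random on an oriented graph.

    succ[u] lists the nodes v with u -> v (u has priority over v).
    Assumes palette_size exceeds the maximum degree, so no palette runs empty.
    Returns (colors, rounds_used).
    """
    nodes = list(succ)
    pred = {u: [] for u in nodes}
    for u in nodes:
        for v in succ[u]:
            pred[v].append(u)

    colors = {}
    available = {u: set(range(palette_size)) for u in nodes}
    rounds = 0
    while len(colors) < len(nodes) and rounds < max_rounds:
        rounds += 1
        uncolored = [u for u in nodes if u not in colors]
        # Step 1: each uncolored node picks a color from its remaining palette.
        choice = {u: rng.choice(sorted(available[u])) for u in uncolored}
        # Steps 2-3: u keeps its choice unless some uncolored w with w -> u
        # announced the same color (colored nodes make no choice this round).
        winners = [u for u in uncolored
                   if all(choice.get(w) != choice[u] for w in pred[u])]
        for u in winners:
            colors[u] = choice[u]
        # Steps 4-5: neighbors remove the colors that were just taken.
        for u in winners:
            for v in succ[u] + pred[u]:
                available[v].discard(colors[u])
    return colors, rounds


if __name__ == "__main__":
    n = 12
    cycle = {i: [(i + 1) % n] for i in range(n)}  # consistently oriented cycle
    coloring, used = color_random(cycle, 3, random.Random(1))
    print(coloring, used)
```

Only the conflict rule in steps 2 and 3 depends on the orientation: a node yields to its superiors and ignores conflicts with the nodes it has priority over.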
Consider any simple oriented path P of length ℓ. For any node u ∈ P with C'_u remaining colors and d'_u remaining uncolored neighbors, the probability that it chooses a color that is identical to the choice of any of its uncolored neighbors is at most
$$\sum_{j=1}^{d'_u} \frac{1}{C'_u} \le \frac{d'_u}{2\Delta - (d_u - d'_u)} \le \frac{1}{2},$$
as C'_u = 2∆ − (d_u − d'_u) and d'_u ≤ d_u.
For any i ≥ 1, denote by E_{P,i} the event that all nodes in P have a color conflict in round i. Since each node chooses the color independently and uniformly at random, and P is oriented, one can identify a distinct witness for each color conflict so as to obtain the upper bound
$$\Pr\Big[E_{P,i} \,\Big|\, \bigcap_{j=1}^{i-1} E_{P,j}\Big] \le (1/2)^{\ell}.$$
Denote by E_P the event that the event E_{P,i} occurs for r consecutive rounds. Then,
$$\Pr[E_P] = \Pr\Big[\bigcap_{i=1}^{r} E_{P,i}\Big] = \prod_{i=1}^{r} \Pr\Big[E_{P,i} \,\Big|\, \bigcap_{j=1}^{i-1} E_{P,j}\Big] \le (1/2)^{\ell r}.$$
Let E denote the event that for some simple oriented path P the event E_P occurs. The number of simple oriented paths of length ℓ is at most n∆^ℓ, by choosing the first vertex from n available choices and choosing each of the next ℓ vertices from the at most ∆ available choices. Thus,
$$\Pr[E] = \Pr\Big[\bigcup_{P} E_P\Big] \le n\Delta^{\ell}\,\Pr[E_P] \le 1/n^2$$
for the above value of r since ∆ = O(1). This completes Phase I of the analysis.
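The union bound that closes Phase I can be evaluated in log scale. The following sketch is an illustration only; the function name `phase_one_failure_log2` is hypothetical and logarithms are assumed to be base 2.

```python
import math


def phase_one_failure_log2(n: int, delta: int) -> float:
    """log2 of n * delta^ell * (1/2)^(ell * r),
    with ell = sqrt(log2 n) and r = 4 * ell (assumption: base-2 logs)."""
    ell = math.sqrt(math.log2(n))
    r = 4 * ell
    return math.log2(n) + ell * math.log2(delta) - ell * r


if __name__ == "__main__":
    for n, delta in ((2 ** 16, 3), (2 ** 64, 8)):
        # The claim Pr[E] <= 1/n^2 corresponds to a log2 value <= -2 log2 n.
        print(n, delta, phase_one_failure_log2(n, delta) <= -2 * math.log2(n))
```

The bound 1/n² needs ∆^ℓ ≤ n, that is log ∆ ≤ √(log n). This is where the assumption ∆ = O(1) and "n sufficiently large" enter: with ∆ = 8 and n = 16 the check fails.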
Consider connected components of uncolored nodes. At the end of Phase I, since any simple oriented path of length ℓ has at least one colored node, each such component only has simple oriented paths of length less than ℓ, with high probability. Moreover, the input graph does not have oriented cycles of length less than √(log n), which implies that each such component can be organized into less than √(log n) layers with oriented edges going only from a node in a lower-numbered layer to a node in a higher-numbered layer. This layering can be achieved by the following process. Nodes with no superiors are assigned to layer 0. After removing these nodes, nodes in the rest of the component with no superiors are assigned to layer 1, and so on, until there are no nodes left. Such a procedure terminates in less than √(log n) rounds, implying that the layer number of any node is less than √(log n). Otherwise, there must exist either a simple oriented path of length at least √(log n) or an oriented cycle of length less than √(log n). Both of these conditions result in a contradiction, and hence the layering process must terminate in less than √(log n) rounds. Figure 3 shows an example along with the assignment of nodes to layers.
Figure 3: Connected component of uncolored nodes. The number at each uncolored node within the connected component gives the layer number it belongs to. (Legend: colored node, uncolored node.)
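The layering process described above (repeatedly peeling off the nodes that have no superiors) can be sketched as follows. The function name `assign_layers` and the input format are hypothetical: `superiors[u]` is assumed to hold only the uncolored nodes w with w → u that lie in the same component.

```python
def assign_layers(superiors):
    """Assign layer numbers by repeatedly removing nodes with no superiors.

    superiors[u] is the set of uncolored nodes w in the component with w -> u.
    Raises ValueError if the remaining nodes contain an oriented cycle.
    """
    remaining = {u: set(ws) for u, ws in superiors.items()}
    layer = {}
    current = 0
    while remaining:
        free = [u for u, ws in remaining.items() if not ws]
        if not free:
            raise ValueError("oriented cycle among the remaining nodes")
        for u in free:
            layer[u] = current
            del remaining[u]
        for ws in remaining.values():
            ws.difference_update(free)   # the removed nodes are no longer superiors
        current += 1
    return layer


if __name__ == "__main__":
    component = {"a": set(), "d": set(), "b": {"a"}, "c": {"a", "b"}}
    print(assign_layers(component))
```

In this sketch the layer of a node is the number of edges on the longest oriented path ending at it, so a component whose oriented paths are all short gets correspondingly few layers.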
Now, in Phase II, during every round the uncolored nodes currently assigned to the lowest layer number get colored, as the nodes assigned to the lowest layer can always retain their color choice from Step 1. This implies that Phase II can finish in less than √(log n) rounds.
Since in each round each uncolored node has to exchange O(log ∆) = O(1) bits, the bit complexity of the algorithm Color-Random is O(√(log n)).
We note that the same proof also holds for 3–coloring cycle graphs, with any orientation, with minimal
changes. Coupled with the lower bound result in Theorem 2.2, our analysis for the case of constant degree
graphs is tight with respect to the bit complexity, up to constant factors. The algorithm and the analysis can
be modified easily to achieve a local coloring also.
4 Upper Bound for Arbitrary Oriented Graphs
In this section we describe and analyze our algorithm for vertex coloring an arbitrary √(log n)–acyclic oriented graph G using (1 + ε)∆ colors for any constant ε > 0.
Our algorithm and the analysis in this case require more tools than those for constant degree graphs while having the same flavor. Theorem 3.1 fails to hold once the degree of the input graph is no longer bounded by a constant. Graphs of sub-logarithmic but super-constant degree pose additional problems, as graphs with degree below a certain threshold are not easily amenable to nice probabilistic bounds. In many papers, for example [12, 26, 5], this problem was overcome by assuming that the number of colors available is max{(1 + ε)∆, log n}, so that sub-logarithmic degree graphs are colored with log n colors. We instead take the approach of coloring with (1 + ε)∆ colors, as coloring with few colors is more appealing when vertex coloring is used as a sub-routine in other higher order tasks.
To arrive at our result, we proceed in stages. Based on techniques from [12], we first show how to arrive at a bit complexity of Õ(log ∆ + √(log n)). Later, using advanced techniques, we show how to arrive at a bit complexity of O(log ∆) + Õ(√(log n)). Finally, for graphs with ∆ ≥ log n, we show how to arrive at a bit complexity of O(log ∆ + √(log n log log n)).
Our algorithm for any node u is presented in Figure 4. The parameter Cu denotes the number of colors
each vertex u can choose from. Each node runs the algorithm in Figure 4 while it remains uncolored.
We now provide a summary of our analysis of algorithm Color. Our analysis cuts time into two phases. In the first phase we show that for any vertex the number of uncolored neighbors reduces to at most c2 log n for a constant c2, in O(log log n) rounds, with high probability. In the second phase we first show that the graph can be decomposed into connected components of uncolored nodes such that each such connected component only has simple oriented paths of length less than √(log n), with high probability. The analysis then proceeds to show that all the nodes can be colored in a further √(log n) rounds.
Algorithm Color(Cu)
Phase I
1. Set Cu := c1∆ for a constant c1 ≥ 3.
2. While du ≥ c2 log n for a constant c2 do
3. Use Algorithm Color-Random(Cu).
Phase II
4. Set Cu := min{2c2 log n, 2du}.
5. Use Algorithm Color-Random(Cu).
Figure 4: Algorithm for any node u.
In the algorithm and the analysis we also set Cu = c1∆ for a constant c1 ≥ 3, for every node u, for the sake of simplicity. Using techniques from [12], it is possible to extend the following analysis to use only (1 + ε)∆ colors, for any constant ε > 0.
4.1 Analysis for Phase I
In this phase, we show that the number of uncolored neighbors of any node u reduces in a double-exponential fashion (i.e., in O(log log n) rounds) to c2 log n. This analysis has strong connections to occupancy problems [25, Problem 3.4], [2], and the edge coloring algorithm of [12].
Let du(i), Nu(i), Cu(i) refer to the number of uncolored neighbors, the set of uncolored neighbors, and
the size of the color palette of node u respectively, at the beginning of round i. Also, let ˆd(i) = maxu du(i).
Lemma 4.1 If du(1) ≥ c2 log n then du(c log log n) ≤ c2 log n, with high probability for some constant
c ≥ 1.
Proof. The intuition behind the proof is that at the end of every round, the number of remaining uncolored
neighbors decreases double-exponentially.
During round i the probability that an uncolored node u fails to get colored can be bounded as
$$P_u(i) := \Pr[u \text{ does not get colored during round } i] \le \sum_{j=1}^{d_u(i)} \frac{1}{C_u(i)} \le \frac{\hat d(i)}{\alpha\Delta},$$
as, for c1 sufficiently large, it holds that Cu(i) ≥ α∆ for α = c1 − 1.
The expected number of neighbors of u that are still uncolored after round i is
$$E[d_u(i+1)] = \sum_{v \in N_u(i)} P_v(i) \le \frac{\hat d(i)^2}{\alpha\Delta}.$$
Consider the following recurrence relation between d̂(i + 1) and d̂(i) for a constant c′:
$$\hat d(i+1) \le \frac{\hat d^2(i)}{\alpha\Delta} + \sqrt{c'\,\hat d(i)\log n}. \qquad (1)$$
Using a large deviation bound [11, 12], it can be shown that du(i + 1) exceeds its expected value by more than √(c′ d̂(i) log n) with probability less than n^(−2) for some constant c′. Thus, it holds that du(i + 1) ≤ d̂(i + 1) w.h.p., for all nodes u.
The recurrence relation in Equation (1) can be solved as follows (cf. [12]). While d̂²(i)/(α∆) dominates the second term, we can write
$$\hat d(i+1) \le 2\hat d^2(i)/(\alpha\Delta) \qquad (2)$$
and find a value of r_1 such that d̂²(r_1)/(α∆) ≤ 2√(c′ d̂(r_1) log n). For this we require that d̂³(r_1) ≤ 4c′α²∆² log n. Using Equation (2), it follows that
$$\hat d(r_1) \le (2/\alpha)^{2^{r_1}-1}\,\Delta.$$
Hence, we require that
$$\left((2/\alpha)^{2^{r_1}-1}\,\Delta\right)^{3} = 4\alpha^2\Delta^2 c' \log n,$$
which results in a value of r_1 = O(log log n).
From this point on, it holds that
$$\hat d(i+1) \le 8\sqrt{c'\,\hat d(i)\log n}. \qquad (3)$$
Solving Equation (3) for a value of r_2 so that d̂(r_2) ≤ c2 log n results in r_2 = O(log log n).
Thus after i* = r_1 + r_2 = O(log log n) rounds, we have for any node u, du(i*) ≤ d̂(i*) ≤ c2 log n.
Thus, at the end of O(log log n) rounds of the algorithm, the number of uncolored neighbors for every node is at most c2 log n. This completes Phase I of the analysis.
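The double-exponential decrease can be observed by iterating the recurrence numerically. The sketch below rests on several assumptions that are not taken from the text: the deviation term is read as √(c′ d̂(i) log n), the upper bound in Equation (1) is treated as an equality, logarithms are base 2, and the constants α = 4, c′ = 1 and c2 = 8 are arbitrary illustrative choices. The function name `rounds_to_threshold` is hypothetical.

```python
import math


def rounds_to_threshold(n, delta, alpha=4.0, c_prime=1.0, c2=8.0, max_rounds=1000):
    """Iterate d(i+1) = d(i)^2 / (alpha * delta) + sqrt(c' * d(i) * log n),
    starting from d(1) = delta, until d(i) <= c2 * log n.

    All constants are illustrative assumptions, not values from the paper.
    Returns the number of iterations performed.
    """
    log_n = math.log2(n)
    d_hat = float(delta)
    rounds = 0
    while d_hat > c2 * log_n and rounds < max_rounds:
        d_hat = d_hat ** 2 / (alpha * delta) + math.sqrt(c_prime * d_hat * log_n)
        rounds += 1
    return rounds


if __name__ == "__main__":
    for n, delta in ((2 ** 20, 2 ** 16), (2 ** 40, 2 ** 30)):
        print(n, delta, rounds_to_threshold(n, delta), math.log2(math.log2(n)))
```

With these constants the quadratic term drives the early iterations and the square-root term the late ones, mirroring the two stages r_1 and r_2 of the analysis.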
4.2 Analysis for Phase II
The analysis in this phase consists of two sub-phases. In sub-phase II(a), we argue that along any simple oriented path of length √(log n) there exists at least one colored node, with high probability. In the second sub-phase we show that all the remaining uncolored nodes successfully get colored within √(log n) rounds. Notice that since the number of uncolored neighbors of any node at the beginning of this phase is at most c2 log n, nodes can use a color palette of size min{2c2 log n, 2du} for this phase, as shown in the algorithm in Figure 4.
4.2.1 Analysis for Phase II(a)
We now establish the following lemma, which shows that every simple oriented path of length √(log n) has at least one colored node after O(√(log n)) rounds, with high probability. Let ∆* denote the maximum number of uncolored neighbors of any node u. After Phase I, it holds that ∆* ≤ c2 log n.
Lemma 4.2 For arbitrary √(log n)–acyclic oriented graphs G, at the end of O(√(log n)) rounds, any simple oriented path of length ℓ = √(log n) will have at least one colored node, with high probability. Further, the bit complexity of this phase is O(√(log n) · log log n).
Proof. The proof of this lemma is similar to the proof of Phase I of Theorem 3.1. Consider any simple oriented path P of length ℓ = √(log n). Let E_{P,i} denote the event that all the nodes in P are in a color conflict during a given round i. Then, along the lines of Phase I of Theorem 3.1, it holds that Pr[E_{P,i} | ∩_{j=1}^{i−1} E_{P,j}] ≤ (1/2)^ℓ. Define E_P to be the event that the event E_{P,i} occurs for r = 4√(log n) consecutive rounds. Then, it holds that
$$\Pr[E_P] = \Pr\Big[\bigcap_{i=1}^{r} E_{P,i}\Big] = \prod_{i=1}^{r} \Pr\Big[E_{P,i} \,\Big|\, \bigcap_{j=1}^{i-1} E_{P,j}\Big] \le (1/2)^{\ell r}.$$
Let E denote the event that there exists a path P for which the event E_P occurs. Since the number of simple oriented paths of length ℓ is at most n(∆*)^ℓ,
$$\Pr[E] \le \sum_{P} \Pr[E_P] \le n(\Delta^*)^{\ell} \left(\frac{1}{2}\right)^{\ell r}.$$
As r = 4√(log n), ℓ = √(log n), and ∆* ≤ c2 log n, the above probability is polynomially small.
The bit complexity of this phase is O(√(log n) · log log n), as in each round each uncolored node exchanges O(log log n) bits.
4.2.2 Analysis for Phase II(b)
Consider connected components of uncolored nodes. At the end of Phase II(a), since any simple oriented path of length √(log n) has at least one colored node, w.h.p., it holds that each such component has only simple oriented paths of length less than √(log n), with high probability. Also, the input graph G does not have oriented cycles of length less than √(log n). For Phase II(b), we show the following lemma.
Lemma 4.3 In Phase II(b), after less than √(log n) rounds, all nodes in G are colored properly. Further, the bit complexity of Phase II(b) is O(√(log n) · log log n).
Proof. The proof of this lemma is similar to that of Phase II in Theorem 3.1. It can be shown that any connected component of uncolored nodes only has simple oriented paths of length less than ℓ, with high probability. Moreover, the input graph does not have oriented cycles of length less than √(log n), which implies that each such component can be organized into less than √(log n) layers with the oriented edges going only from a node in a lower-numbered layer to a node in a higher-numbered layer.
Using the layering, it can then be shown that during each further round the uncolored nodes in the lowest remaining layer get colored successfully. Thus, this phase requires less than √(log n) rounds, and each node exchanges O(log log n) bits during every round of this phase. Thus the bit complexity of this phase is O(√(log n) · log log n).
From the above discussion, the following theorem holds.
Theorem 4.4 Given a √(log n)–acyclic oriented graph G = (V, E) of maximum degree ∆, for any constant ε > 0, a (1 + ε)∆–vertex coloring of G can be obtained in Õ(log ∆ + √(log n)) bit rounds, with high probability.
Proof. Phase I has a bit complexity of O(log ∆ log log n), as for O(log log n) rounds each node exchanges O(log ∆) bits. For Phase II, the bit complexity is O(√(log n) · log log n). Adding the bit complexity of both the phases, we arrive at the theorem.
4.3 Further Improvements
In this section, we show that the bit complexity of Phase I can be reduced to O(log ∆ + log log n), thereby reducing the bit complexity of the algorithm for arbitrary √(log n)–acyclic oriented graphs to O(log ∆) + Õ(√(log n)). We then show a tighter analysis for high degree graphs to arrive at a bit complexity of O(log ∆ + √(log n log log n)). By high degree graphs, we mean graphs with ∆ ≥ log n.
4.3.1 Improvements to the Analysis of Phase I
The tightness of the analysis stems from a gradual reduction in the number of colors during Phase I. This results in savings in the bit complexity of Phase I. For this purpose, Phase I is divided into sub-phases as follows. For j ≥ 1, sub-phase j starts when the number of uncolored neighbors of u is at most D_u^(j), where D_u^(j) acts as a threshold on the number of uncolored neighbors of node u. D_u^(j) is defined as D_u^(1) = du and D_u^(j) = √(D_u^(j−1)) for j ≥ 2. Let C_u^(j) denote the size of the color palette of node u during sub-phase j. At the beginning of sub-phase j, node u reduces the size of its color palette so that C_u^(j) = c1 √(C_u^(j−1)), with C_u^(1) = c1∆.
Algorithm PhaseI
1. Du := √du, Cu := c1∆.
2. While du ≥ c2 log n do
3. Run Algorithm Color-Random(Cu).
4. If du ≤ Du then
5. Du := √Du, Cu := c1√Cu
end-while.
Figure 5: Improved algorithm for Phase I.
This effectively reduces the number of bits required to be sent in each sub-phase by a factor of 2, but the proof of Lemma 4.1 holds with minimal changes. Thus, in Phase I, it can be seen that over the O(log log n) sub-phases the number of bits each node u sends is at most
$$\sum_{j=1}^{O(\log\log n)} \log C_u^{(j)} \le \sum_{j=1}^{O(\log\log n)} \left(2\log c_1 + \frac{\log\Delta}{2^{j}}\right) = O(\log\log n + \log\Delta).$$
Thus, the bit complexity for Phase I reduces to O(log log n + log ∆).
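The effect of shrinking the palette can be illustrated by summing log C^(j) over the sub-phases. The sketch is an illustration under assumptions: it takes the palette rule to be C^(j) = c1 · √(C^(j−1)) as in Figure 5, charges one color choice of log2 C^(j) bits per sub-phase, and uses a hypothetical function name `phase_one_bits`.

```python
import math


def phase_one_bits(delta: float, c1: float, sub_phases: int) -> float:
    """Sum of log2 C^(j) over the sub-phases, with C^(1) = c1 * delta and
    C^(j) = c1 * sqrt(C^(j-1)) (assumed reading of the palette rule)."""
    palette = c1 * delta
    total = 0.0
    for _ in range(sub_phases):
        total += math.log2(palette)          # bits to name one color
        palette = c1 * math.sqrt(palette)    # shrink the palette
    return total


if __name__ == "__main__":
    delta, c1, j_max = 2 ** 20, 3, 5
    shrinking = phase_one_bits(delta, c1, j_max)
    fixed = j_max * math.log2(c1 * delta)    # cost if the palette never shrank
    print(shrinking, fixed)
```

Unrolling the rule gives log C^(j) ≤ 2 log c1 + (log ∆)/2^(j−1), so the sum over J sub-phases is at most 2J log c1 + 2 log ∆, in line with the O(log log n + log ∆) total stated above.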
The modified algorithm for Phase I for node u is described in Figure 5. Using the tighter analysis for
Phase I and Lemmata 4.2–4.3, we arrive at the following theorem.
Theorem 4.5 Given a √(log n)–acyclic oriented graph G = (V, E) of maximum degree ∆, a (1 + ε)∆–vertex coloring of G, for any constant ε > 0, can be obtained in O(log ∆) + Õ(√(log n)) bit rounds, with high probability.
4.3.2 Improvements to Phase II
For the case of high degree graphs, we now show how to reduce the bit complexity of Phase II to O(√(log n log log n)). The algorithm for Phase II remains the same as shown in Figure 4. The analysis of Phase II now consists of 3 sub-phases. In sub-phase II(a), we show that the number of uncolored neighbors of any node decreases to O(√(log n) · log log n) after O(√(log n)/ log log n) rounds, with high probability. In sub-phase II(b) we then show that every simple oriented path of length √(log n/ log log n) has at least one colored node, with high probability, after O(√(log n/ log log n)) rounds. In the final sub-phase, we show that every node can be colored in a further O(√(log n/ log log n)) rounds. In this phase, every node can use a color palette of size 2c2 log n.
Analysis for Phase II(a)
For sub-phase II(a), we show the following lemma.
Lemma 4.6 In Phase II(a), in O(√(log n)/ log log n) rounds, the number of uncolored neighbors of any node decreases to √(log n) · log log n, with high probability. Further, the bit complexity of this sub-phase is O(√(log n)).
Proof. Consider any node u. At the end of Phase I, it holds that du ≤ c2 log n, with high probability. Since the number of colors used by u is 2c2 log n, it also holds that
Pr[node u fails to get colored in a given round] ≤ 1/2.
Consider any subset A of the uncolored neighbors of u. Let E_A denote the event that all the nodes in A remain uncolored after r = 4√(log n)/ log log n consecutive rounds. Then, it holds that
$$\Pr[E_A] \le (1/2)^{r|A|(1-1/\sqrt{\log n})}$$
using the orientation and the witnessing scheme of Theorem 3.1. Notice that as we are guaranteed √(log n)–acyclicity, we can find a set of at least |A|(1 − 1/√(log n)) nodes in conflict with a distinct witness for each color conflict.
Let E_{u,s} denote the event that for node u, there exists a set of s uncolored neighbors at the end of r rounds. Then,
$$\Pr[E_{u,s}] = \Pr\Big[\bigcup_{A \subseteq N_u,\,|A|=s} E_A\Big] \le \sum_{A \subseteq N_u,\,|A|=s} \Pr[E_A] \le \binom{d_u}{s} \left(\frac{1}{2}\right)^{rs(1-1/\sqrt{\log n})}.$$
Denote by E_u the event that for node u there exist more than √(log n) · log log n uncolored neighbors. Using Boole’s inequality,
$$\Pr[E_u] \le \sum_{s=\sqrt{\log n}\,\log\log n}^{d_u} \Pr[E_{u,s}] \le \sum_{s=\sqrt{\log n}\,\log\log n}^{d_u} \binom{d_u}{s} \left(\frac{1}{2}\right)^{rs(1-1/\sqrt{\log n})} \le 2^{d_u} \left(\frac{1}{2}\right)^{3\log n} \le \frac{1}{n^2},$$
as rs ≥ 4 log n. Now, denote by E the event that for some node u, the event E_u occurs. Then, Pr[E] = Pr[∪_{u∈V} E_u] ≤ 1/n. Thus, the number of uncolored neighbors of any node decreases to √(log n) · log log n, with high probability, after 4√(log n)/ log log n rounds.
During this phase, each uncolored node exchanges O(log log n) bits in each round as the palette size is 2c2 log n. Thus, the bit complexity of this sub-phase is O(√(log n)).
Analysis for Phase II(b)
At the end of sub-phase II(a), it holds that the number of uncolored neighbors of any node u is at most √(log n) · log log n. Recall that ∆* = max_u du. After sub-phase II(a), it holds that ∆* ≤ √(log n) · log log n.
Lemma 4.7 In Phase II(b), in 16√(log n/ log log n) rounds, in every simple oriented path of length √(log n/ log log n) there is at least one node that gets colored, with high probability. Further, the bit complexity of this sub-phase is O(√(log n log log n)).
Proof. Consider any simple oriented path P of uncolored nodes of length ℓ = √(log n/ log log n). Denote by E_P the event that no node in P gets colored in 16√(log n/ log log n) rounds. Since P is oriented and the choices of each node are independent, using a witnessing scheme similar to that in the proof of Theorem 3.1, it holds that
$$\Pr[E_P] \le \left(\frac{\sqrt{\log n}\,\log\log n}{2c_2\log n}\right)^{\ell\cdot 16\sqrt{\log n/\log\log n}} \le \left(\frac{\log\log n}{2c_2\sqrt{\log n}}\right)^{\frac{16\log n}{\log\log n}} \le \frac{1}{n^4},$$
if n is sufficiently large. In the above, the first inequality holds since the number of uncolored neighbors is at most √(log n) · log log n and the number of colors that u can choose from is 2c2 log n.
Let E denote the event that there exists a simple oriented path P of length such that for path P, the
event EP occurs. The number of simple oriented paths of length = log n/ log log n is at most n · ∆∗.
Thus,
13
15. Pr[E] = Pr[
P
EP ] ≤
n∆∗
j=1
1/n4
≤ ∆∗/n3
.
The above probability is polynomially small since $\Delta^* \le \sqrt{\log n}\,\log\log n$. Thus, along any simple oriented path of length $\log n/\log\log n$, at least one node gets colored with high probability at the end of $16\log n/\log\log n$ rounds.
The bit complexity of this sub-phase is easily seen to be $O(\sqrt{\log n}\,\log\log n)$ as in each round, each uncolored node exchanges $O(\log\log n)$ bits.
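To see why the union bound over paths stays polynomially small, the following sketch bounds the number of paths by $n(\Delta^*)^{\ell}$: $n$ choices for the start node and at most $\Delta^*$ choices for each of the $\ell = \log n/\log\log n$ subsequent steps. This way of counting and the use of base-2 logarithms are assumptions made for illustration.

```latex
\begin{align*}
(\Delta^*)^{\ell} &\le \bigl(\sqrt{\log n}\,\log\log n\bigr)^{\log n/\log\log n}
  = 2^{\frac{\log n}{\log\log n}\left(\frac{1}{2}\log\log n + \log\log\log n\right)}
  = n^{\frac{1}{2} + \frac{\log\log\log n}{\log\log n}},\\
n\,(\Delta^*)^{\ell}\cdot\frac{1}{n^4} &\le n^{-\frac{5}{2} + \frac{\log\log\log n}{\log\log n}}
  = n^{-\frac{5}{2} + o(1)}.
\end{align*}
```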
This completes the analysis for Phase II(b). In Phase II(c), using arguments similar to those of Lemma 4.3, it can be shown that in a further $\log n/\log\log n$ rounds, every node gets colored, with high probability. The bit complexity of Phase II(c) is $O(\sqrt{\log n}\,\log\log n)$. Putting everything together, we arrive at the following theorem.
Theorem 4.8 Given a $\sqrt{\log n}$–acyclic oriented graph $G = (V, E)$ of maximum degree $\Delta \ge \log n$, for any constant $\epsilon > 0$, a $(1+\epsilon)\Delta$–vertex coloring of $G$ can be obtained in $O(\log\Delta + \sqrt{\log n}\,\log\log n)$ bit rounds, with high probability.
Notice that for Theorem 4.8 to hold, the input graph only needs to be $\log n/\log\log n$–acyclic, but we stated the theorem with the $\sqrt{\log n}$–acyclicity assumption for the sake of consistency.
The following corollary is easily obtained. It shows that for dense $\sqrt{\log n}$–acyclic oriented graphs, our result on the bit complexity is close to worst-case optimal.
Corollary 4.9 Given an arbitrary $\sqrt{\log n}$–acyclic oriented graph $G = (V, E)$ with $\Delta = \Omega(2^{\sqrt{\log n}\,\log\log n})$, for any $\epsilon > 0$, a $(1+\epsilon)\Delta$-vertex coloring can be obtained in $O(\log\Delta)$ bit rounds, with high probability.
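The corollary follows from Theorem 4.8 because the degree condition makes the second term of the bound no larger than the first, up to constants. A brief sketch, where $c > 0$ is a hypothetical constant hidden in the $\Omega$ notation and logarithms are assumed to be base 2:

```latex
\begin{align*}
\Delta \ge c\cdot 2^{\sqrt{\log n}\,\log\log n}
  &\;\Longrightarrow\; \sqrt{\log n}\,\log\log n \le \log\Delta - \log c = O(\log\Delta),\\
O\bigl(\log\Delta + \sqrt{\log n}\,\log\log n\bigr) &= O(\log\Delta).
\end{align*}
```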
5 Conclusions
We presented algorithms for distributed vertex coloring using a simple and natural model. While our results
are tight in general, a related question to ask is whether any further conditions on the orientation would
result in better bounds or whether certain orientations outperform other orientations. For example, if the
orientation or the graph is known to be acyclic, would it be possible to color in fewer bit rounds?
References
[1] L. Barriere, P. Flocchini, P. Fraigniaud, and N. Santoro. Can we elect if we cannot compare? In Proc.
of the ACM Symposium on Parallelism in Algorithms and Architectures (SPAA), pages 324–332, 2003.
[2] M. Bender, M. Farach-Colton, S. He, B. Kuszmaul, and C. Leiserson. Adversarial contention resolu-
tion for simple channels. In Proc. of ACM Symposium on Parallelism in Algorithms and Architectures
(SPAA), pages 325–332, 2005.
[3] M. Choy and A. K. Singh. Efficient fault tolerant algorithms for resource allocation in distributed
systems. In Proc. of the ACM Symposium on Theory of Computing (STOC), pages 169–177, 1992.
[4] R. Cole and U. Vishkin. Deterministic coin tossing and accelerating cascades: micro and macro
techniques for designing parallel algorithms. In Proc. of the ACM Symposium on Theory of Computing
(STOC), pages 206–219, 1986.
14
16. [5] D. P. Dubhashi and A. Panconesi. Near-optimal distributed edge coloring. In Proc. of the European
Symposium on Algorithms (ESA), pages 448–459, 1995.
[6] U. Feige and J. Kilian. Zero-knowledge and chromatic number. In Proc. of the Annual Conference on
Computational Complexity, 1996.
[7] I. Finocchi, A. Panconesi, and R. Silvestri. Experimental analysis of simple, distributed vertex coloring
algorithms. In Proc. of the ACM Symposium on Discrete Algorithms (SODA), pages 606–615, 2002.
[8] P. Flocchini, B. Mans, and N. Santoro. On the impact of sense of direction on message complexity.
Information Processing Letters, 63(1):23–31, 1997.
[9] M. R. Garey and D. S. Johnson. Computers and Intractability: A guide to the theory of NP-
completeness. W.H. Freeman, 1979.
[10] A. Goldberg, S. Plotkin, and G. Shannon. Parallel symmetry breaking in sparse graphs. SIAM J.
Discrete Math., 1:434–446, 1989.
[11] D. A. Grable. A large deviation inequality for functions of independent, multi-way choices. Comb.
Probab. Comput., 7(1), 1998.
[12] D. A. Grable and A. Panconesi. Nearly optimal distributed edge colouring in O(log log n) rounds.
Random Structures and Algorithms, 10(3):385–405, 1997.
[13] C. Intanagonwiwat, R. Govindan, and D. Estrin. Directed diffusion: A scalable and robust commu-
nication paradigm for sensor networks. In Proc. of the 6th ACM/IEEE Mobicom Conference, pages
56–67, 2000.
[14] O. Johansson. Simpled distributed ∆ + 1 coloring of graphs. Information Processing Letters, 70:229–
232, 1999.
[15] M. Karchmer and J. Naor. A fast parallel algorithm to color a graph with ∆ colors. J. Algorithms,
9:83–91, 1988.
[16] R. Karp and A. Widgerson. A fast parallel algorithm for the maximal independent set problem. Journal
of the Association for Computing Machinery, 32(4):762–773, 1985.
[17] K. Kothapalli, M. Onus, A. Richa, and C. Scheideler. Constant density spanners for wireless ad-hoc
networks. In Proc. of the ACM Symposium on Parallelism in Algorithms and Architectures (SPAA),
pages 116–125, 2005.
[18] K. Kothapalli, M. Onus, C. Scheideler, and C. Schindelhauer. Distributed coloring in ˜O(
√
log n)–bits.
In Proc. of IEEE International Parallel and Distributed Processing Symposium (IPDPS), 2006.
[19] K. Kothapalli and C. Scheideler. Information gathering in adversarial systems: Lines and cycles. In
Proc. of the ACM Symposium on Parallelism in Algorithms and Architectures (SPAA), pages 333–342,
2003.
[20] V. Kumar, M. Marathe, S. Parthasarathy, and A. Srinivasan. End-to-end packet-scheduling in wireless
ad-hoc networks. In Proc. of ACM Symposium on Discrete Algorithms (SODA), pages 1021–1030,
2004.
[21] N. Linial. Locality in distributed graph algorithms. SIAM Journal of Computing, 21:193–201, 1992.
15
17. [22] M. Luby. A simple parallel algorithm for the maximal independent set problem. In Proc. of the 17th
ACM Symposium on Theory of Computing (STOC), pages 1–10, 1985.
[23] M. Luby. Removing randomness in parallel without processor penalty. Journal of Computer and
System Sciences, 47(2):250–286, 1993.
[24] G. De Marco and A. Pelc. Fast distributed graph coloring with O(∆) colors. In Proc. of ACM Sympo-
sium on Discrete Algorithms (SODA), pages 630–635, 2001.
[25] R. Motwani and P. Raghavan. Randomized Algorithms. Cambridge University Press, 1995.
[26] A. Panconesi and A. Srinivasan. Fast randomized algorithms for distributed edge coloring. In Proc. of
the ACM Symposium on Principles of Distributed Computing (PODC), pages 251–262, 1992.
[27] A. Panconesi and A. Srinivasan. On the complexity of distributed network decomposition. J. Algo-
rithms, 20(2):356–374, 1996.
[28] G. Singh. Efficient leader election using sense of direction. Distributed Computing, 10(3):159–165,
1997.
16