DFA Minimization in Map-Reduce
Gösta Grahne, Shahab Harrafi, Iraj Hedayati, Ali Moallemi
Department of Computer Science and Software Engineering, Concordia University
Introduction
DFA A deterministic finite automaton A = (Q, Σ, δ, qs, F), with state set Q, alphabet Σ, transition function δ, start state qs, and set of accepting states F. Throughout, n = |Q| and k = |Σ|.
DFA Minimization The process of finding a DFA that is equivalent to a given one and has the minimum number of states.
Hopcroft Hopcroft’s algorithm is considered superior to other minimization algorithms because of its O(n log n) running time.
Figure: Hopcroft minimization method (states p1 to p4 and q1 to q4, transitions on symbol a)
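The worklist-based partition refinement behind Hopcroft's algorithm can be sketched in Python as follows. This is a minimal sequential illustration, not the poster's Hopcroft-MR, and it is not tuned to reach the O(n log n) bound. The function name `hopcroft_partition`, the dictionary encoding of δ, and the example automaton are assumptions made here.

```python
def hopcroft_partition(states, alphabet, delta, accepting):
    """Return the set of equivalence classes (frozensets) of a complete DFA.

    delta maps (state, symbol) -> state. Assumed encoding, for illustration.
    """
    states = frozenset(states)
    final = frozenset(accepting) & states

    # inverse[(q, a)] = all states p with delta[p, a] == q
    inverse = {}
    for p in states:
        for a in alphabet:
            inverse.setdefault((delta[p, a], a), set()).add(p)

    # Start from {F, Q \ F}; for simplicity both blocks seed the worklist.
    partition = {b for b in (final, states - final) if b}
    worklist = set(partition)

    while worklist:
        splitter = worklist.pop()
        for a in alphabet:
            # States that move into the splitter on symbol a.
            pre = set()
            for q in splitter:
                pre |= inverse.get((q, a), set())
            for block in list(partition):
                inside, outside = block & pre, block - pre
                if not inside or not outside:
                    continue  # the splitter does not cut this block
                partition.remove(block)
                partition.update((inside, outside))
                if block in worklist:
                    worklist.remove(block)
                    worklist.update((inside, outside))
                else:
                    # Only the smaller half needs to be processed later.
                    worklist.add(min(inside, outside, key=len))
    return partition


# Hypothetical example: a 4-cycle on one symbol, accepting states 0 and 2.
cycle_states = [0, 1, 2, 3]
cycle_delta = {(q, "a"): (q + 1) % 4 for q in cycle_states}
cycle_blocks = hopcroft_partition(cycle_states, ["a"], cycle_delta, {0, 2})
```

In this example states 0 and 2 end up in one block and states 1 and 3 in the other, so the minimal DFA has two states.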
Moore Moore’s algorithm iteratively computes the equivalence class of each state as
p ≡ᵢ q ⇔ p ≡ᵢ₋₁ q ∧ ∀a ∈ Σ: δ(p, a) ≡ᵢ₋₁ δ(q, a)
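A sequential sketch of this refinement in Python may help make the iteration concrete. It is not the poster's Moore-MR. Each round gives a state a signature made of its current class and the classes of its successors, and states with equal signatures stay together. The function name `moore_classes`, the dictionary encoding of δ, and the example automaton are assumptions made here.

```python
def moore_classes(states, alphabet, delta, accepting):
    """Map each state of a complete DFA to the id of its equivalence class.

    delta maps (state, symbol) -> state. Assumed encoding, for illustration.
    """
    # Round 0: accepting and non-accepting states form the two classes.
    block = {q: int(q in accepting) for q in states}
    while True:
        # Signature of p: its own class, then the class of delta(p, a) per symbol.
        signature = {
            p: (block[p],) + tuple(block[delta[p, a]] for a in alphabet)
            for p in states
        }
        # Renumber the distinct signatures to obtain the next round's classes.
        ids = {}
        new_block = {p: ids.setdefault(signature[p], len(ids)) for p in states}
        # Refinement only splits classes, so an equal count means a fixpoint.
        if len(ids) == len(set(block.values())):
            return new_block
        block = new_block


# Hypothetical example: a 4-cycle on one symbol, accepting states 0 and 2.
states = [0, 1, 2, 3]
alphabet = ["a"]
delta = {(q, "a"): (q + 1) % 4 for q in states}
classes = moore_classes(states, alphabet, delta, accepting={0, 2})
```

Here states 0 and 2 receive the same class id, as do states 1 and 3.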
Map-Reduce A parallel programming model that runs over large clusters of commodity computers.
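To illustrate the programming model only, a single map-reduce round can be imitated in memory: a mapper emits key-value pairs, the pairs are grouped by key, and a reducer processes each group. The sketch below is a toy stand-in for a real cluster framework. The helper `map_reduce` and the transition records are hypothetical names and data chosen for illustration.

```python
from collections import defaultdict


def map_reduce(records, mapper, reducer):
    """Run one in-memory map-reduce round over an iterable of records."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):  # map phase
            groups[key].append(value)      # shuffle: group values by key
    output = []
    for key, values in groups.items():     # reduce phase, one call per key
        output.extend(reducer(key, values))
    return output


# Hypothetical input: DFA transitions as (source, symbol, target) records.
transitions = [(0, "a", 1), (0, "b", 2), (1, "a", 2)]

# Count the outgoing transitions of each source state.
out_degree = map_reduce(
    transitions,
    mapper=lambda t: [(t[0], 1)],
    reducer=lambda state, ones: [(state, sum(ones))],
)
```

An iterative algorithm such as DFA minimization chains many rounds of this kind, which is why the number of rounds appears as a factor in the communication cost.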
Challenges
Huge amount of data
Complex graph based structure
Iterative problem
Moore’s algorithm in MapReduce
Hopcroft’s algorithm in MapReduce
Communication Cost of the Algorithms
Communication cost can be calculated as:
Number of rounds × (Replication rate × Input size + Output size)
Number of rounds: O(n)
Replication rate: O(1)
Input size = Output size
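The cost formula above can be written as a small Python helper for plugging in concrete values. The function name `communication_cost` and the sample numbers are assumptions for illustration, not measurements from the poster.

```python
def communication_cost(rounds, replication_rate, input_size, output_size):
    """Total communication: rounds x (replication rate x input + output)."""
    return rounds * (replication_rate * input_size + output_size)


# Hypothetical numbers: 10 rounds, replication rate 1, input size = output size.
example_cost = communication_cost(
    rounds=10, replication_rate=1, input_size=1000, output_size=1000
)
```

With replication rate 1 and equal input and output sizes, the cost is simply twice the input size per round.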
Moore-MR: The record size of the output of the first job is Θ(k log n). Thus the communication cost of each round is Θ(k² n log n), and the total communication cost is O(k² n² log n).
Hopcroft-MR: There are O(n log n) updates in parallel execution at each round. Thus it requires O(k n² log n) bits of communication.
Experimental Results
Findings
Figure: Evenly distributed DFA
Figure: Effect of number of rounds
Figure: Effect of number of alphabet symbols
Figure: Effect of skewness
Conclusion
Hopcroft-MR outperforms Moore-MR
in communication cost when the cardinality of the alphabet is at least 16,
in wall-clock time when the cardinality of the alphabet is at least 32,
in communication cost when the number of rounds is more than 128.
Both algorithms are equally sensitive to skewness in the input data.
Future work
There is potential to reduce skew sensitivity in Moore-MR.
Investigate the average communication cost.
Reducer capacity vs. number of rounds
Presented at the ACM SIGMOD Beyond MR Workshop in San Francisco, CA, July 2016. {grahne, s_harraf, h_iraj, moa_ali}@encs.concordia.ca