In our previous work, an efficient one-pass online algorithm for triclustering of binary data (triadic formal contexts) was proposed. This algorithm is a modified version of the basic algorithm of the OAC-triclustering approach; it has linear time and memory complexity. In this paper we parallelise it via the MapReduce framework in order to make it suitable for big datasets. The results of computer experiments show the efficiency of the proposed algorithm; for example, it outperforms the online counterpart on the Bibsonomy dataset with ≈ 800,000 triples.
1. Putting OAC-triclustering on MapReduce
Sergey Zudin, Dmitry V. Gnatyshak, and Dmitry I. Ignatov
National Research University Higher School of Economics, Russian Federation
Faculty of Computer Science
CLA 2015, Clermont-Ferrand, France
October 13-16
S. Zudin et al. () OAC-triclustering on MapReduce CLA 2015 1 / 39
2. Outline
1 Motivation and previous work
2 Prime OAC-triclustering
Triadic Formal concept analysis
Basic algorithm
Online version of the algorithm
3 OAC-triclustering on MapReduce
MapReduce technology
MapReduce implementation
4 Experiments
Description of the experiments
Datasets
Results
5 Conclusion
4. Motivation
Large amounts of multimodal data:
Gene expression data
Folksonomies
Recommender Systems
Communities in multi-mode (social) networks
Pattern mining in relational databases
. . .
Non-binary data can be scaled (possibly increasing the dimensionality)
Growing volumes of big data: fast and/or distributed algorithms are
required (linear or sublinear time, one-pass)
Existing methods find all n-sets (multimodal clusters) satisfying some
conditions (often an exponential number of patterns)
5. Motivation
IMDB example, [Mirkin et al., 2011]
Clump Movie-Keyword-Genre
Bicluster
{12 Angry Men (1957), To Kill a Mockingbird (1962), Witness for the Prosecution (1957)}, {Murder, Trial}, {n/a}
Tricluster
{12 Angry Men (1957), Double Indemnity (1944), Chinatown (1974), The Big Sleep (1946), Witness for the Prosecution (1957), Dial M for Murder (1954), Shadow of a Doubt (1943)}, {Murder, Trial, Widow, Marriage, Private detective, Blackmail, Letter}, {Crime, Drama, Thriller, Mystery, Film-Noir}
6. Previous and related work
A short (not full) list
Triadic FCA [Wille, 1995; Lehmann and Wille, 1995] and Polyadic FCA
[Voutsadakis, 2002]
TRIAS [Jäschke et al., 2006] for mining (frequent) triconcepts
DataPeeler for closed n-sets [Cerf et al., 2009], MultiDupeHack [Cerf et al.,
2013]
TriBox [Mirkin et al., 2011] for mining dense triboxes with LS criterion
Box OAC-triclustering and Spectral Triclustering [Ignatov et al., 2011, 2013]
Multi-way set enumeration in weight tensors [Schölkopf et al., 2011]
7. Previous and related work
A short (not full) list
Quadri-concepts for personalised folksonomies [Jelassi et al., 2012, 2013]
Prime OAC-triclustering [Gnatyshak et al., 2012–2014]
Triadic Boolean tensor factorisation [Miettinen et al., 2011; Belohlavek et al.,
2013] and Boolean tensor clustering [Miettinen et al., 2015]
Closed and connected patterns in multi-relational data [Spyropoulou et al.,
2011–14]
Triadic FCA and triclustering: Searching for optimal patterns. Machine
Learning journal [Ignatov et al., 2015] and CLA 2013
. . .
9. Prime OAC-triclustering
Formal concept analysis: triadic case
Definition
Let G, M, B be sets and the ternary relation I be a subset of their Cartesian
product: I ⊆ G × M × B. Then the tuple K = (G, M, B, I) is called a triadic
formal context.
G is a set of objects, M is a set of attributes, B is a set of conditions.
GM m1 m2 m3 m1 m2 m3 m1 m2 m3
g1 x x x x x x x x
g2 x x x x x
g3 x x x x
g4 x x x x x x
B b1 b2 b3
10. Prime OAC-triclustering
Formal concept analysis: triadic case
Definition
Galois operators (prime operators) are defined in a similar way to the dyadic case:
$2^G \to 2^M \times 2^B$, $2^G \times 2^M \to 2^B$
$2^M \to 2^G \times 2^B$, $2^G \times 2^B \to 2^M$
$2^B \to 2^G \times 2^M$, $2^M \times 2^B \to 2^G$
11. Prime OAC-triclustering
Formal concept analysis: triadic case
({g1, g2}, {m1, m2})′ = {b1, b3}
12. Prime OAC-triclustering
Formal concept analysis: triadic case
m2′ = {(g1, b1), (g2, b1), (g3, b1), (g1, b2), (g1, b3), (g2, b3), (g4, b3)}
13. Prime OAC-triclustering
Formal concept analysis: triadic case
Definition
The triple (X, Y, Z) is called a triadic formal concept of the context
K = (G, M, B, I) if X ⊆ G, Y ⊆ M, Z ⊆ B, (X, Y)′ = Z, (X, Z)′ = Y, and
(Y, Z)′ = X.
X is called the (formal) extent, Y the (formal) intent, and Z the (formal) modus.
14. Prime OAC-triclustering
Basic algorithm [Gnatyshak et al., 2013]
This method uses the following types of prime operators (for the context
K = (G, M, B, I)):
(g, m)′ = {b ∈ B | (g, m, b) ∈ I},
(g, b)′ = {m ∈ M | (g, m, b) ∈ I},
(m, b)′ = {g ∈ G | (g, m, b) ∈ I}
Definition
The triple T = ((m, b)′, (g, b)′, (g, m)′) is called the prime-based
OAC-tricluster for a triple (g, m, b) ∈ I. The components of the tricluster are called,
respectively, the tricluster extent, intent, and modus. The triple (g, m, b) is called the
generating triple of the tricluster T.
Definition
Density of a tricluster: $\rho(X, Y, Z) = \frac{|I \cap (X \times Y \times Z)|}{|X|\,|Y|\,|Z|}$
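As a minimal illustration of the two definitions above, the following Python sketch builds the prime-based tricluster of a generating triple and computes its density. The function names `prime_tricluster` and `density` and the toy relation `I` are assumptions made for illustration; they are not taken from the authors' implementation.

```python
from itertools import product

def prime_tricluster(I, g, m, b):
    """Return ((m, b)', (g, b)', (g, m)') for a generating triple (g, m, b) in I."""
    extent = frozenset(x for (x, y, z) in I if (y, z) == (m, b))
    intent = frozenset(y for (x, y, z) in I if (x, z) == (g, b))
    modus = frozenset(z for (x, y, z) in I if (x, y) == (g, m))
    return extent, intent, modus

def density(I, X, Y, Z):
    """rho(X, Y, Z): share of the cells of X x Y x Z that belong to I."""
    hits = sum(1 for cell in product(X, Y, Z) if cell in I)
    return hits / (len(X) * len(Y) * len(Z))

# Hypothetical toy relation, chosen only to illustrate the definitions.
I = {("g1", "m1", "b1"), ("g1", "m2", "b1"), ("g2", "m1", "b1"), ("g1", "m1", "b2")}
X, Y, Z = prime_tricluster(I, "g1", "m1", "b1")
rho = density(I, X, Y, Z)
```

For this toy relation the tricluster is ({g1, g2}, {m1, m2}, {b1, b2}); it covers 4 of its 8 cells, so its density is 0.5.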
16. Prime OAC-triclustering
Basic algorithm
Input: K = (G, M, B, I), a triadic context;
       ρmin, a density threshold
Output: 𝒯 = {T = (X, Y, Z)}
𝒯 := ∅
for all (g, m): g ∈ G, m ∈ M do
    PrimesObjAttr[g, m] = (g, m)′
end for
for all (g, b): g ∈ G, b ∈ B do
    PrimesObjCond[g, b] = (g, b)′
end for
for all (m, b): m ∈ M, b ∈ B do
    PrimesAttrCond[m, b] = (m, b)′
end for
for all (g, m, b) ∈ I do
    T = (PrimesAttrCond[m, b], PrimesObjCond[g, b], PrimesObjAttr[g, m])
    Tkey = hash(T)
    if Tkey ∉ 𝒯.keys ∧ ρ(T) ≥ ρmin then
        𝒯[Tkey] := T
    end if
end for
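The basic algorithm can be sketched in Python roughly as follows. This is an illustrative sketch, not the authors' code: the name `basic_oac_triclustering` is hypothetical, the prime sets are filled in one pass over the triples rather than by enumerating all pairs, and a tuple of frozensets stands in for hash(T).

```python
from collections import defaultdict
from itertools import product

def basic_oac_triclustering(triples, rho_min=0.0):
    """Return {tricluster: density} for prime OAC-triclusters with density >= rho_min."""
    I = set(triples)
    primes_obj_attr = defaultdict(set)   # (g, m) -> (g, m)'
    primes_obj_cond = defaultdict(set)   # (g, b) -> (g, b)'
    primes_attr_cond = defaultdict(set)  # (m, b) -> (m, b)'
    for g, m, b in I:
        primes_obj_attr[g, m].add(b)
        primes_obj_cond[g, b].add(m)
        primes_attr_cond[m, b].add(g)

    result = {}
    for g, m, b in I:
        # A tuple of frozensets is hashable, so it doubles as the hash key.
        T = (frozenset(primes_attr_cond[m, b]),
             frozenset(primes_obj_cond[g, b]),
             frozenset(primes_obj_attr[g, m]))
        if T in result:
            continue
        X, Y, Z = T
        rho = sum(cell in I for cell in product(X, Y, Z)) / (len(X) * len(Y) * len(Z))
        if rho >= rho_min:
            result[T] = rho
    return result

# The nine generating triples of the post-processing example in these slides.
EXAMPLE = [("g1", "m1", "b1"), ("g1", "m2", "b1"), ("g2", "m1", "b1"),
           ("g2", "m2", "b1"), ("g3", "m3", "b1"), ("g1", "m2", "b2"),
           ("g2", "m1", "b2"), ("g2", "m2", "b2"), ("g3", "m3", "b2")]
all_triclusters = basic_oac_triclustering(EXAMPLE)
dense_triclusters = basic_oac_triclustering(EXAMPLE, rho_min=0.9)
```

On these nine triples the sketch yields five distinct triclusters; with the (arbitrarily chosen) threshold 0.9 the tricluster ({g1, g2}, {m1, m2}, {b1, b2}) of density 7/8 is filtered out.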
17. Prime OAC-triclustering
Online version of the algorithm [Gnatyshak et al., 2014]
Let K = (G, M, B, I) be a triadic context. We do not know G, M, B, I, or their
cardinalities in advance.
Input on each iteration: {(g, m, b)} = J ⊆ I.
Goal: maintain an updated version of the results and efficiently update them when
new triples are received.
We need to keep in memory the results of prime operators’ application (prime
sets):
PrimesObjAttr — dictionary with elements of type ((g, m), {b ∈ B}), g ∈ G,
m ∈ M;
PrimesObjCond — dictionary with elements of type ((g, b), {m ∈ M}),
g ∈ G, b ∈ B;
PrimesAttrCond — dictionary with elements of type ((m, b), {g ∈ G}),
m ∈ M, b ∈ B.
18. Prime OAC-triclustering
Online version of the algorithm
Remark
In this case, triclusters generated by different triples must be treated as different, even
if their extents, intents, and modi are equal.
19. Prime OAC-triclustering
Online version of the algorithm
Algorithm of triples addition:
Input: J is a set of triples to add;
       𝒯 = {T = (∗X, ∗Y, ∗Z)} is the current tricluster set;
       PrimesObjAttr, PrimesObjCond, PrimesAttrCond;
Output: 𝒯 = {T = (∗X, ∗Y, ∗Z)};
        PrimesObjAttr, PrimesObjCond, PrimesAttrCond;
for all (g, m, b) ∈ J do
    PrimesObjAttr[g, m] := PrimesObjAttr[g, m] ∪ {b}
    PrimesObjCond[g, b] := PrimesObjCond[g, b] ∪ {m}
    PrimesAttrCond[m, b] := PrimesAttrCond[m, b] ∪ {g}
    𝒯 := 𝒯 ∪ {(&PrimesAttrCond[m, b], &PrimesObjCond[g, b], &PrimesObjAttr[g, m])}
end for
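A possible Python rendering of the addition step is sketched below. The class name `OnlineOAC` is hypothetical, and keying the stored triclusters by their generating triple is an assumption of this sketch. Python set objects are shared by reference, which plays the role of the pointers (∗X, ∗Y, ∗Z) in the pseudocode.

```python
from collections import defaultdict

class OnlineOAC:
    """Keeps prime sets and triclusters up to date as batches of triples arrive."""

    def __init__(self):
        self.primes_obj_attr = defaultdict(set)   # (g, m) -> {b}
        self.primes_obj_cond = defaultdict(set)   # (g, b) -> {m}
        self.primes_attr_cond = defaultdict(set)  # (m, b) -> {g}
        self.triclusters = {}  # generating triple -> (extent, intent, modus)

    def add(self, batch):
        for g, m, b in batch:
            self.primes_obj_attr[g, m].add(b)
            self.primes_obj_cond[g, b].add(m)
            self.primes_attr_cond[m, b].add(g)
            # Store references to the live sets, not copies.
            self.triclusters[g, m, b] = (self.primes_attr_cond[m, b],
                                         self.primes_obj_cond[g, b],
                                         self.primes_obj_attr[g, m])

online = OnlineOAC()
online.add([("g1", "m1", "b1")])
online.add([("g2", "m1", "b1"), ("g1", "m2", "b1")])
extent, intent, modus = online.triclusters["g1", "m1", "b1"]
```

Because only references are stored, the tricluster generated by (g1, m1, b1) in the first batch reflects the triples that arrive in the second batch without being recomputed.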
20. Prime OAC-triclustering
Online version of the algorithm
A user may want to remove triclusters with the same extent, intent, and
modus at the post-processing stage. At this stage we can also check various
conditions (for instance, a minimal density condition).
Input: 𝒯 = {T = (∗X, ∗Y, ∗Z)} is the current tricluster set;
Output: 𝒯′ is the processed tricluster hash-set (initially empty);
for all T ∈ 𝒯 do
    Compute hash(T)
    if hash(T) ∉ 𝒯′.keys() then
        𝒯′ := 𝒯′ ∪ {T}
    end if
end for
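The duplicate-removal step might look as follows in Python. The function name `deduplicate` is hypothetical, and a tuple of frozensets is used as an assumed stand-in for hash(T). The input list mirrors the nine triclusters of the post-processing example in these slides.

```python
def deduplicate(triclusters):
    """Keep the first occurrence of each (extent, intent, modus) combination."""
    unique = {}
    for X, Y, Z in triclusters:
        key = (frozenset(X), frozenset(Y), frozenset(Z))  # hashable stand-in for hash(T)
        unique.setdefault(key, (X, Y, Z))
    return list(unique.values())

G12, M12, B1, B12 = {"g1", "g2"}, {"m1", "m2"}, {"b1"}, {"b1", "b2"}
raw = [(G12, M12, B1), (G12, M12, B12), (G12, M12, B12), (G12, M12, B12),
       ({"g3"}, {"m3"}, B12), (G12, {"m2"}, B12), ({"g2"}, M12, B12),
       (G12, M12, B12), ({"g3"}, {"m3"}, B12)]
unique = deduplicate(raw)
```

A density check could be added inside the loop in the same way; it is omitted here to keep the sketch short.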
21. Prime OAC-triclustering
Online version of the algorithm
Complexity summary:
Time complexity: O(|I|) (as there is a constant number of operations on
each step);
More precisely: 8|I| operations in total;
1 Modification of 3 prime sets (3);
2 Creation of a new tricluster (1);
3 Addition of pointers to its extent, intent, and modus (3);
4 Addition of the tricluster to the set of all triclusters (1).
Memory complexity: O(|I|) (as we need to keep in memory only prime sets,
|I| elements in each dictionary + keys).
32. Prime OAC-triclustering
Online version of the algorithm
Postprocessing:
1 T(g1,m1,b1) = ({g1, g2}, {m1, m2}, {b1}) ← add
2 T(g1,m2,b1) = ({g1, g2}, {m1, m2}, {b1, b2}) ← add
3 T(g2,m1,b1) = ({g1, g2}, {m1, m2}, {b1, b2}) ← the same as T(g1,m2,b1), skip
4 T(g2,m2,b1) = ({g1, g2}, {m1, m2}, {b1, b2}) ← the same as T(g1,m2,b1), skip
5 T(g3,m3,b1) = ({g3}, {m3}, {b1, b2}) ← add
6 T(g1,m2,b2) = ({g1, g2}, {m2}, {b1, b2}) ← add
7 T(g2,m1,b2) = ({g2}, {m1, m2}, {b1, b2}) ← add
8 T(g2,m2,b2) = ({g1, g2}, {m1, m2}, {b1, b2}) ← the same as T(g1,m2,b1), skip
9 T(g3,m3,b2) = ({g3}, {m3}, {b1, b2}) ← the same as T(g3,m3,b1), skip
33. Prime OAC-triclustering
Online version of the algorithm
The final output set of triclusters:
1 T1 = ({g1, g2}, {m1, m2}, {b1})
2 T2 = ({g1, g2}, {m1, m2}, {b1, b2})
3 T3 = ({g3}, {m3}, {b1, b2})
4 T4 = ({g1, g2}, {m2}, {b1, b2})
5 T5 = ({g2}, {m1, m2}, {b1, b2})
36. MapReduce Technology
MapReduce example
Figure: Word counting. Source:
http://blog.trifork.com/2009/08/04/introduction-to-hadoop/
37. MapReduce Technology
Communication costs: Mining of Massive Datasets [Leskovec et al., 2013]
Chapter 2: MapReduce and the New Software Stack
“Replication Rate and Reducer Size: It is often convenient to measure
communication by the replication rate, which is the communication per input.
Also, the reducer size is the maximum number of inputs associated with any
reducer. For many problems, it is possible to derive a lower bound on replication
rate as a function of the reducer size.”
38. MapReduce Implementation
Previous lattice-oriented M/R implementations
A version of the Close-by-One algorithm was ported to the M/R framework [Krajca
& Vychodil, 2009]
An M/R algorithm for computation of closed cube lattices was proposed
[Kudryavcev & Kuznecov, 2009]
[Xu et al., 2012] demonstrated that iterative algorithms like Ganter’s
NextClosure can benefit from the use of iterative M/R schemes
39. MapReduce Implementation
Technologies and code repositories
Technologies used
Apache Hadoop [1]
Apache Maven (a tool for automated project builds)
Apache Commons (for working with extended Java collections)
Google Guava (utilities and data structures)
Jackson JSON (an open-source library for serialising the object representation
of, e.g., a tricluster to a string)
TypeTools (for run-time type resolution of inbound and outbound key-value
pairs)
. . .
Implementations
Source 1: “Chaining-job” module [2]
Source 2: M/R-based OAC Triclustering [3]
[1] http://hadoop.apache.org/
[2] https://github.com/zydins/chaining-job
[3] https://github.com/zydins/DistributedTriclustering
40. Two-stage MapReduce Implementation
Distributed OAC-triclustering: First Map
Input: S is a set of input triples as strings;
       r is the number of reducers;
       i is a grouping index (objects, attributes, or conditions).
Output: J̃ is a list of ⟨key, triple⟩ pairs.
for all s ∈ S do
    t := transform(s)
    key := hash(t[i]) mod r
    J̃ := J̃ ∪ {⟨key, t⟩}
end for
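The first map step can be sketched in Python as below. The tab-separated input format, the function name `first_map`, and the use of `zlib.crc32` as the hash are all assumptions of this sketch (Python's built-in `hash` for strings varies between processes, so a stable checksum is used instead).

```python
import zlib

def first_map(lines, r, i):
    """Emit (key, triple) pairs, where key = hash(triple[i]) mod r."""
    pairs = []
    for s in lines:
        t = tuple(s.strip().split("\t"))             # transform(s); format is assumed
        key = zlib.crc32(t[i].encode("utf-8")) % r   # stable across processes
        pairs.append((key, t))
    return pairs

lines = ["g1\tm1\tb1", "g1\tm2\tb1", "g2\tm1\tb2"]
pairs = first_map(lines, r=4, i=0)  # i = 0 groups triples by object
```

With i = 0, all triples sharing an object receive the same key and therefore reach the same first-stage reducer.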
41. Two-stage MapReduce Implementation
Distributed OAC-triclustering: First Reduce
Input: J is a list of triples (for a certain key);
       𝒯 = {T = (X, Y, Z)} is the current set of triclusters;
       PrimesOA, PrimesOC, PrimesAC.
Output: file of strings, each an encoded ⟨triple, tricluster⟩ pair.
Primes ← initialise a new multimap
for all (g, m, b) ∈ J do
    Primes[g, m] := Primes[g, m] ∪ {b}
    Primes[g, b] := Primes[g, b] ∪ {m}
    Primes[m, b] := Primes[m, b] ∪ {g}
end for
for all (g, m, b) ∈ J do
    T := (set(Primes[m, b]), set(Primes[g, b]), set(Primes[g, m]))
    s := encode(⟨(g, m, b), T⟩)
    store s
end for
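A sketch of the first reduce step in Python follows. The JSON record layout, the function name `first_reduce`, and the "OA"/"OC"/"AC" tags that keep the three kinds of keys apart inside a single multimap are assumptions of this sketch, not details confirmed by the slides.

```python
import json
from collections import defaultdict

def first_reduce(triples):
    """Build prime sets from one reducer's triples and encode (triple, tricluster) records."""
    primes = defaultdict(set)
    for g, m, b in triples:
        primes["OA", g, m].add(b)
        primes["OC", g, b].add(m)
        primes["AC", m, b].add(g)
    records = []
    for g, m, b in triples:
        tricluster = [sorted(primes["AC", m, b]),   # extent
                      sorted(primes["OC", g, b]),   # intent
                      sorted(primes["OA", g, m])]   # modus
        records.append(json.dumps({"triple": [g, m, b], "tricluster": tricluster}))
    return records

records = first_reduce([("g1", "m1", "b1"), ("g1", "m2", "b1"), ("g1", "m2", "b2")])
first = json.loads(records[0])
```

Since each reducer sees only the triples routed to its key, the prime sets it produces may be partial; this appears to be why the second map rebuilds them from all records.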
42. Two-stage MapReduce Implementation
Distributed OAC-triclustering: Second Map
Input: S is a list of strings.
Output: T̃ is a list of ⟨tricluster, tricluster⟩ pairs.
Primes ← initialise a new multimap
for all s ∈ S do
    ⟨(g, m, b), T⟩ := decode(s)
    update Primes multimap appropriately
    I := I ∪ {(g, m, b)}
end for
for all (g, m, b) ∈ I do
    T := (set(Primes[m, b]), set(Primes[g, b]), set(Primes[g, m]))
    T̃ := T̃ ∪ {⟨T, T⟩}
end for
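One way to read the second map step is sketched below in Python. It assumes that each input string is a JSON record with the fields "triple" and "tricluster" (a hypothetical encoding) and interprets "update Primes multimap appropriately" as merging the partial prime sets carried by the records; both points are assumptions of this sketch.

```python
import json
from collections import defaultdict

def second_map(lines):
    """Merge partial prime sets from all records and emit (tricluster, tricluster) pairs."""
    primes = defaultdict(set)
    triples = []
    for s in lines:
        record = json.loads(s)
        g, m, b = record["triple"]
        X, Y, Z = record["tricluster"]
        primes["AC", m, b].update(X)
        primes["OC", g, b].update(Y)
        primes["OA", g, m].update(Z)
        triples.append((g, m, b))
    pairs = []
    for g, m, b in triples:
        T = (frozenset(primes["AC", m, b]),
             frozenset(primes["OC", g, b]),
             frozenset(primes["OA", g, m]))
        pairs.append((T, T))
    return pairs

# Two records as they might come from two different first-stage reducers.
lines = [
    json.dumps({"triple": ["g1", "m1", "b1"], "tricluster": [["g1"], ["m1"], ["b1"]]}),
    json.dumps({"triple": ["g2", "m1", "b1"], "tricluster": [["g2"], ["m1"], ["b1"]]}),
]
pairs = second_map(lines)
```

In this toy input the two partial extents {g1} and {g2} are merged, so both generating triples map to the same tricluster key.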
43. Two-stage MapReduce Implementation
Distributed OAC-triclustering: Second Reduce
Input: T̂ is a list of ⟨tricluster, list of triclusters⟩ pairs.
Output: File with a final set of triclusters {T = (X, Y, Z)}.
for all ⟨T, [T, . . . , T]⟩ ∈ T̂ do
    store T
end for
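The grouping performed by the framework and the second reduce step can be emulated in a few lines of Python. The helper names `shuffle` and `second_reduce` are hypothetical, and the in-memory grouping only imitates what the M/R framework does between the map and reduce phases.

```python
from collections import defaultdict

def shuffle(pairs):
    """Group map output by key, as the framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def second_reduce(grouped):
    """Emit each tricluster once, however many copies arrived under its key."""
    return list(grouped)

T_a = (frozenset({"g1", "g2"}), frozenset({"m1"}), frozenset({"b1"}))
T_b = (frozenset({"g3"}), frozenset({"m3"}), frozenset({"b1", "b2"}))
grouped = shuffle([(T_a, T_a), (T_b, T_b), (T_a, T_a)])
final = second_reduce(grouped)
largest_reducer_input = max(len(copies) for copies in grouped.values())
```

The value `largest_reducer_input` corresponds to the reducer size of the second stage for this toy input: it grows with the number of generating triples that share a tricluster.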
44. Two-stage MapReduce Implementation
Communication costs
The time complexity of the M/R solution is composed of two terms, one for
each stage: O(|I|/r) (or O(|I|)) and O(|I|).
The replication rate for the first M/R stage is $r_1 = 1$ (each triple is passed as
one key-value pair); the reducer size is $q_1 = |I|/r$.
The replication rate for the second M/R stage is $r_2 = 1$ (it assigns one
key-value pair to each tricluster), but the reducer size varies from $q_2^{\min} = 1$
(no duplicate triclusters) to $q_2^{\max} = |I|$ (one final tricluster, when all the
initial triples belong to one absolutely dense cuboid).
46. Experiments
Description of the experiments
OS X 10, 1.8 GHz Intel Core i5, 4 GB of 1600 MHz DDR3 memory, and 8 GB of free space
on the hard drive (typical commodity hardware).
Two M/R modes have been tested: a sequential mode of task completion and
an emulation of the distributed mode with 16 first reducers and 32 threads for the
second stage.
To evaluate the runtime more reliably, the average over 5 runs of the algorithms
has been recorded for each context.
47. Experiments
Datasets
Synthetic datasets. 1) 20,000 triples (25 unique entities of each type); 2) 100,000 triples (50
unique entities of each type); 3) 1,000,000 triples (all possible combinations of 100 unique
entities of each type).
The 1st dataset contains duplicates since 25 × 25 × 25 gives only 15,625 unique triples. The 2nd
one contains fewer triples than $50^3 = 125{,}000$, the number of all possible combinations. The 3rd
one is an absolutely dense cuboid 100 × 100 × 100.
The 3rd dataset does not result in $3^{\min(|G|,|M|,|B|)}$ formal triconcepts; it is an example of the
worst-case scenario for the second reducer ($q_2^{\max} = |I|$).
IMDB. Top-250 list of the best movies from the Internet Movie Database.
Bibsonomy. The data of bibsonomy.org from the ECML PKDD Discovery Challenge 2008.
Context      |G|      |M|       |B|       # triples    Density
20k          25       25        25        20,000       1
100k         50       50        50        100,000      0.8
1m           100      100       100       1,000,000    1
IMDB         250      795       22        3,818        0.00087
BibSonomy    2,337    67,464    28,920    816,197      1.8 · 10^-7
49. Alternative MapReduce decomposition
Variant I: First stage
First Map: Finding primes. During this phase every input triple (g, m, b) is
encoded by three key-value pairs ⟨(g, m), b⟩, ⟨(g, b), m⟩, and ⟨(m, b), g⟩. These
pairs are passed to the first reducer.
The replication rate is $r_1 = 3$.
First Reduce: Finding primes. This reducer fills the three corresponding dictionaries
of primes for the keys. For example, the first dictionary, PrimeOA, contains
key-value pairs ⟨(g, m), {b1, b2, . . . , bn}⟩.
The reducer size is $q_1 = \max(|G|, |M|, |B|)$.
The process can be stopped after the first reduce phase, with all the triclusters
found as (Prime[g, m], Prime[g, b], Prime[m, b]) by enumerating the triples
(g, m, b) ∈ I. However, to do this faster and keep the result for further
computation, it is possible to use M/R as well.
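The first stage of Variant I can be sketched in Python as follows. The function names and the "OA"/"OC"/"AC" tags that distinguish the three kinds of keys are assumptions of this sketch; the toy triples are hypothetical.

```python
from collections import defaultdict

def variant1_first_map(triples):
    """Emit three key-value pairs per input triple (replication rate 3)."""
    for g, m, b in triples:
        yield ("OA", g, m), b
        yield ("OC", g, b), m
        yield ("AC", m, b), g

def variant1_first_reduce(pairs):
    """Collect the values of each key into its prime set."""
    primes = defaultdict(set)
    for key, value in pairs:
        primes[key].add(value)
    return primes

triples = [("g1", "m2", "b1"), ("g1", "m2", "b2"), ("g2", "m2", "b1")]
pairs = list(variant1_first_map(triples))
primes = variant1_first_reduce(pairs)
```

Three input triples produce nine key-value pairs, which matches the replication rate of 3 stated above.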
50. Alternative MapReduce decomposition
Variant I: Second stage
Second Map: Tricluster generation. The second map performs the tricluster-combining
job: for each triple (g, m, b) it composes a new key-value pair ⟨(g, m, b), ∅⟩.
For each pair of one of the three types ⟨(g, m), Prime[g, m]⟩, ⟨(g, b), Prime[g, b]⟩, and
⟨(m, b), Prime[m, b]⟩ it generates the key-value pairs ⟨(g, m, b̃), Prime[g, m]⟩,
⟨(g, m̃, b), Prime[g, b]⟩, and ⟨(g̃, m, b), Prime[m, b]⟩, respectively, where g̃ ∈ G, m̃ ∈ M,
and b̃ ∈ B.
$r_2 = (|I| + 3|G||M||B|)/(|I| + |G||M| + |G||B| + |M||B|) \le (\rho + 3)/(\rho + 3/\max(|G|, |M|, |B|))$, where $\rho$ is the density of the input tricontext.
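The bound on the replication rate can be checked in two short steps, taking the density of the input tricontext to be ρ = |I|/(|G||M||B|) (an assumption consistent with the tricluster density defined earlier). Dividing numerator and denominator by |G||M||B| and bounding the sum of reciprocals from below gives:

```latex
\[
r_2 = \frac{|I| + 3|G||M||B|}{|I| + |G||M| + |G||B| + |M||B|}
    = \frac{\rho + 3}{\rho + \frac{1}{|B|} + \frac{1}{|M|} + \frac{1}{|G|}},
\qquad \rho = \frac{|I|}{|G||M||B|},
\]
\[
\frac{1}{|G|} + \frac{1}{|M|} + \frac{1}{|B|} \ge \frac{3}{\max(|G|,|M|,|B|)}
\;\Longrightarrow\;
r_2 \le \frac{\rho + 3}{\rho + 3/\max(|G|,|M|,|B|)}.
\]
```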
Second Reduce: Tricluster generation. For each key (g, m, b), the generating triple,
the second reducer assembles a single value, its tricluster (Prime[g, m],
Prime[g, b], Prime[m, b]). If there is no key-value pair ⟨(g, m, b), ∅⟩ for a
particular triple (g, m, b), it does not output any key-value pair for that key.
The reducer size $q_2$ is either 3 (no output) or 4 (tricluster assembled).
51. Alternative MapReduce decomposition
Variant II: Second stage
Second Map: Tricluster generation with duplicate generating triples.
The second map performs the tricluster-combining job: for each triple (g, m, b) it
composes a new key-value pair
⟨(Prime[g, m], Prime[g, b], Prime[m, b]), (g, m, b)⟩.
Second Reduce: Tricluster generation with duplicate generating triples.
The second reducer just groups the values for each key: ⟨(X, Y, Z), {(g1, m1, b1), . . . ,
(gn, mn, bn)}⟩.
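The grouping of generating triples by tricluster in Variant II can be emulated in Python as below. The function name `variant2_group` is hypothetical, the prime sets are built in memory rather than by a preceding M/R stage, and the key follows the component order written on this slide, (Prime[g, m], Prime[g, b], Prime[m, b]).

```python
from collections import defaultdict

def variant2_group(triples):
    """Map each tricluster to the set of triples that generate it."""
    I = set(triples)
    prime = defaultdict(set)
    for g, m, b in I:
        prime["OA", g, m].add(b)
        prime["OC", g, b].add(m)
        prime["AC", m, b].add(g)
    groups = defaultdict(set)
    for g, m, b in I:
        # Key in the order used on the slide: (Prime[g, m], Prime[g, b], Prime[m, b]).
        key = (frozenset(prime["OA", g, m]),
               frozenset(prime["OC", g, b]),
               frozenset(prime["AC", m, b]))
        groups[key].add((g, m, b))
    return groups

# The nine generating triples of the post-processing example in these slides.
EXAMPLE = [("g1", "m1", "b1"), ("g1", "m2", "b1"), ("g2", "m1", "b1"),
           ("g2", "m2", "b1"), ("g3", "m3", "b1"), ("g1", "m2", "b2"),
           ("g2", "m1", "b2"), ("g2", "m2", "b2"), ("g3", "m3", "b2")]
groups = variant2_group(EXAMPLE)
```

On these triples the five triclusters are recovered, and the tricluster with extent {g1, g2}, intent {m1, m2}, and modus {b1, b2} is associated with its four generating triples.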
53. Conclusion and further work
A MapReduce implementation of prime OAC-triclustering has been proposed.
Communication costs have been analysed.
The online version and the M/R version have been compared.
Further experiments are needed with other M/R variants and other
triclustering algorithms.
A proper comparison of the proposed OAC-triclustering with noise-tolerant
patterns in n-ary relations, e.g., those found by DataPeeler descendants [Cerf et al., 2013],
has not yet been conducted.