Motivation
Our Solution
Evaluation
Future Work
Distributed Formal Concept Analysis Algorithms Based on an Iterative MapReduce Framework
Biao Xu Ruairí de Fréin Eric Robson Mícheál Ó Foghlú
Telecommunications Software & Systems Group
Waterford Institute of Technology
ICFCA 2012, Leuven, Belgium
Biao Xu et al. Distributed FCA Algorithms MR
Outline
1 Motivation
The Basic Problems of Current FCA Algorithms
Related Work
2 Our Solution
Adopt Iterative MapReduce Framework
FCA Algorithms Adaptation
3 Evaluation
4 Future Work
Motivation
The Basic Problems of Current FCA Algorithms
Applying FCA algorithms in real-world applications
Time-consuming on large and high-dimensional data.
Table: Execution time of traditional FCA algorithms (in seconds).
Dataset      mushroom   anon-web    census-income
Size         8124×125   32711×294   103950×133
NextClosure  618        14671       18230
CloseByOne   2543       656         7465
Hard to deal with distributed databases:
Data volume
Communication
Privacy
Security
Related Work
Little work exists on distributed FCA algorithms.
A distributed version of CloseByOne based on Hadoop MapReduce.
Petr Krajca et al. Distributed Algorithm for Computing Formal Concepts Using Map-Reduce Framework. IDA, 2009.
Differences in our work:
Uses an iterative MapReduce runtime, Twister.
Mines formal concepts in the fewest iterations.
Our Solution
Adopt Iterative MapReduce Framework
Features of MapReduce Framework
Divide and conquer strategy: map + reduce function.
Table: Partitioned datasets S1 and S2
$S_1$ or $(O_{S_1}, P, I_{S_1})$
a b c d e f g
1 × × × ×
2 × × × ×
3 × × × × ×
$S_2$ or $(O_{S_2}, P, I_{S_2})$
a b c d e f g
4 × × ×
5 × × × ×
6 × × × ×
Move the algorithms to the nodes rather than moving the datasets.
Utilize a cluster rather than a single machine.
Fault tolerance.
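The divide and conquer idea above can be sketched in a few lines of Python. This is only an illustration, not the authors' code: the row contents of `S1` and `S2` are an assumption (the column alignment of the table was lost, so the rows below are one assignment consistent with the closures in the running example), and `map_partition` and `reduce_counts` are hypothetical names. The map function counts attribute occurrences inside one partition, and the reduce function merges the partial counts.

```python
from collections import Counter
from functools import reduce

# Assumed encoding of the two partitions (object id -> attribute set).
# The exact row contents are an assumption; see the note above.
S1 = {1: set("abdf"), 2: set("aceg"), 3: set("bcdfg")}
S2 = {4: set("bde"), 5: set("adef"), 6: set("bcfg")}


def map_partition(partition):
    """Map: count, within one partition, how many objects carry each attribute."""
    counts = Counter()
    for attributes in partition.values():
        counts.update(attributes)
    return counts


def reduce_counts(left, right):
    """Reduce: merge two partial results by adding the counts per attribute."""
    return left + right


# One map call per partition; the calls are independent and could run on
# different nodes.
partial_results = [map_partition(p) for p in (S1, S2)]
support = reduce(reduce_counts, partial_results, Counter())

print(sorted(support.items()))
```

Because each map call only touches its own partition, only the small per-attribute counts need to travel between nodes, not the rows themselves.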
MapReduce Data Flow
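Figure: MapReduce data flow (shown for node 0). Input splits (Split 0, Split 1, Split 2) each feed a map task; map output passes through sort, copy, and merge steps to the reduce tasks, which write the output parts (Part 0, Part 1).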
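A simplified, single-process sketch of this data flow follows. It is an assumption-laden illustration rather than Hadoop or Twister code: the input records, the helper names (`map_task`, `partition`, `reduce_task`), and the partitioning rule are all hypothetical, and the sort, copy, and merge steps are collapsed into plain list operations.

```python
from itertools import groupby

NUM_REDUCERS = 2  # corresponds to "Part 0" and "Part 1" in the figure

# Three hypothetical input splits; each record is one object's attribute string.
splits = [["abdf", "aceg"], ["bcdfg", "bde"], ["adef", "bcfg"]]


def map_task(split):
    """Emit one (attribute, 1) pair per attribute occurrence in the split."""
    return [(attr, 1) for record in split for attr in record]


def partition(key):
    """Pick the reduce task for a key (deterministic, unlike built-in hash())."""
    return ord(key) % NUM_REDUCERS


def reduce_task(pairs):
    """Sort the copied pairs, merge equal keys, and sum their values."""
    pairs = sorted(pairs)
    return {key: sum(value for _, value in group)
            for key, group in groupby(pairs, key=lambda kv: kv[0])}


# Copy step: route every map output pair to the reduce task that owns its key.
inboxes = [[] for _ in range(NUM_REDUCERS)]
for split in splits:
    for key, value in map_task(split):
        inboxes[partition(key)].append((key, value))

# Each reduce task produces one output part.
parts = [reduce_task(inbox) for inbox in inboxes]
print(parts)
```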
Twister: an Iterative MapReduce Runtime
A lightweight MapReduce runtime developed by Indiana University.
Efficient support for Iterative MapReduce computations.
Table: Comparison between Twister and Hadoop
Twister                        Hadoop
Long-running map/reduce tasks  Single-step map/reduce
Iteration support              Job chaining
Static and dynamic data        Static data only
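The practical effect of long-running tasks with cached static data can be illustrated with a toy model. Nothing below is Twister's or Hadoop's API: `LongRunningMapper`, `chained_jobs`, and `iterative` are hypothetical names, and "loading" the partition is modelled by a simple counter. The sketch only shows that job chaining reloads the static partition in every step, while a long-running task loads it once and receives only the small dynamic input per iteration.

```python
class LongRunningMapper:
    """Hypothetical stand-in for a cached map task (not Twister's API)."""

    def __init__(self, partition):
        self.partition = partition  # static data: loaded once, kept in memory
        self.loads = 1              # counts how often the partition was loaded

    def map(self, candidate):
        """Dynamic data (a candidate attribute set) arrives in every iteration."""
        return sum(1 for row in self.partition if candidate <= row)


def chained_jobs(partition, candidates):
    """Job chaining: every step starts a fresh task and reloads the partition."""
    loads, results = 0, []
    for candidate in candidates:
        mapper = LongRunningMapper(partition)  # new task per job
        loads += mapper.loads
        results.append(mapper.map(candidate))
    return results, loads


def iterative(partition, candidates):
    """Iterative runtime: one task survives across all iterations."""
    mapper = LongRunningMapper(partition)
    results = [mapper.map(candidate) for candidate in candidates]
    return results, mapper.loads


partition = [set("abdf"), set("aceg"), set("bcdfg")]  # assumed sample rows
candidates = [{"f"}, {"e"}, {"d"}]

chained_results, chained_loads = chained_jobs(partition, candidates)
iterative_results, iterative_loads = iterative(partition, candidates)
print(chained_loads, iterative_loads)
```

Both variants compute the same results; only the number of times the static data is loaded differs.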
Twister Architecture
Figure: Twister architecture. The main program and the Twister driver run on the master node. Each worker node runs a Twister daemon with a worker pool of cacheable map and reduce tasks and a local disk. The master node and the worker nodes are connected through a pub/sub broker network (brokers marked B). The figure also labels data distribution, collection, and partition file creation.
FCA Algorithms Adaptation
Decompose the FCA Algorithm
Map phase produces local concepts, $F^{Y}_{S_n}$.
Reduce phase generates global concepts by merging local concepts from mappers.
Theorem: Given the closures $F^{Y}_{S_1}, \cdots, F^{Y}_{S_n}$ from $n$ disjoint partitions, $F^{Y}_{S} = F^{Y}_{S_1} \cap \cdots \cap F^{Y}_{S_n}$.
We name our algorithms with the prefix MR: MRCbo, MRGanter, and MRGanter+.
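The theorem can be checked on a small example. The following is a minimal sketch under stated assumptions: the rows of `S1` and `S2` are one assignment consistent with the closures in the running example (the original table alignment was lost), the closure of a set contained in no object is taken to be the full attribute set, and the function names are hypothetical.

```python
from functools import reduce
from itertools import combinations

ATTRIBUTES = frozenset("abcdefg")

# Assumed rows of the two disjoint partitions (one frozenset per object).
S1 = [frozenset("abdf"), frozenset("aceg"), frozenset("bcdfg")]
S2 = [frozenset("bde"), frozenset("adef"), frozenset("bcfg")]


def closure(rows, y):
    """Attributes shared by all objects that contain y (all attributes if none)."""
    matching = [row for row in rows if y <= row]
    return reduce(frozenset.intersection, matching, ATTRIBUTES)


def merged_closure(partitions, y):
    """Map: local closure per partition. Reduce: intersect the local closures."""
    local = [closure(rows, y) for rows in partitions]
    return reduce(frozenset.intersection, local)


def theorem_holds(partitions):
    """Compare the merged closure with the closure on the unpartitioned data."""
    whole = [row for rows in partitions for row in rows]
    letters = sorted(ATTRIBUTES)
    return all(
        merged_closure(partitions, frozenset(y)) == closure(whole, frozenset(y))
        for k in range(len(letters) + 1)
        for y in combinations(letters, k)
    )


print(sorted(merged_closure([S1, S2], frozenset("f"))))  # ['f']
print(theorem_holds([S1, S2]))  # True
```

The check enumerates all 128 attribute subsets, which is only feasible because the example has seven attributes.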
MRGanter Work Flow
Figure: MRGanter work flow. The main program loops with while(!isLastClosure(Closure)) runMapReduce(). In each iteration, the map tasks on Data Split 1 to Data Split n run computeClosure() and emit pairs (atr1, localClosure1), ..., (atrj, localClosurej); the reduce tasks Reduce 1 to Reduce n run merging() and check() and return the Closure for the next iteration. Static data labeled by S and dynamic data labeled by D.
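The work flow can be simulated in a single process. The sketch below is an interpretation, not the authors' implementation: the function names echo the labels in the figure, but their bodies are assumptions that follow Ganter's NextClosure test, the partition rows are one assignment consistent with the running example, and `map_task`, `reduce_task`, and `mr_ganter` are hypothetical names.

```python
from functools import reduce

ATTRS = "abcdefg"            # fixed linear order a < b < ... < g
ALL = frozenset(ATTRS)

# Assumed rows of the two partitions (static data, one frozenset per object).
S1 = [frozenset("abdf"), frozenset("aceg"), frozenset("bcdfg")]
S2 = [frozenset("bde"), frozenset("adef"), frozenset("bcfg")]


def compute_closure(rows, y):
    """Local closure: attributes shared by all objects containing y."""
    matching = [row for row in rows if y <= row]
    return reduce(frozenset.intersection, matching, ALL)


def map_task(rows, d):
    """Emit (attribute, local closure) for every attribute not in d."""
    out = {}
    for i, p in enumerate(ATTRS):
        if p not in d:
            candidate = (d & frozenset(ATTRS[:i])) | {p}
            out[p] = compute_closure(rows, candidate)
    return out


def merging(local_closures):
    """Global closure: intersection of the local closures."""
    return reduce(frozenset.intersection, local_closures)


def check(d, i, f):
    """NextClosure test: f must agree with d on all attributes before index i."""
    smaller = frozenset(ATTRS[:i])
    return f & smaller == d & smaller


def reduce_task(d, map_outputs):
    """Scan attributes from largest to smallest and return the next closure."""
    for i in reversed(range(len(ATTRS))):
        p = ATTRS[i]
        if p in d:
            continue
        f = merging([out[p] for out in map_outputs])
        if check(d, i, f):
            return f
    raise ValueError("d is already the last closure")


def mr_ganter(partitions):
    """Driver loop: one simulated MapReduce round per closure (dynamic data)."""
    closure = merging([compute_closure(rows, frozenset()) for rows in partitions])
    found = [closure]
    while closure != ALL:  # plays the role of isLastClosure()
        outputs = [map_task(rows, closure) for rows in partitions]
        closure = reduce_task(closure, outputs)
        found.append(closure)
    return found


closures = mr_ganter([S1, S2])
print(["".join(sorted(c)) for c in closures[:4]])  # ['', 'f', 'e', 'd']
```

Each round yields exactly one new closure, so the number of rounds equals the number of concepts; this is the cost that an algorithm needing fewer iterations would avoid.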
Running example of MRGanter and MRGanter+.

MRGanter:
d      p_i  F1 from S1     F2 from S2     F
∅      g    {c,g}          {b,c,f,g}      {c,g}
       f    {b,d,f}        {f}            {f}
       e    {a,c,e,g}      {d,e}          {e}
       d    {b,d,f}        {d,e}          {d}
       c    {c,g}          {b,c,f,g}      {c,g}
       b    {b,d,f}        {b}            {b}
       a    {a}            {a,d,e,f}      {a}
{f}    g    {b,c,d,f,g}    {b,c,f,g}      {b,c,f,g}
       e    {a,c,e,g}      {d,e}          {e}
       d    {b,d,f}        {d,e}          {d}
       c    {c,g}          {b,c,f,g}      {c,g}
       b    {b,d,f}        {b}            {b}
       a    {a}            {a,d,e,f}      {a}
{e}    g    {a,c,e,g}      {a,...,g}      {a,c,e,g}
       f    {a,...,g}      {a,d,e,f}      {a,d,e,f}
       d    {b,d,f}        {d,e}          {d}
       c    {c,g}          {b,c,f,g}      {c,g}
       b    {b,d,f}        {b}            {b}
       a    {a}            {a,d,e,f}      {a}
{d}    g    {b,c,d,f,g}    {a,...,g}      {b,c,d,f,g}
       f    {b,d,f}        {a,d,e,f}      {d,f}
       e    {a,...,g}      {d,e}          {d,e}
       c    {c,g}          {b,c,f,g}      {c,g}
       b    {b,d,f}        {b}            {b}
       a    {a}            {a,d,e,f}      {a}

MRGanter+:
d      p_i  F1 from S1     F2 from S2     F
∅      g    {c,g}          {b,c,f,g}      {c,g}
       f    {b,d,f}        {f}            {f}
       e    {a,c,e,g}      {d,e}          {e}
       d    {b,d,f}        {d,e}          {d}
       c    {c,g}          {b,c,f,g}      {c,g}
       b    {b,d,f}        {b}            {b}
       a    {a}            {a,d,e,f}      {a}
{c,g}  f    {b,c,d,f,g}    {b,c,f,g}      {b,c,f,g}
       e    {a,c,e,g}      {a,...,g}      {a,c,e,g}
       d    {b,c,d,f,g}    {a,...,g}      {b,c,d,f,g}
       b    {b,d,f}        {b}            {b}
       a    {a}            {a,d,e,f}      {a}
{f}    g    {b,c,d,f,g}    {b,c,f,g}      {b,c,f,g}
       e    {a,c,e,g}      {d,e}          {e}
       d    {b,d,f}        {d,e}          {d}
       c    {c,g}          {b,c,f,g}      {c,g}
       b    {b,d,f}        {b}            {b}
       a    {a}            {a,d,e,f}      {a}
{e}    g    {a,c,e,g}      {a,...,g}      {a,c,e,g}
       f    {a,...,g}      {a,d,e,f}      {a,d,e,f}
       d    {b,d,f}        {d,e}          {d}
       c    {c,g}          {b,c,f,g}      {c,g}
       b    {b,d,f}        {b}            {b}
       a    {a}            {a,d,e,f}      {a}
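In the second part of the running example, every new closure from the first round ({c,g}, {f}, {e}, ...) is processed in later rounds, instead of only the single next closure. A hedged sketch of that batch-style behaviour follows. How MRGanter+ actually deduplicates and schedules closures is an assumption here; the partition rows and the names `oplus`, `mr_ganter_plus`, and `brute_force` are hypothetical.

```python
from functools import reduce
from itertools import combinations

ATTRS = "abcdefg"            # fixed linear order a < b < ... < g
ALL = frozenset(ATTRS)

# Assumed rows of the two partitions (one frozenset per object).
S1 = [frozenset("abdf"), frozenset("aceg"), frozenset("bcdfg")]
S2 = [frozenset("bde"), frozenset("adef"), frozenset("bcfg")]


def local_closure(rows, y):
    """Attributes shared by all objects containing y (all attributes if none)."""
    matching = [row for row in rows if y <= row]
    return reduce(frozenset.intersection, matching, ALL)


def global_closure(partitions, y):
    """Intersect the local closures of all partitions."""
    return reduce(frozenset.intersection,
                  [local_closure(rows, y) for rows in partitions])


def oplus(d, i):
    """Keep the attributes of d that precede index i and add attribute i."""
    return (d & frozenset(ATTRS[:i])) | {ATTRS[i]}


def mr_ganter_plus(partitions):
    """Process all closures found in one round together; keep every new one."""
    start = global_closure(partitions, frozenset())
    found, frontier, rounds = {start}, [start], 0
    while frontier:
        rounds += 1
        new = set()
        for d in frontier:
            for i, p in enumerate(ATTRS):
                if p not in d:
                    f = global_closure(partitions, oplus(d, i))
                    if f not in found:
                        new.add(f)
        found |= new
        frontier = list(new)
    return found, rounds


def brute_force(partitions):
    """Reference result: close every attribute subset (only viable when tiny)."""
    return {global_closure(partitions, frozenset(c))
            for k in range(len(ATTRS) + 1)
            for c in combinations(ATTRS, k)}


found, rounds = mr_ganter_plus([S1, S2])
print(len(found), rounds)
print(found == brute_force([S1, S2]))  # True
```

Keeping every new closure trades larger intermediate sets for far fewer rounds than producing one closure per round.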
Evaluation
Efficiency of MR
Table: Execution time in seconds. For the distributed algorithms, the fastest time is reported, with the number of machines used in round brackets.
Dataset      mushroom    anon-web    census-income
Concepts     219010      129009      96531
Density      17.36%      1.03%       6.7%
NextClosure  618         14671       18230
CloseByOne   2543        656         7465
MRCbo        241 (11)    693 (11)    803 (11)
MRGanter     20269 (5)   20110 (3)   9654 (11)
MRGanter+    198 (9)     496 (9)     358 (11)
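To make the table easier to read, the speedup of each distributed algorithm over the faster of the two centralized algorithms can be computed directly from the numbers above. The short script below only restates the table; `speedup` is a hypothetical helper and the choice of baseline is an assumption.

```python
# Execution times in seconds, copied from the table above.
times = {
    "NextClosure": {"mushroom": 618, "anon-web": 14671, "census-income": 18230},
    "CloseByOne": {"mushroom": 2543, "anon-web": 656, "census-income": 7465},
    "MRCbo": {"mushroom": 241, "anon-web": 693, "census-income": 803},
    "MRGanter": {"mushroom": 20269, "anon-web": 20110, "census-income": 9654},
    "MRGanter+": {"mushroom": 198, "anon-web": 496, "census-income": 358},
}
CENTRALIZED = ("NextClosure", "CloseByOne")


def speedup(dataset, algorithm):
    """Best centralized time divided by the algorithm's time on the dataset."""
    best = min(times[name][dataset] for name in CENTRALIZED)
    return best / times[algorithm][dataset]


for dataset in ("mushroom", "anon-web", "census-income"):
    row = {name: round(speedup(dataset, name), 2)
           for name in ("MRCbo", "MRGanter", "MRGanter+")}
    print(dataset, row)
```

A value below 1 means the distributed run was slower than the best centralized algorithm on that dataset.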
Scalability of MR (1)
Figure: Mushroom dataset: comparison of MRGanter+, MRCbo and MRGanter (x-axis: number of nodes, 0 to 12; y-axis: CPU time in seconds, log scale from 10^2 to 10^5). MRGanter+ outperforms MRCbo and MRGanter when dense data is processed.
Scalability of MR (2)
Figure: Anon-web dataset: comparison of MRGanter+, MRCbo and MRGanter (x-axis: number of nodes, 0 to 12; y-axis: CPU time in seconds, log scale from 10^2 to 10^5). MRGanter+ is faster when more than 3 nodes are used.
Scalability of MR (3)
Figure: Census dataset: comparison of MRGanter+, MRCbo and MRGanter (x-axis: number of nodes, 0 to 12; y-axis: CPU time in seconds, log scale from 10^2 to 10^5). MRGanter+ is fastest when a large dataset is processed.
Future Work
Explore the effect of data distribution between cluster nodes.
Examine MR performance with larger dataset sizes.
Extend our approach by reducing the size of intermediate data.
Thank you
Questions?

More Related Content

What's hot

Optimization for Deep Learning
Optimization for Deep LearningOptimization for Deep Learning
Optimization for Deep LearningSebastian Ruder
 
Lec-08 Feature Aggregation II: Fisher Vector, AKULA and Super Vector
Lec-08 Feature Aggregation II: Fisher Vector, AKULA and Super VectorLec-08 Feature Aggregation II: Fisher Vector, AKULA and Super Vector
Lec-08 Feature Aggregation II: Fisher Vector, AKULA and Super VectorUnited States Air Force Academy
 
ICML2012読み会 Scaling Up Coordinate Descent Algorithms for Large L1 regularizat...
ICML2012読み会 Scaling Up Coordinate Descent Algorithms for Large L1 regularizat...ICML2012読み会 Scaling Up Coordinate Descent Algorithms for Large L1 regularizat...
ICML2012読み会 Scaling Up Coordinate Descent Algorithms for Large L1 regularizat...sleepy_yoshi
 
Graph Convolutional Neural Networks
Graph Convolutional Neural Networks Graph Convolutional Neural Networks
Graph Convolutional Neural Networks 신동 강
 
Gradient Estimation Using Stochastic Computation Graphs
Gradient Estimation Using Stochastic Computation GraphsGradient Estimation Using Stochastic Computation Graphs
Gradient Estimation Using Stochastic Computation GraphsYoonho Lee
 
SCALABLE PATTERN MATCHING OVER COMPRESSED GRAPHS VIA DE-DENSIFICATION
SCALABLE PATTERN MATCHING OVER COMPRESSED GRAPHS VIA DE-DENSIFICATIONSCALABLE PATTERN MATCHING OVER COMPRESSED GRAPHS VIA DE-DENSIFICATION
SCALABLE PATTERN MATCHING OVER COMPRESSED GRAPHS VIA DE-DENSIFICATIONaftab alam
 
Tech talk ggplot2
Tech talk   ggplot2Tech talk   ggplot2
Tech talk ggplot2jalle6
 
Methods of Manifold Learning for Dimension Reduction of Large Data Sets
Methods of Manifold Learning for Dimension Reduction of Large Data SetsMethods of Manifold Learning for Dimension Reduction of Large Data Sets
Methods of Manifold Learning for Dimension Reduction of Large Data SetsRyan B Harvey, CSDP, CSM
 
Building and road detection from large aerial imagery
Building and road detection from large aerial imageryBuilding and road detection from large aerial imagery
Building and road detection from large aerial imageryShunta Saito
 
Histogram based Enhancement
Histogram based Enhancement Histogram based Enhancement
Histogram based Enhancement Vivek V
 
Digital Image Processing: Image Enhancement in the Spatial Domain
Digital Image Processing: Image Enhancement in the Spatial DomainDigital Image Processing: Image Enhancement in the Spatial Domain
Digital Image Processing: Image Enhancement in the Spatial DomainMostafa G. M. Mostafa
 
Attentive semantic alignment with offset aware correlation kernels
Attentive semantic alignment with offset aware correlation kernelsAttentive semantic alignment with offset aware correlation kernels
Attentive semantic alignment with offset aware correlation kernelsNAVER Engineering
 
Implementing Generate-Test-and-Aggregate Algorithms on Hadoop
Implementing Generate-Test-and-Aggregate Algorithms on HadoopImplementing Generate-Test-and-Aggregate Algorithms on Hadoop
Implementing Generate-Test-and-Aggregate Algorithms on HadoopYu Liu
 
Glocalized Weisfeiler-Lehman Graph Kernels: Global-Local Feature Maps of Graphs
Glocalized Weisfeiler-Lehman Graph Kernels: Global-Local Feature Maps of Graphs Glocalized Weisfeiler-Lehman Graph Kernels: Global-Local Feature Maps of Graphs
Glocalized Weisfeiler-Lehman Graph Kernels: Global-Local Feature Maps of Graphs Christopher Morris
 
Recent Advances in Kernel-Based Graph Classification
Recent Advances in Kernel-Based Graph ClassificationRecent Advances in Kernel-Based Graph Classification
Recent Advances in Kernel-Based Graph ClassificationChristopher Morris
 
Lect 03 - first portion
Lect 03 - first portionLect 03 - first portion
Lect 03 - first portionMoe Moe Myint
 

What's hot (20)

Optimization for Deep Learning
Optimization for Deep LearningOptimization for Deep Learning
Optimization for Deep Learning
 
Lec-08 Feature Aggregation II: Fisher Vector, AKULA and Super Vector
Lec-08 Feature Aggregation II: Fisher Vector, AKULA and Super VectorLec-08 Feature Aggregation II: Fisher Vector, AKULA and Super Vector
Lec-08 Feature Aggregation II: Fisher Vector, AKULA and Super Vector
 
ICML2012読み会 Scaling Up Coordinate Descent Algorithms for Large L1 regularizat...
ICML2012読み会 Scaling Up Coordinate Descent Algorithms for Large L1 regularizat...ICML2012読み会 Scaling Up Coordinate Descent Algorithms for Large L1 regularizat...
ICML2012読み会 Scaling Up Coordinate Descent Algorithms for Large L1 regularizat...
 
Graph Convolutional Neural Networks
Graph Convolutional Neural Networks Graph Convolutional Neural Networks
Graph Convolutional Neural Networks
 
Gradient Estimation Using Stochastic Computation Graphs
Gradient Estimation Using Stochastic Computation GraphsGradient Estimation Using Stochastic Computation Graphs
Gradient Estimation Using Stochastic Computation Graphs
 
CLIM Program: Remote Sensing Workshop, Optimization for Distributed Data Syst...
CLIM Program: Remote Sensing Workshop, Optimization for Distributed Data Syst...CLIM Program: Remote Sensing Workshop, Optimization for Distributed Data Syst...
CLIM Program: Remote Sensing Workshop, Optimization for Distributed Data Syst...
 
SCALABLE PATTERN MATCHING OVER COMPRESSED GRAPHS VIA DE-DENSIFICATION
SCALABLE PATTERN MATCHING OVER COMPRESSED GRAPHS VIA DE-DENSIFICATIONSCALABLE PATTERN MATCHING OVER COMPRESSED GRAPHS VIA DE-DENSIFICATION
SCALABLE PATTERN MATCHING OVER COMPRESSED GRAPHS VIA DE-DENSIFICATION
 
gSpan algorithm
gSpan algorithmgSpan algorithm
gSpan algorithm
 
Tech talk ggplot2
Tech talk   ggplot2Tech talk   ggplot2
Tech talk ggplot2
 
Methods of Manifold Learning for Dimension Reduction of Large Data Sets
Methods of Manifold Learning for Dimension Reduction of Large Data SetsMethods of Manifold Learning for Dimension Reduction of Large Data Sets
Methods of Manifold Learning for Dimension Reduction of Large Data Sets
 
Building and road detection from large aerial imagery
Building and road detection from large aerial imageryBuilding and road detection from large aerial imagery
Building and road detection from large aerial imagery
 
Histogram based Enhancement
Histogram based Enhancement Histogram based Enhancement
Histogram based Enhancement
 
Digital Image Processing: Image Enhancement in the Spatial Domain
Digital Image Processing: Image Enhancement in the Spatial DomainDigital Image Processing: Image Enhancement in the Spatial Domain
Digital Image Processing: Image Enhancement in the Spatial Domain
 
CLIM Program: Remote Sensing Workshop, Statistical Emulation with Dimension R...
CLIM Program: Remote Sensing Workshop, Statistical Emulation with Dimension R...CLIM Program: Remote Sensing Workshop, Statistical Emulation with Dimension R...
CLIM Program: Remote Sensing Workshop, Statistical Emulation with Dimension R...
 
Attentive semantic alignment with offset aware correlation kernels
Attentive semantic alignment with offset aware correlation kernelsAttentive semantic alignment with offset aware correlation kernels
Attentive semantic alignment with offset aware correlation kernels
 
Foreground Detection : Combining Background Subspace Learning with Object Smo...
Foreground Detection : Combining Background Subspace Learning with Object Smo...Foreground Detection : Combining Background Subspace Learning with Object Smo...
Foreground Detection : Combining Background Subspace Learning with Object Smo...
 
Implementing Generate-Test-and-Aggregate Algorithms on Hadoop
Implementing Generate-Test-and-Aggregate Algorithms on HadoopImplementing Generate-Test-and-Aggregate Algorithms on Hadoop
Implementing Generate-Test-and-Aggregate Algorithms on Hadoop
 
Glocalized Weisfeiler-Lehman Graph Kernels: Global-Local Feature Maps of Graphs
Glocalized Weisfeiler-Lehman Graph Kernels: Global-Local Feature Maps of Graphs Glocalized Weisfeiler-Lehman Graph Kernels: Global-Local Feature Maps of Graphs
Glocalized Weisfeiler-Lehman Graph Kernels: Global-Local Feature Maps of Graphs
 
Recent Advances in Kernel-Based Graph Classification
Recent Advances in Kernel-Based Graph ClassificationRecent Advances in Kernel-Based Graph Classification
Recent Advances in Kernel-Based Graph Classification
 
Lect 03 - first portion
Lect 03 - first portionLect 03 - first portion
Lect 03 - first portion
 

Similar to Distributed Formal Concept Analysis Algorithms Based on an Iterative MapReduce Framework

ADMM-Based Scalable Machine Learning on Apache Spark with Sauptik Dhar and Mo...
ADMM-Based Scalable Machine Learning on Apache Spark with Sauptik Dhar and Mo...ADMM-Based Scalable Machine Learning on Apache Spark with Sauptik Dhar and Mo...
ADMM-Based Scalable Machine Learning on Apache Spark with Sauptik Dhar and Mo...Databricks
 
Introduction of Online Machine Learning Algorithms
Introduction of Online Machine Learning AlgorithmsIntroduction of Online Machine Learning Algorithms
Introduction of Online Machine Learning AlgorithmsShao-Yen Hung
 
Data Profiling in Apache Calcite
Data Profiling in Apache CalciteData Profiling in Apache Calcite
Data Profiling in Apache CalciteJulian Hyde
 
유연하고 확장성 있는 빅데이터 처리
유연하고 확장성 있는 빅데이터 처리유연하고 확장성 있는 빅데이터 처리
유연하고 확장성 있는 빅데이터 처리NAVER D2
 
GeoMesa on Apache Spark SQL with Anthony Fox
GeoMesa on Apache Spark SQL with Anthony FoxGeoMesa on Apache Spark SQL with Anthony Fox
GeoMesa on Apache Spark SQL with Anthony FoxDatabricks
 
Questions On The Equation For Regression
Questions On The Equation For RegressionQuestions On The Equation For Regression
Questions On The Equation For RegressionTiffany Sandoval
 
Large-scale Recommendation Systems on Just a PC
Large-scale Recommendation Systems on Just a PCLarge-scale Recommendation Systems on Just a PC
Large-scale Recommendation Systems on Just a PCAapo Kyrölä
 
Computing Just What You Need: Online Data Analysis and Reduction at Extreme ...
Computing Just What You Need: Online Data Analysis and Reduction  at Extreme ...Computing Just What You Need: Online Data Analysis and Reduction  at Extreme ...
Computing Just What You Need: Online Data Analysis and Reduction at Extreme ...Ian Foster
 
Matrix Factorization In Recommender Systems
Matrix Factorization In Recommender SystemsMatrix Factorization In Recommender Systems
Matrix Factorization In Recommender SystemsYONG ZHENG
 
Scalable Machine Learning: The Role of Stratified Data Sharding
Scalable Machine Learning: The Role of Stratified Data ShardingScalable Machine Learning: The Role of Stratified Data Sharding
Scalable Machine Learning: The Role of Stratified Data Shardinginside-BigData.com
 
A hybrid sine cosine optimization algorithm for solving global optimization p...
A hybrid sine cosine optimization algorithm for solving global optimization p...A hybrid sine cosine optimization algorithm for solving global optimization p...
A hybrid sine cosine optimization algorithm for solving global optimization p...Aboul Ella Hassanien
 
Time-evolving Graph Processing on Commodity Clusters: Spark Summit East talk ...
Time-evolving Graph Processing on Commodity Clusters: Spark Summit East talk ...Time-evolving Graph Processing on Commodity Clusters: Spark Summit East talk ...
Time-evolving Graph Processing on Commodity Clusters: Spark Summit East talk ...Spark Summit
 
Is Revolution R Enterprise Faster than SAS? Benchmarking Results Revealed
Is Revolution R Enterprise Faster than SAS? Benchmarking Results RevealedIs Revolution R Enterprise Faster than SAS? Benchmarking Results Revealed
Is Revolution R Enterprise Faster than SAS? Benchmarking Results RevealedRevolution Analytics
 
Optimization Software and Systems for Operations Research: Best Practices and...
Optimization Software and Systems for Operations Research: Best Practices and...Optimization Software and Systems for Operations Research: Best Practices and...
Optimization Software and Systems for Operations Research: Best Practices and...Bob Fourer
 

Similar to Distributed Formal Concept Analysis Algorithms Based on an Iterative MapReduce Framework (20)

ADMM-Based Scalable Machine Learning on Apache Spark with Sauptik Dhar and Mo...
ADMM-Based Scalable Machine Learning on Apache Spark with Sauptik Dhar and Mo...ADMM-Based Scalable Machine Learning on Apache Spark with Sauptik Dhar and Mo...
ADMM-Based Scalable Machine Learning on Apache Spark with Sauptik Dhar and Mo...
 
Introduction of Online Machine Learning Algorithms
Introduction of Online Machine Learning AlgorithmsIntroduction of Online Machine Learning Algorithms
Introduction of Online Machine Learning Algorithms
 
LalitBDA2015V3
LalitBDA2015V3LalitBDA2015V3
LalitBDA2015V3
 
Se notes
Se notesSe notes
Se notes
 
Pydata talk
Pydata talkPydata talk
Pydata talk
 
Data Profiling in Apache Calcite
Data Profiling in Apache CalciteData Profiling in Apache Calcite
Data Profiling in Apache Calcite
 
1406
14061406
1406
 
유연하고 확장성 있는 빅데이터 처리
유연하고 확장성 있는 빅데이터 처리유연하고 확장성 있는 빅데이터 처리
유연하고 확장성 있는 빅데이터 처리
 
GeoMesa on Apache Spark SQL with Anthony Fox
GeoMesa on Apache Spark SQL with Anthony FoxGeoMesa on Apache Spark SQL with Anthony Fox
GeoMesa on Apache Spark SQL with Anthony Fox
 
Questions On The Equation For Regression
Questions On The Equation For RegressionQuestions On The Equation For Regression
Questions On The Equation For Regression
 
Large-scale Recommendation Systems on Just a PC
Large-scale Recommendation Systems on Just a PCLarge-scale Recommendation Systems on Just a PC
Large-scale Recommendation Systems on Just a PC
 
Computing Just What You Need: Online Data Analysis and Reduction at Extreme ...
Computing Just What You Need: Online Data Analysis and Reduction  at Extreme ...Computing Just What You Need: Online Data Analysis and Reduction  at Extreme ...
Computing Just What You Need: Online Data Analysis and Reduction at Extreme ...
 
6. Implementation
6. Implementation6. Implementation
6. Implementation
 
Matrix Factorization In Recommender Systems
Matrix Factorization In Recommender SystemsMatrix Factorization In Recommender Systems
Matrix Factorization In Recommender Systems
 
Scalable Machine Learning: The Role of Stratified Data Sharding
Scalable Machine Learning: The Role of Stratified Data ShardingScalable Machine Learning: The Role of Stratified Data Sharding
Scalable Machine Learning: The Role of Stratified Data Sharding
 
Ssbse10.ppt
Ssbse10.pptSsbse10.ppt
Ssbse10.ppt
 
A hybrid sine cosine optimization algorithm for solving global optimization p...
A hybrid sine cosine optimization algorithm for solving global optimization p...A hybrid sine cosine optimization algorithm for solving global optimization p...
A hybrid sine cosine optimization algorithm for solving global optimization p...
 
Time-evolving Graph Processing on Commodity Clusters: Spark Summit East talk ...
Time-evolving Graph Processing on Commodity Clusters: Spark Summit East talk ...Time-evolving Graph Processing on Commodity Clusters: Spark Summit East talk ...
Time-evolving Graph Processing on Commodity Clusters: Spark Summit East talk ...
 
Is Revolution R Enterprise Faster than SAS? Benchmarking Results Revealed
Is Revolution R Enterprise Faster than SAS? Benchmarking Results RevealedIs Revolution R Enterprise Faster than SAS? Benchmarking Results Revealed
Is Revolution R Enterprise Faster than SAS? Benchmarking Results Revealed
 
Optimization Software and Systems for Operations Research: Best Practices and...
Optimization Software and Systems for Operations Research: Best Practices and...Optimization Software and Systems for Operations Research: Best Practices and...
Optimization Software and Systems for Operations Research: Best Practices and...
 

Recently uploaded

AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024BookNet Canada
 
Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024Neo4j
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024The Digital Insurer
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitecturePixlogix Infotech
 
Artificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraArtificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraDeakin University
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptxLBM Solutions
 
Bluetooth Controlled Car with Arduino.pdf
Bluetooth Controlled Car with Arduino.pdfBluetooth Controlled Car with Arduino.pdf
Bluetooth Controlled Car with Arduino.pdfngoud9212
 
APIForce Zurich 5 April Automation LPDG
APIForce Zurich 5 April  Automation LPDGAPIForce Zurich 5 April  Automation LPDG
APIForce Zurich 5 April Automation LPDGMarianaLemus7
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsAndrey Dotsenko
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
Snow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter RoadsSnow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter RoadsHyundai Motor Group
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphNeo4j
 

Recently uploaded (20)

AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024

Distributed Formal Concept Analysis Algorithms Based on an Iterative MapReduce Framework

  • 1. Distributed Formal Concept Analysis Algorithms Based on an Iterative MapReduce Framework
      Motivation | Our Solution | Evaluation | Future Work
      Biao Xu, Ruairí de Fréin, Eric Robson, Mícheál Ó Foghlú
      Telecommunications Software & Systems Group, Waterford Institute of Technology
      ICFCA 2012, Leuven, Belgium
      Biao Xu et al. Distributed FCA Algorithms MR
  • 2. Outline
      1 Motivation
          The Basic Problems of Current FCA Algorithms
          Related Work
      2 Our Solution
          Adopt Iterative MapReduce Framework
          FCA Algorithms Adaptation
      3 Evaluation
      4 Future Work
  • 4. Apply FCA algorithms in real world applications
      • Time-consuming on large, high-dimensional data.
        Table: Execution time of traditional FCA algorithms (in seconds).
        Dataset       mushroom    anon-web     census-income
        Size          8124×125    32711×294    103950×133
        NextClosure   618         14671        18230
        CloseByOne    2543        656          7465
      • Hard to deal with distributed databases: data volume, communication, privacy, security.
  • 6. Little work on distributed FCA algorithms
      • A distributed version of CloseByOne based on Hadoop MapReduce: Petr Krajca et al. Distributed Algorithm for Computing Formal Concepts Using Map-Reduce Framework. IDA, 2009.
      • Differences in our work:
          • using an iterative MapReduce runtime, Twister;
          • mining formal concepts in the fewest iterations.
  • 8. Features of MapReduce Framework
      • Divide and conquer strategy: map + reduce function.
        Table: Partitioned datasets S1 and S2 (attributes a b c d e f g; column positions of the crosses are not recoverable)
        S1 or (O_S1, P, I_S1): object 1 × × × ×; object 2 × × × ×; object 3 × × × × ×
        S2 or (O_S2, P, I_S2): object 4 × × ×; object 5 × × × ×; object 6 × × × ×
      • Move the algorithms to the nodes rather than the datasets.
      • Utilize a cluster, not only a single machine.
      • Fault tolerance.
  • 9. MapReduce Data Flow
      Figure: MapReduce data flow. The input splits (Split 0, Split 1, Split 2) each go to a map task; map output is sorted, copied, and merged into the reduce tasks, which write the output parts (Part 0, Part 1).
  • 10. Twister: an Iterative MapReduce Runtime
      • A lightweight MapReduce runtime developed by Indiana University.
      • Efficient support for iterative MapReduce computations.
        Table: Comparison between Twister and Hadoop
        Twister                         Hadoop
        Long-running map/reduce tasks   Single-step map/reduce
        Iteration support               Job chaining
        Static & dynamic data           Static data only
  • 11. Twister Architecture
      Figure: Twister architecture. The master node runs the main program and the Twister driver. Each worker node runs a Twister daemon with a worker pool of cacheable map and reduce tasks and a local disk. Nodes are connected through a pub/sub broker network. Also labeled: data distribution, collection, and partition file creation.
  • 13. Decompose the FCA Algorithm
      • The map phase produces local concepts, F_Y^{S_n}.
      • The reduce phase generates global concepts by merging the local concepts from the mappers.
      • Theorem: Given the closures F_Y^{S_1}, ..., F_Y^{S_n} from n disjoint partitions, F_Y^S = F_Y^{S_1} ∩ ... ∩ F_Y^{S_n}.
      • We prefix our algorithms with MR: MRCbo, MRGanter, MRGanter+.
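The theorem on this slide can be checked on a toy context with a minimal Python sketch. The two partitions below are an assumption: they were chosen to be consistent with the local closures listed in the running example of the talk, and the order of objects inside each partition is hypothetical. The function names `closure` and `merged_closure` are illustrative and do not come from the slides.

```python
from functools import reduce

ATTRS = "abcdefg"
P = frozenset(ATTRS)  # the full attribute set

# Hypothetical partitions: each object is represented by its intent.
# Assumed to match the running example; object order is not known.
S1 = [frozenset("abdf"), frozenset("aceg"), frozenset("bcdfg")]
S2 = [frozenset("bde"), frozenset("bcfg"), frozenset("adef")]


def closure(objects, y):
    """Return the closure of Y: attributes shared by every object containing Y.

    If no object contains Y, the closure is the full attribute set P.
    """
    closed = P
    for intent in objects:
        if y <= intent:
            closed = closed & intent
    return closed


def merged_closure(partitions, y):
    """Map: close Y inside each partition. Reduce: intersect the local closures."""
    local = [closure(s, y) for s in partitions]
    return reduce(frozenset.intersection, local)


# Compare the merged result with the closure computed on the unpartitioned data.
for p in reversed(ATTRS):
    y = frozenset(p)
    f = merged_closure([S1, S2], y)
    assert f == closure(S1 + S2, y)
    print(p, sorted(closure(S1, y)), sorted(closure(S2, y)), sorted(f))
```

The intersection works because an object contains Y in the whole dataset exactly when it contains Y in its own partition, so intersecting the per-partition results intersects over the same objects.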
  • 14. MRGanter Work Flow
      Figure: Static data labeled by S and dynamic data labeled by D. The driver runs while(!isLastClosure(Closure)) runMapReduce(). Each data split 1..n (S) feeds a Map task that calls computeClosure() and emits pairs (atr_1, localClosure_1) ... (atr_j, localClosure_j). Reduce tasks 1..n call merging() and check() and produce the next Closure (D), which is passed back to the mappers.
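The loop in this work flow can be sketched in a single Python process. This is only an illustration under stated assumptions, not the authors' Twister implementation: the partitions are hypothetical (chosen to agree with the running example), `map_task`, `reduce_task`, and `mr_ganter` are made-up names, and the check step is assumed to be the usual feasibility test of Ganter's NextClosure algorithm with attribute order a < b < ... < g.

```python
ATTRS = "abcdefg"  # assumed attribute order a < b < ... < g
P = frozenset(ATTRS)

# Hypothetical partitions (object intents), assumed to match the running example.
PARTITIONS = [
    [frozenset("abdf"), frozenset("aceg"), frozenset("bcdfg")],
    [frozenset("bde"), frozenset("bcfg"), frozenset("adef")],
]


def local_closure(partition, y):
    """Closure of Y within one partition (P if no object contains Y)."""
    closed = P
    for intent in partition:
        if y <= intent:
            closed = closed & intent
    return closed


def map_task(partition, d):
    """Map: for each attribute p_i not in d, close (d ∩ {p_1..p_{i-1}}) ∪ {p_i} locally."""
    out = {}
    for i, p in enumerate(ATTRS):
        if p not in d:
            seed = frozenset(a for a in d if a in ATTRS[:i]) | {p}
            out[p] = local_closure(partition, seed)
    return out


def reduce_task(d, local_results):
    """Reduce: merge local closures by intersection, then check feasibility.

    Attributes are tried from largest to smallest; the first merged closure
    that adds no attribute smaller than p_i becomes the next closure.
    """
    for i in range(len(ATTRS) - 1, -1, -1):
        p = ATTRS[i]
        if p in d:
            continue
        merged = frozenset.intersection(*(r[p] for r in local_results))
        smaller = frozenset(ATTRS[:i])
        if merged & smaller == d & smaller:
            return merged
    return None


def mr_ganter(partitions):
    """Iterate map and reduce until the last closure (the full attribute set)."""
    d = frozenset.intersection(*(local_closure(s, frozenset()) for s in partitions))
    closures = [d]
    while d != P:
        local_results = [map_task(s, d) for s in partitions]  # dynamic data: d
        d = reduce_task(d, local_results)
        closures.append(d)
    return closures


for c in mr_ganter(PARTITIONS):
    print("".join(sorted(c)) or "∅")
```

Each pass through the `while` loop corresponds to one MapReduce iteration that yields exactly one new closure, which is the cost that the fewest-iterations goal mentioned earlier in the talk is aimed at.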
  • 15. Running example of MRGanter and MRGanter+.
      MRGanter:
        d      p_i   F1 from S1     F2 from S2    F
        ∅      g     {c,g}          {b,c,f,g}     {c,g}
               f     {b,d,f}        {f}           {f}
               e     {a,c,e,g}      {d,e}         {e}
               d     {b,d,f}        {d,e}         {d}
               c     {c,g}          {b,c,f,g}     {c,g}
               b     {b,d,f}        {b}           {b}
               a     {a}            {a,d,e,f}     {a}
        {f}    g     {b,c,d,f,g}    {b,c,f,g}     {b,c,f,g}
               e     {a,c,e,g}      {d,e}         {e}
               d     {b,d,f}        {d,e}         {d}
               c     {c,g}          {b,c,f,g}     {c,g}
               b     {b,d,f}        {b}           {b}
               a     {a}            {a,d,e,f}     {a}
        {e}    g     {a,c,e,g}      {a,...,g}     {a,c,e,g}
               f     {a,...,g}      {a,d,e,f}     {a,d,e,f}
               d     {b,d,f}        {d,e}         {d}
               c     {c,g}          {b,c,f,g}     {c,g}
               b     {b,d,f}        {b}           {b}
               a     {a}            {a,d,e,f}     {a}
        {d}    g     {b,c,d,f,g}    {a,...,g}     {b,c,d,f,g}
               f     {b,d,f}        {a,d,e,f}     {d,f}
               e     {a,...,g}      {d,e}         {d,e}
               c     {c,g}          {b,c,f,g}     {c,g}
               b     {b,d,f}        {b}           {b}
               a     {a}            {a,d,e,f}     {a}
      MRGanter+:
        d      p_i   F1 from S1     F2 from S2    F
        ∅      g     {c,g}          {b,c,f,g}     {c,g}
               f     {b,d,f}        {f}           {f}
               e     {a,c,e,g}      {d,e}         {e}
               d     {b,d,f}        {d,e}         {d}
               c     {c,g}          {b,c,f,g}     {c,g}
               b     {b,d,f}        {b}           {b}
               a     {a}            {a,d,e,f}     {a}
        {c,g}  f     {b,c,d,f,g}    {b,c,f,g}     {b,c,f,g}
               e     {a,c,e,g}      {a,...,g}     {a,c,e,g}
               d     {b,c,d,f,g}    {a,...,g}     {b,c,d,f,g}
               b     {b,d,f}        {b}           {b}
               a     {a}            {a,d,e,f}     {a}
        {f}    g     {b,c,d,f,g}    {b,c,f,g}     {b,c,f,g}
               e     {a,c,e,g}      {d,e}         {e}
               d     {b,d,f}        {d,e}         {d}
               c     {c,g}          {b,c,f,g}     {c,g}
               b     {b,d,f}        {b}           {b}
               a     {a}            {a,d,e,f}     {a}
        {e}    g     {a,c,e,g}      {a,...,g}     {a,c,e,g}
               f     {a,...,g}      {a,d,e,f}     {a,d,e,f}
               d     {b,d,f}        {d,e}         {d}
               c     {c,g}          {b,c,f,g}     {c,g}
               b     {b,d,f}        {b}           {b}
               a     {a}            {a,d,e,f}     {a}
  • 16. Efficiency of MR
      Table: Execution time (in seconds). Each distributed algorithm is reported at its fastest, with the number of machines used in round brackets.
      Dataset       mushroom     anon-web     census-income
      Concepts      219010       129009       96531
      Density       17.36%       1.03%        6.7%
      NextClosure   618          14671        18230
      CloseByOne    2543         656          7465
      MRCbo         241 (11)     693 (11)     803 (11)
      MRGanter      20269 (5)    20110 (3)    9654 (11)
      MRGanter+     198 (9)      496 (9)      358 (11)
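To put the table in perspective, the short Python sketch below computes how much faster MRGanter+ is than the better of the two centralized algorithms on each dataset. The timings are copied from the table; the ratio is only a rough indicator, since it assumes the centralized runs used a single machine while MRGanter+ used 9 or 11. The name `speedup_over_best_serial` is illustrative.

```python
# Execution times in seconds, copied from the efficiency table.
serial = {
    "mushroom": {"NextClosure": 618, "CloseByOne": 2543},
    "anon-web": {"NextClosure": 14671, "CloseByOne": 656},
    "census-income": {"NextClosure": 18230, "CloseByOne": 7465},
}
mrganter_plus = {"mushroom": 198, "anon-web": 496, "census-income": 358}


def speedup_over_best_serial(dataset):
    """Ratio of the fastest centralized time to the MRGanter+ time."""
    best = min(serial[dataset].values())
    return best / mrganter_plus[dataset]


speedups = {name: round(speedup_over_best_serial(name), 2) for name in serial}
for name, value in speedups.items():
    print(f"{name}: {value}x")
```

On these figures the gain is largest on census-income, the dataset with the most objects, and smallest on the sparse anon-web data.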
  • 17. Scalability of MR (1)
      Figure: Mushroom dataset: comparison of MRGanter+, MRCbo and MRGanter. Axes: Nodes (Count), 0 to 12; CPU Time (Second), 10^2 to 10^5.
      MRGanter+ outperforms MRCbo and MRGanter when dense data is processed.
  • 18. Scalability of MR (2)
      Figure: Anon-web dataset: comparison of MRGanter+, MRCbo and MRGanter. Axes: Nodes (Count), 0 to 12; CPU Time (Second), 10^2 to 10^5.
      MRGanter+ is faster when more than 3 nodes are used.
  • 19. Scalability of MR (3)
      Figure: Census dataset: comparison of MRGanter+, MRCbo and MRGanter. Axes: Nodes (Count), 0 to 12; CPU Time (Second), 10^2 to 10^5.
      MRGanter+ is fastest when a large dataset is processed.
  • 20. Future Work
      • Explore the effect of data distribution between cluster nodes.
      • Examine MR performance with larger dataset sizes.
      • Extend our approach by reducing the size of intermediate data.
  • 21. Thank you. Questions?