SlideShare a Scribd company logo
Background
Motivation and Objective
Design and implementation
Performance test
Conclusion and future work
Implementing Generate-Test-and-Aggregate
Algorithms on Hadoop
Yu Liu1, Sebastian Fischer2, Kento Emoto3, and Zhenjiang Hu4
1The Graduate University for Advanced Studies
2,4National Institute of Informatics
3University of Tokyo
September 28, 2011
Yu Liu1
, Sebastian Fischer2
, Kento Emoto3
, and Zhenjiang Hu4
Implementing Generate-Test-and-Aggregate Algorithms on Hadoo
Background
Motivation and Objective
Design and implementation
Performance test
Conclusion and future work
MapReduce
GTA algorithm
Parallelization of GTA algorithm
MapReduce
Computation in three phases: map, shuffle and reduce
Yu Liu1
, Sebastian Fischer2
, Kento Emoto3
, and Zhenjiang Hu4
Implementing Generate-Test-and-Aggregate Algorithms on Hadoo
Background
Motivation and Objective
Design and implementation
Performance test
Conclusion and future work
MapReduce
GTA algorithm
Parallelization of GTA algorithm
Programming with MapReduce
Programmers need to implement the following classes (Hadoop)
Yu Liu1
, Sebastian Fischer2
, Kento Emoto3
, and Zhenjiang Hu4
Implementing Generate-Test-and-Aggregate Algorithms on Hadoo
Background
Motivation and Objective
Design and implementation
Performance test
Conclusion and future work
MapReduce
GTA algorithm
Parallelization of GTA algorithm
Programming with MapReduce
The main difficulties of MapReduce Programming :
Nontrivial problems are usually difficult to be computed in a
divide-and-conquer fashion
Efficiency of parallel algorithms is difficult to be obtained
Yu Liu1
, Sebastian Fischer2
, Kento Emoto3
, and Zhenjiang Hu4
Implementing Generate-Test-and-Aggregate Algorithms on Hadoo
Background
Motivation and Objective
Design and implementation
Performance test
Conclusion and future work
MapReduce
GTA algorithm
Parallelization of GTA algorithm
Generate Test and Aggregate Algorithm
The Generate-Test-and-Aggregate (GTA for short) algorithm
consists of
generate can generate all possible solution candidates.
test filters the intermediate data.
aggregate computes a summary of valid intermediate data.
Yu Liu1
, Sebastian Fischer2
, Kento Emoto3
, and Zhenjiang Hu4
Implementing Generate-Test-and-Aggregate Algorithms on Hadoo
Background
Motivation and Objective
Design and implementation
Performance test
Conclusion and future work
MapReduce
GTA algorithm
Parallelization of GTA algorithm
Generate Test and Aggregate Algorithm
The Generate-Test-and-Aggregate (GTA for short) algorithm
consists of
generate can generate all possible solution candidates.
test filters the intermediate data.
aggregate computes a summary of valid intermediate data.
GTA is a very useful and common strategy for a large class of
problems
Yu Liu1
, Sebastian Fischer2
, Kento Emoto3
, and Zhenjiang Hu4
Implementing Generate-Test-and-Aggregate Algorithms on Hadoo
Background
Motivation and Objective
Design and implementation
Performance test
Conclusion and future work
MapReduce
GTA algorithm
Parallelization of GTA algorithm
An Example: Knapsack Problem
Fill a knapsack with items, each of certain value and weight, such that
the total value of packed items is maximal while adhering to a weight
restriction of the knapsack.
picture from Wikipedia
Yu Liu1
, Sebastian Fischer2
, Kento Emoto3
, and Zhenjiang Hu4
Implementing Generate-Test-and-Aggregate Algorithms on Hadoo
Background
Motivation and Objective
Design and implementation
Performance test
Conclusion and future work
MapReduce
GTA algorithm
Parallelization of GTA algorithm
An Example: Knapsack Problem
A knapsack program (GTA algorithm):
knapsack = maxvalue ◦ filter ◦ sublists
Yu Liu1
, Sebastian Fischer2
, Kento Emoto3
, and Zhenjiang Hu4
Implementing Generate-Test-and-Aggregate Algorithms on Hadoo
Background
Motivation and Objective
Design and implementation
Performance test
Conclusion and future work
MapReduce
GTA algorithm
Parallelization of GTA algorithm
An Example: Knapsack Problem
A knapsack program (GTA algorithm):
knapsack = maxvalue ◦ filter ◦ sublists
E.g, there are 3 items: (1kg, $1), (1kg, $2), (2kg, $2)
sublists [(1kg, $1), (1kg, $2), (2kg, $2)]
= [ ], [(1kg, $1)], [(1kg, $1), (1kg, $2)], [(1kg, $1), (1kg, $2), (2kg, $2)],
[(1kg, $1), (2kg, $2)], [(1kg, $2)], [(1kg, $2), (2kg, $2)], [(2kg, $2)]
Yu Liu1
, Sebastian Fischer2
, Kento Emoto3
, and Zhenjiang Hu4
Implementing Generate-Test-and-Aggregate Algorithms on Hadoo
Background
Motivation and Objective
Design and implementation
Performance test
Conclusion and future work
MapReduce
GTA algorithm
Parallelization of GTA algorithm
An Example: Knapsack Problem
A knapsack program (GTA algorithm):
knapsack = maxvalue ◦ filter ◦ sublists
Spouse the capacity of knapsack is 2 kg
filter [ ], [(1kg, $1)], [(1kg, $1), (1kg, $2)], [(1kg, $1), (1kg, $2), (2kg, $2)],
[(1kg, $1), (2kg, $2)], [(1kg, $2)], [(1kg, $2), (2kg, $2)], [(2kg, $2)]
= [ ], [(1kg, $1)], [(1kg, $1), (1kg, $2)], [(2kg, $2)], [(1kg, $2)]
Yu Liu1
, Sebastian Fischer2
, Kento Emoto3
, and Zhenjiang Hu4
Implementing Generate-Test-and-Aggregate Algorithms on Hadoo
Background
Motivation and Objective
Design and implementation
Performance test
Conclusion and future work
MapReduce
GTA algorithm
Parallelization of GTA algorithm
An Example: Knapsack Problem
A knapsack program (GTA algorithm):
knapsack = maxvalue ◦ filter ◦ sublists
maxvalue [ ], [(1kg, $1)], [(1kg, $1), (1kg, $2)], [(2kg, $2)], [(1kg, $2)]
= $3
Yu Liu1
, Sebastian Fischer2
, Kento Emoto3
, and Zhenjiang Hu4
Implementing Generate-Test-and-Aggregate Algorithms on Hadoo
Background
Motivation and Objective
Design and implementation
Performance test
Conclusion and future work
MapReduce
GTA algorithm
Parallelization of GTA algorithm
An Example: Knapsack Problem
A knapsack program (GTA algorithm):
knapsack = maxvalue ◦ filter ◦ sublists
This program is simple but inefficient because it generates
exponential intermediate data (2n).
Yu Liu1
, Sebastian Fischer2
, Kento Emoto3
, and Zhenjiang Hu4
Implementing Generate-Test-and-Aggregate Algorithms on Hadoo
Background
Motivation and Objective
Design and implementation
Performance test
Conclusion and future work
MapReduce
GTA algorithm
Parallelization of GTA algorithm
Theorems of Gernerating Efficient Parallel GTA Programs
Efficient parallel programs can be derived from users’
naive but correct programs in terms of a generate, a test, and an
aggregate functions [Emoto et. al., 2011]
aggregate ◦ test ◦ generate ⇒ list homomorphism
List homomorphisms is a class of recursive functions which match very well
with the divide-and-conquer paradigm [Bird, 87; Cole, 95].
Yu Liu1
, Sebastian Fischer2
, Kento Emoto3
, and Zhenjiang Hu4
Implementing Generate-Test-and-Aggregate Algorithms on Hadoo
Background
Motivation and Objective
Design and implementation
Performance test
Conclusion and future work
MapReduce
GTA algorithm
Parallelization of GTA algorithm
The Emoto’s theorem is under the following assumptions:
aggregate is a semiring homomorphism.
test is a list homomorphism.
generate is a polymorphism over semiring structures.
Yu Liu1
, Sebastian Fischer2
, Kento Emoto3
, and Zhenjiang Hu4
Implementing Generate-Test-and-Aggregate Algorithms on Hadoo
Background
Motivation and Objective
Design and implementation
Performance test
Conclusion and future work
Motivation and Objective
The Emoto’s fusion theorem shows us a possible way to
systematically implement efficient parallel programs with GTA
algorithm
Yu Liu1
, Sebastian Fischer2
, Kento Emoto3
, and Zhenjiang Hu4
Implementing Generate-Test-and-Aggregate Algorithms on Hadoo
Background
Motivation and Objective
Design and implementation
Performance test
Conclusion and future work
Motivation and Objective
The Emoto’s fusion theorem shows us a possible way to
systematically implement efficient parallel programs with GTA
algorithm
We need to evaluate this approach by
implementing a practical library, which should
have easy-to-use programming interface help users design
GTA algorithms
be able to generate efficient parallel programs on MapReduce
(Hadoop)
Yu Liu1
, Sebastian Fischer2
, Kento Emoto3
, and Zhenjiang Hu4
Implementing Generate-Test-and-Aggregate Algorithms on Hadoo
Background
Motivation and Objective
Design and implementation
Performance test
Conclusion and future work
System Overview
Yu Liu1
, Sebastian Fischer2
, Kento Emoto3
, and Zhenjiang Hu4
Implementing Generate-Test-and-Aggregate Algorithms on Hadoo
Implementation on Hadoop
We implement the following classes:
Background
Motivation and Objective
Design and implementation
Performance test
Conclusion and future work
Implementation on Hadoop
MapReducer is an Interface of list homomorphism
h[ ] = id⊕
h[a] = f a
h(x ++ y) = h x ⊕ h y
1 public interface MapReducer<Elem , Val , Res> {
2 public Val identity () ;
3 public Val element ( Elem elem ) ;
4 public Val combine ( Val left , Val right ) ;
5 public Res postprocess ( Val val ) ;
6 }
Yu Liu1
, Sebastian Fischer2
, Kento Emoto3
, and Zhenjiang Hu4
Implementing Generate-Test-and-Aggregate Algorithms on Hadoo
Background
Motivation and Objective
Design and implementation
Performance test
Conclusion and future work
Implementation on Hadoop
MapReducer is an Interface of list homomorphism
Aggregator defines a semiring homomorphism
(A, ⊕, ⊗) → (S, ⊕ , ⊗ )
1 public interface Aggregator<A ,S> {
2 public S zero () ;
3 public S one () ;
4 public S singleton ( A a ) ;
5 public S plus ( S left , S right ) ;
6 public S times ( S left , S right ) ;
7 }
Yu Liu1
, Sebastian Fischer2
, Kento Emoto3
, and Zhenjiang Hu4
Implementing Generate-Test-and-Aggregate Algorithms on Hadoo
Background
Motivation and Objective
Design and implementation
Performance test
Conclusion and future work
Implementation on Hadoop
MapReducer is an Interface of list homomorphism
Aggregator defines a semiring homomorphism
Test is almost list homomorphism, it inherits MapReducer
1 public interface Test<Elem , Key> extends MapReducer<Elem , ←
Key , Boolean> {}
Yu Liu1
, Sebastian Fischer2
, Kento Emoto3
, and Zhenjiang Hu4
Implementing Generate-Test-and-Aggregate Algorithms on Hadoo
Background
Motivation and Objective
Design and implementation
Performance test
Conclusion and future work
Implementation on Hadoop
MapReducer is an Interface of list homomorphism
Aggregator defines a semiring homomorphism
Test inherits MapReducer
Generator implements a MapReducer
polymorphic over semiring: Constructor
filter embedding: embed function return a new generator
1 public abstract class Generator<Elem , Single , Val , Res>
2 implements MapReducer<Elem , Val , Res> {
3 //The c o n t r a c t o r takes an i n s t a n c e of Aggregator
4 public Generator ( Aggregator< Single , Val> aggregator ) { . . . }
5
6 // take an i n s t a n c e of Test and r e t u r n a new i n s t a n c e of Generator
7 public <Key> Generator<Elem , Single , WritableMap<Key , Val>,Res>
8 embed ( final Test<Single , Key> test ) {
9 final Generator<Elem , Single , Val , Res> base = this ;
10 return new Generator<Elem , Single , WritableMap<Key , Val>,Res>
11 ( new Aggregator<Single , WritableMap<Key , Val>>(){ . . . }
12 }
13 public Val process ( List<Elem> list ) { . . . }
14 . . .
15 }
Yu Liu1
, Sebastian Fischer2
, Kento Emoto3
, and Zhenjiang Hu4
Implementing Generate-Test-and-Aggregate Algorithms on Hadoo
Background
Motivation and Objective
Design and implementation
Performance test
Conclusion and future work
Implementation on Hadoop
1 Users need to make their own Generator, Test, and Aggregator
by extending/implementing the library provided ones1
2 An instance of Generator will be created at run-time on each
working-node, which is also an efficient list homomorphism
3 The instance list homomorphism can be executed by Hadoop
in parallel
1
Our library provides commonly used Generators and Aggregators.
Yu Liu1
, Sebastian Fischer2
, Kento Emoto3
, and Zhenjiang Hu4
Implementing Generate-Test-and-Aggregate Algorithms on Hadoo
Background
Motivation and Objective
Design and implementation
Performance test
Conclusion and future work
Java Codes
Let’s have a look at the actual implementation of GTA Knapsack...
Yu Liu1
, Sebastian Fischer2
, Kento Emoto3
, and Zhenjiang Hu4
Implementing Generate-Test-and-Aggregate Algorithms on Hadoo
Background
Motivation and Objective
Design and implementation
Performance test
Conclusion and future work
Performance Evaluation
Environment: hardware
We configured clusters with 2, 4, 8, 16, and 32 nodes (virtual
machines). Each computing/data node has one CPU (VM, Xeon
E5530@2.4GHz, 1 core), 3 GB memory.
Test data
102 × 220 (≈ 108) knapsack items (3.2GB)
Each item’s weight is between 0 to 10 and the capacity of the
knapsack is 100.
Yu Liu1
, Sebastian Fischer2
, Kento Emoto3
, and Zhenjiang Hu4
Implementing Generate-Test-and-Aggregate Algorithms on Hadoo
Background
Motivation and Objective
Design and implementation
Performance test
Conclusion and future work
Evaluation on Hadoop
The Knapsack program scales well when increasing nodes of cluster
Yu Liu1
, Sebastian Fischer2
, Kento Emoto3
, and Zhenjiang Hu4
Implementing Generate-Test-and-Aggregate Algorithms on Hadoo
Background
Motivation and Objective
Design and implementation
Performance test
Conclusion and future work
Conclusion
The implementation of GTA library on Hadoop can
hide the technical details of MapReduce(Hadoop)
automatically do parallelization and optimization
generate MapReduce programs which have good scalability
make coding, testing and code-reusing much simpler
Yu Liu1
, Sebastian Fischer2
, Kento Emoto3
, and Zhenjiang Hu4
Implementing Generate-Test-and-Aggregate Algorithms on Hadoo
Background
Motivation and Objective
Design and implementation
Performance test
Conclusion and future work
Future Work
Optimization of current framework to gain better performance
Extension of current framework
Other approaches of systematic parallel programming
Yu Liu1
, Sebastian Fischer2
, Kento Emoto3
, and Zhenjiang Hu4
Implementing Generate-Test-and-Aggregate Algorithms on Hadoo
Background
Motivation and Objective
Design and implementation
Performance test
Conclusion and future work
Thanks
Questions?
The project is hosted on
http://screwdriver.googlecode.com
Yu Liu1
, Sebastian Fischer2
, Kento Emoto3
, and Zhenjiang Hu4
Implementing Generate-Test-and-Aggregate Algorithms on Hadoo
Background
Motivation and Objective
Design and implementation
Performance test
Conclusion and future work
Appendix: The Computation on Semiring
Definition (Semiring)
Given a set S and two binary operations ⊕ and ⊗, the triple (S, ⊕, ⊗) is called a
semiring if and only if
(S, ⊕) is a commutative monoid with identity element id⊕
(S, ⊗) is a monoid with identity element id⊗
⊗ is associative and distributes over ⊕
id⊕ is a zero of ⊗: id⊕ ⊗ a = a ⊗ id⊕ = id⊕
(Int, +, ×) is a semiring, (PositiveInt, +, max) is another semiring
Definition (Semiring homomorphism)
Given two semirings (S, ⊕, ⊗) and (S , ⊕ , ⊗ ), a function hom : S → S is a semiring
homomorphism from (S, ⊕, ⊗) to (S , ⊕ , ⊗ ), iff it is a monoid homomorphism from
(S, ⊕) to (S , ⊕ ) and also a monoid homomorphism from (S, ⊗) to (S , ⊗ ).
Yu Liu1
, Sebastian Fischer2
, Kento Emoto3
, and Zhenjiang Hu4
Implementing Generate-Test-and-Aggregate Algorithms on Hadoo
Background
Motivation and Objective
Design and implementation
Performance test
Conclusion and future work
Theorem (Filter-Embedding Fusion)
Given a set A, a finite monoid (M, ), a monoid homomorphism hom from ([A], ++ )
to (M, ), a semiring (S, ⊕, ⊗), a semiring homomorphism aggregate from
( [A] , ×++ ) to (S, ⊕, ⊗), a function ok : M → Bool and a polymorphic semiring
generator generate, the following equation holds:
aggregate ◦ filter(ok ◦ hom)
◦ generate ,x++ (λx → [x] )
= postprocessM ok
◦ generate⊕M ,⊗M
(λx → aggregateM [x] )
The result of fusion is an efficient algorithm in form of a list
homomorphism.
Yu Liu1
, Sebastian Fischer2
, Kento Emoto3
, and Zhenjiang Hu4
Implementing Generate-Test-and-Aggregate Algorithms on Hadoo
Background
Motivation and Objective
Design and implementation
Performance test
Conclusion and future work
List Homomorphism
List Homomorphism [Bird, 87; Cole,95] is a class of recursive
functions.
Definition of List Homomorphism
If there is an associative operator , such that for any list x and
list y
h (x ++ y) = h(x) h(y).
Where ++ is the list concatenation and h [a] = f a, h(x) id = h(x), id is an identity element of .
Yu Liu1
, Sebastian Fischer2
, Kento Emoto3
, and Zhenjiang Hu4
Implementing Generate-Test-and-Aggregate Algorithms on Hadoo
Background
Motivation and Objective
Design and implementation
Performance test
Conclusion and future work
List Homomorphism
List Homomorphism [Bird, 87; Cole,95] is a class of recursive
functions.
Definition of List Homomorphism
If there is an associative operator , such that for any list x and
list y
h (x ++ y) = h(x) h(y).
Where ++ is the list concatenation and h [a] = f a, h(x) id = h(x), id is an identity element of .
Instance of a list homomorphism
sum [a] = a
sum (x ++ y) = sum x + sum y.
Yu Liu1
, Sebastian Fischer2
, Kento Emoto3
, and Zhenjiang Hu4
Implementing Generate-Test-and-Aggregate Algorithms on Hadoo
Background
Motivation and Objective
Design and implementation
Performance test
Conclusion and future work
List Homomorphism
List Homomorphism [Bird, 87; Cole,95] is a class of recursive
functions.
Definition of List Homomorphism
If there is an associative operator , such that for any list x and
list y
h (x ++ y) = h(x) h(y).
Where ++ is the list concatenation and h [a] = f a, h(x) id = h(x), id is an identity element of .
A list homomorphism can be automatically parallelized by
MapReduce [Yu et. al., EuroPar11].
Yu Liu1
, Sebastian Fischer2
, Kento Emoto3
, and Zhenjiang Hu4
Implementing Generate-Test-and-Aggregate Algorithms on Hadoo
Background
Motivation and Objective
Design and implementation
Performance test
Conclusion and future work
Evaluation on Hadoop
We test 3.2GB data on {2 , 4, 8, 16, 32} nodes clusters and 32
GB data on {32, 64} nodes clusters
2 nodes 4 nodes 8 nodes 16 nodes 32 nodes 64 nodes
time(sec.) 1602 882 482 317 961 511
speedup – × 1.82 × 1.83 × 1.52 – × 1.88
Yu Liu1
, Sebastian Fischer2
, Kento Emoto3
, and Zhenjiang Hu4
Implementing Generate-Test-and-Aggregate Algorithms on Hadoo

More Related Content

What's hot

Introduction to ggplot2
Introduction to ggplot2Introduction to ggplot2
Introduction to ggplot2
maikroeder
 
Scalable frequent itemset mining using heterogeneous computing par apriori a...
Scalable frequent itemset mining using heterogeneous computing  par apriori a...Scalable frequent itemset mining using heterogeneous computing  par apriori a...
Scalable frequent itemset mining using heterogeneous computing par apriori a...
ijdpsjournal
 
Inventory theory presentation
Inventory theory presentationInventory theory presentation
Inventory theory presentation
kun shin
 
Clustbigfim frequent itemset mining of
Clustbigfim frequent itemset mining ofClustbigfim frequent itemset mining of
Clustbigfim frequent itemset mining of
ijfcstjournal
 
J0945761
J0945761J0945761
J0945761
IOSR Journals
 
Introduction to R Graphics with ggplot2
Introduction to R Graphics with ggplot2Introduction to R Graphics with ggplot2
Introduction to R Graphics with ggplot2
izahn
 
Development of Multi-Level ROM
Development of Multi-Level ROMDevelopment of Multi-Level ROM
Development of Multi-Level ROM
Mohammad
 
Daa unit 5
Daa unit 5Daa unit 5
Daa unit 5
Abhimanyu Mishra
 
A PREFIXED-ITEMSET-BASED IMPROVEMENT FOR APRIORI ALGORITHM
A PREFIXED-ITEMSET-BASED IMPROVEMENT FOR APRIORI ALGORITHMA PREFIXED-ITEMSET-BASED IMPROVEMENT FOR APRIORI ALGORITHM
A PREFIXED-ITEMSET-BASED IMPROVEMENT FOR APRIORI ALGORITHM
csandit
 
Optimization for Deep Learning
Optimization for Deep LearningOptimization for Deep Learning
Optimization for Deep Learning
Sebastian Ruder
 
Distributed Formal Concept Analysis Algorithms Based on an Iterative MapReduc...
Distributed Formal Concept Analysis Algorithms Based on an Iterative MapReduc...Distributed Formal Concept Analysis Algorithms Based on an Iterative MapReduc...
Distributed Formal Concept Analysis Algorithms Based on an Iterative MapReduc...
Ruairi de Frein
 
Dual-time Modeling and Forecasting in Consumer Banking (2016)
Dual-time Modeling and Forecasting in Consumer Banking (2016)Dual-time Modeling and Forecasting in Consumer Banking (2016)
Dual-time Modeling and Forecasting in Consumer Banking (2016)
Aijun Zhang
 
40120130405025
4012013040502540120130405025
40120130405025
IAEME Publication
 
ANSSummer2015
ANSSummer2015ANSSummer2015
ANSSummer2015
Mohammad Abdo
 
Scalable trust-region method for deep reinforcement learning using Kronecker-...
Scalable trust-region method for deep reinforcement learning using Kronecker-...Scalable trust-region method for deep reinforcement learning using Kronecker-...
Scalable trust-region method for deep reinforcement learning using Kronecker-...
Willy Marroquin (WillyDevNET)
 
CLIM Program: Remote Sensing Workshop, Optimization for Distributed Data Syst...
CLIM Program: Remote Sensing Workshop, Optimization for Distributed Data Syst...CLIM Program: Remote Sensing Workshop, Optimization for Distributed Data Syst...
CLIM Program: Remote Sensing Workshop, Optimization for Distributed Data Syst...
The Statistical and Applied Mathematical Sciences Institute
 
GDRR Opening Workshop - Modeling Approaches for High-Frequency Financial Time...
GDRR Opening Workshop - Modeling Approaches for High-Frequency Financial Time...GDRR Opening Workshop - Modeling Approaches for High-Frequency Financial Time...
GDRR Opening Workshop - Modeling Approaches for High-Frequency Financial Time...
The Statistical and Applied Mathematical Sciences Institute
 
Machine learning Algorithms with a Sagemaker demo
Machine learning Algorithms with a Sagemaker demoMachine learning Algorithms with a Sagemaker demo
Machine learning Algorithms with a Sagemaker demo
Hridyesh Bisht
 
THE NEW HYBRID COAW METHOD FOR SOLVING MULTI-OBJECTIVE PROBLEMS
THE NEW HYBRID COAW METHOD FOR SOLVING MULTI-OBJECTIVE PROBLEMSTHE NEW HYBRID COAW METHOD FOR SOLVING MULTI-OBJECTIVE PROBLEMS
THE NEW HYBRID COAW METHOD FOR SOLVING MULTI-OBJECTIVE PROBLEMS
ijfcstjournal
 
Lec5 advanced-policy-gradient-methods
Lec5 advanced-policy-gradient-methodsLec5 advanced-policy-gradient-methods
Lec5 advanced-policy-gradient-methods
Ronald Teo
 

What's hot (20)

Introduction to ggplot2
Introduction to ggplot2Introduction to ggplot2
Introduction to ggplot2
 
Scalable frequent itemset mining using heterogeneous computing par apriori a...
Scalable frequent itemset mining using heterogeneous computing  par apriori a...Scalable frequent itemset mining using heterogeneous computing  par apriori a...
Scalable frequent itemset mining using heterogeneous computing par apriori a...
 
Inventory theory presentation
Inventory theory presentationInventory theory presentation
Inventory theory presentation
 
Clustbigfim frequent itemset mining of
Clustbigfim frequent itemset mining ofClustbigfim frequent itemset mining of
Clustbigfim frequent itemset mining of
 
J0945761
J0945761J0945761
J0945761
 
Introduction to R Graphics with ggplot2
Introduction to R Graphics with ggplot2Introduction to R Graphics with ggplot2
Introduction to R Graphics with ggplot2
 
Development of Multi-Level ROM
Development of Multi-Level ROMDevelopment of Multi-Level ROM
Development of Multi-Level ROM
 
Daa unit 5
Daa unit 5Daa unit 5
Daa unit 5
 
A PREFIXED-ITEMSET-BASED IMPROVEMENT FOR APRIORI ALGORITHM
A PREFIXED-ITEMSET-BASED IMPROVEMENT FOR APRIORI ALGORITHMA PREFIXED-ITEMSET-BASED IMPROVEMENT FOR APRIORI ALGORITHM
A PREFIXED-ITEMSET-BASED IMPROVEMENT FOR APRIORI ALGORITHM
 
Optimization for Deep Learning
Optimization for Deep LearningOptimization for Deep Learning
Optimization for Deep Learning
 
Distributed Formal Concept Analysis Algorithms Based on an Iterative MapReduc...
Distributed Formal Concept Analysis Algorithms Based on an Iterative MapReduc...Distributed Formal Concept Analysis Algorithms Based on an Iterative MapReduc...
Distributed Formal Concept Analysis Algorithms Based on an Iterative MapReduc...
 
Dual-time Modeling and Forecasting in Consumer Banking (2016)
Dual-time Modeling and Forecasting in Consumer Banking (2016)Dual-time Modeling and Forecasting in Consumer Banking (2016)
Dual-time Modeling and Forecasting in Consumer Banking (2016)
 
40120130405025
4012013040502540120130405025
40120130405025
 
ANSSummer2015
ANSSummer2015ANSSummer2015
ANSSummer2015
 
Scalable trust-region method for deep reinforcement learning using Kronecker-...
Scalable trust-region method for deep reinforcement learning using Kronecker-...Scalable trust-region method for deep reinforcement learning using Kronecker-...
Scalable trust-region method for deep reinforcement learning using Kronecker-...
 
CLIM Program: Remote Sensing Workshop, Optimization for Distributed Data Syst...
CLIM Program: Remote Sensing Workshop, Optimization for Distributed Data Syst...CLIM Program: Remote Sensing Workshop, Optimization for Distributed Data Syst...
CLIM Program: Remote Sensing Workshop, Optimization for Distributed Data Syst...
 
GDRR Opening Workshop - Modeling Approaches for High-Frequency Financial Time...
GDRR Opening Workshop - Modeling Approaches for High-Frequency Financial Time...GDRR Opening Workshop - Modeling Approaches for High-Frequency Financial Time...
GDRR Opening Workshop - Modeling Approaches for High-Frequency Financial Time...
 
Machine learning Algorithms with a Sagemaker demo
Machine learning Algorithms with a Sagemaker demoMachine learning Algorithms with a Sagemaker demo
Machine learning Algorithms with a Sagemaker demo
 
THE NEW HYBRID COAW METHOD FOR SOLVING MULTI-OBJECTIVE PROBLEMS
THE NEW HYBRID COAW METHOD FOR SOLVING MULTI-OBJECTIVE PROBLEMSTHE NEW HYBRID COAW METHOD FOR SOLVING MULTI-OBJECTIVE PROBLEMS
THE NEW HYBRID COAW METHOD FOR SOLVING MULTI-OBJECTIVE PROBLEMS
 
Lec5 advanced-policy-gradient-methods
Lec5 advanced-policy-gradient-methodsLec5 advanced-policy-gradient-methods
Lec5 advanced-policy-gradient-methods
 

Viewers also liked

25 de junio_2015_ciro_gomez_leyva
25 de junio_2015_ciro_gomez_leyva25 de junio_2015_ciro_gomez_leyva
25 de junio_2015_ciro_gomez_leyva
Odin De Los Rios
 
Slump cone test
Slump cone testSlump cone test
Slump cone test
Shahabaz Banu
 
Mechanical principles and applications pres
Mechanical principles and applications presMechanical principles and applications pres
Mechanical principles and applications pres
Dr Andrew Kimmance PhD, MSc, BSc, MAPM, CIOB, AHEA
 
Aggregate impact value Calculation And uses
Aggregate impact value Calculation And usesAggregate impact value Calculation And uses
Aggregate impact value Calculation And uses
Shahryar Amin
 
Viscosity test of bitumen
Viscosity test of bitumenViscosity test of bitumen
Viscosity test of bitumen
Amrit pandit
 
Aggregate impact value test
Aggregate impact value testAggregate impact value test
Aggregate impact value test
Adarsh Shukla
 
Indian Highways Road signs
Indian Highways Road signsIndian Highways Road signs
Indian Highways Road signs
debasish
 
Concrete mix design by k r thanki
Concrete mix design by k r thankiConcrete mix design by k r thanki
Concrete mix design by k r thanki
Krunal Thanki
 
Class 1 Moisture Content - Specific Gravity ( Geotechnical Engineering )
Class 1  Moisture Content - Specific Gravity ( Geotechnical Engineering )Class 1  Moisture Content - Specific Gravity ( Geotechnical Engineering )
Class 1 Moisture Content - Specific Gravity ( Geotechnical Engineering )
Hossam Shafiq I
 
Traffic signal 32&35:DCE:FET:IIUI
Traffic signal 32&35:DCE:FET:IIUITraffic signal 32&35:DCE:FET:IIUI
Traffic signal 32&35:DCE:FET:IIUI
civilengineerf14
 
Class 3 (b) Soil Classification ( Geotechnical Engineering )
Class 3 (b)    Soil Classification ( Geotechnical Engineering )Class 3 (b)    Soil Classification ( Geotechnical Engineering )
Class 3 (b) Soil Classification ( Geotechnical Engineering )
Hossam Shafiq I
 
A PowerPoint Presentation On Superstructure
A PowerPoint Presentation On SuperstructureA PowerPoint Presentation On Superstructure
A PowerPoint Presentation On Superstructure
kuntansourav
 
Introduction of system of coplanar forces (engineering mechanics)
Introduction of system of coplanar forces (engineering mechanics)Introduction of system of coplanar forces (engineering mechanics)
Introduction of system of coplanar forces (engineering mechanics)
mashnil Gaddapawar
 
Friction
FrictionFriction
Friction
lavadoods Masta
 
Quality Testing of Drinking Water
Quality Testing of Drinking WaterQuality Testing of Drinking Water
Quality Testing of Drinking Water
bill16388
 
Superstructure construction
Superstructure constructionSuperstructure construction
Superstructure construction
mohdasrimohdhasim
 
Types of forces
Types of forcesTypes of forces
Types of forces
Kunal Yadav
 
Water quality and sampling
Water quality and samplingWater quality and sampling
Water quality and sampling
Jasmine John
 
Ppt Friction
Ppt FrictionPpt Friction
Ppt Friction
ffiala
 
Fresh concrete properties & its standard tests
Fresh concrete properties & its standard testsFresh concrete properties & its standard tests
Fresh concrete properties & its standard tests
MaHmoud AliraQi
 

Viewers also liked (20)

25 de junio_2015_ciro_gomez_leyva
25 de junio_2015_ciro_gomez_leyva25 de junio_2015_ciro_gomez_leyva
25 de junio_2015_ciro_gomez_leyva
 
Slump cone test
Slump cone testSlump cone test
Slump cone test
 
Mechanical principles and applications pres
Mechanical principles and applications presMechanical principles and applications pres
Mechanical principles and applications pres
 
Aggregate impact value Calculation And uses
Aggregate impact value Calculation And usesAggregate impact value Calculation And uses
Aggregate impact value Calculation And uses
 
Viscosity test of bitumen
Viscosity test of bitumenViscosity test of bitumen
Viscosity test of bitumen
 
Aggregate impact value test
Aggregate impact value testAggregate impact value test
Aggregate impact value test
 
Indian Highways Road signs
Indian Highways Road signsIndian Highways Road signs
Indian Highways Road signs
 
Concrete mix design by k r thanki
Concrete mix design by k r thankiConcrete mix design by k r thanki
Concrete mix design by k r thanki
 
Class 1 Moisture Content - Specific Gravity ( Geotechnical Engineering )
Class 1  Moisture Content - Specific Gravity ( Geotechnical Engineering )Class 1  Moisture Content - Specific Gravity ( Geotechnical Engineering )
Class 1 Moisture Content - Specific Gravity ( Geotechnical Engineering )
 
Traffic signal 32&35:DCE:FET:IIUI
Traffic signal 32&35:DCE:FET:IIUITraffic signal 32&35:DCE:FET:IIUI
Traffic signal 32&35:DCE:FET:IIUI
 
Class 3 (b) Soil Classification ( Geotechnical Engineering )
Class 3 (b)    Soil Classification ( Geotechnical Engineering )Class 3 (b)    Soil Classification ( Geotechnical Engineering )
Class 3 (b) Soil Classification ( Geotechnical Engineering )
 
A PowerPoint Presentation On Superstructure
A PowerPoint Presentation On SuperstructureA PowerPoint Presentation On Superstructure
A PowerPoint Presentation On Superstructure
 
Introduction of system of coplanar forces (engineering mechanics)
Introduction of system of coplanar forces (engineering mechanics)Introduction of system of coplanar forces (engineering mechanics)
Introduction of system of coplanar forces (engineering mechanics)
 
Friction
FrictionFriction
Friction
 
Quality Testing of Drinking Water
Quality Testing of Drinking WaterQuality Testing of Drinking Water
Quality Testing of Drinking Water
 
Superstructure construction
Superstructure constructionSuperstructure construction
Superstructure construction
 
Types of forces
Types of forcesTypes of forces
Types of forces
 
Water quality and sampling
Water quality and samplingWater quality and sampling
Water quality and sampling
 
Ppt Friction
Ppt FrictionPpt Friction
Ppt Friction
 
Fresh concrete properties & its standard tests
Fresh concrete properties & its standard testsFresh concrete properties & its standard tests
Fresh concrete properties & its standard tests
 

Similar to Implementing Generate-Test-and-Aggregate Algorithms on Hadoop

Automated Machine Learning via Sequential Uniform Designs
Automated Machine Learning via Sequential Uniform DesignsAutomated Machine Learning via Sequential Uniform Designs
Automated Machine Learning via Sequential Uniform Designs
Aijun Zhang
 
A Generate-Test-Aggregate Parallel Programming Library on Spark
A Generate-Test-Aggregate Parallel Programming Library on SparkA Generate-Test-Aggregate Parallel Programming Library on Spark
A Generate-Test-Aggregate Parallel Programming Library on Spark
Yu Liu
 
IJCSI-2015-12-2-10138 (1) (2)
IJCSI-2015-12-2-10138 (1) (2)IJCSI-2015-12-2-10138 (1) (2)
IJCSI-2015-12-2-10138 (1) (2)
Dr Muhannad Al-Hasan
 
COMPARISON BETWEEN THE GENETIC ALGORITHMS OPTIMIZATION AND PARTICLE SWARM OPT...
COMPARISON BETWEEN THE GENETIC ALGORITHMS OPTIMIZATION AND PARTICLE SWARM OPT...COMPARISON BETWEEN THE GENETIC ALGORITHMS OPTIMIZATION AND PARTICLE SWARM OPT...
COMPARISON BETWEEN THE GENETIC ALGORITHMS OPTIMIZATION AND PARTICLE SWARM OPT...
IAEME Publication
 
Comparison between the genetic algorithms optimization and particle swarm opt...
Comparison between the genetic algorithms optimization and particle swarm opt...Comparison between the genetic algorithms optimization and particle swarm opt...
Comparison between the genetic algorithms optimization and particle swarm opt...
IAEME Publication
 
MULTIPROCESSOR SCHEDULING AND PERFORMANCE EVALUATION USING ELITIST NON DOMINA...
MULTIPROCESSOR SCHEDULING AND PERFORMANCE EVALUATION USING ELITIST NON DOMINA...MULTIPROCESSOR SCHEDULING AND PERFORMANCE EVALUATION USING ELITIST NON DOMINA...
MULTIPROCESSOR SCHEDULING AND PERFORMANCE EVALUATION USING ELITIST NON DOMINA...
ijcsa
 
safe and efficient off policy reinforcement learning
safe and efficient off policy reinforcement learningsafe and efficient off policy reinforcement learning
safe and efficient off policy reinforcement learning
Ryo Iwaki
 
Performance Analysis of Genetic Algorithm as a Stochastic Optimization Tool i...
Performance Analysis of Genetic Algorithm as a Stochastic Optimization Tool i...Performance Analysis of Genetic Algorithm as a Stochastic Optimization Tool i...
Performance Analysis of Genetic Algorithm as a Stochastic Optimization Tool i...
paperpublications3
 
Chap 8. Optimization for training deep models
Chap 8. Optimization for training deep modelsChap 8. Optimization for training deep models
Chap 8. Optimization for training deep models
Young-Geun Choi
 
Boetticher Presentation Promise 2008v2
Boetticher Presentation Promise 2008v2Boetticher Presentation Promise 2008v2
Boetticher Presentation Promise 2008v2
gregoryg
 
Efficient evaluation of flatness error from Coordinate Measurement Data using...
Efficient evaluation of flatness error from Coordinate Measurement Data using...Efficient evaluation of flatness error from Coordinate Measurement Data using...
Efficient evaluation of flatness error from Coordinate Measurement Data using...
Ali Shahed
 
A Tabu Search Heuristic For The Generalized Assignment Problem
A Tabu Search Heuristic For The Generalized Assignment ProblemA Tabu Search Heuristic For The Generalized Assignment Problem
A Tabu Search Heuristic For The Generalized Assignment Problem
Sandra Long
 
The New Hybrid COAW Method for Solving Multi-Objective Problems
The New Hybrid COAW Method for Solving Multi-Objective ProblemsThe New Hybrid COAW Method for Solving Multi-Objective Problems
The New Hybrid COAW Method for Solving Multi-Objective Problems
ijfcstjournal
 
Two-Stage Eagle Strategy with Differential Evolution
Two-Stage Eagle Strategy with Differential EvolutionTwo-Stage Eagle Strategy with Differential Evolution
Two-Stage Eagle Strategy with Differential Evolution
Xin-She Yang
 
Accelerated life testing
Accelerated life testingAccelerated life testing
Accelerated life testing
Steven Li
 
SigOpt_Bayesian_Optimization_Primer
SigOpt_Bayesian_Optimization_PrimerSigOpt_Bayesian_Optimization_Primer
SigOpt_Bayesian_Optimization_Primer
Ian Dewancker
 
International Refereed Journal of Engineering and Science (IRJES)
International Refereed Journal of Engineering and Science (IRJES)International Refereed Journal of Engineering and Science (IRJES)
International Refereed Journal of Engineering and Science (IRJES)
irjes
 
130321 zephyrin soh - on the effect of exploration strategies on maintenanc...
130321   zephyrin soh - on the effect of exploration strategies on maintenanc...130321   zephyrin soh - on the effect of exploration strategies on maintenanc...
130321 zephyrin soh - on the effect of exploration strategies on maintenanc...
Ptidej Team
 
A Homomorphism-based Framework for Systematic Parallel Programming with MapRe...
A Homomorphism-based Framework for Systematic Parallel Programming with MapRe...A Homomorphism-based Framework for Systematic Parallel Programming with MapRe...
A Homomorphism-based Framework for Systematic Parallel Programming with MapRe...
Yu Liu
 
ICSE10b.ppt
ICSE10b.pptICSE10b.ppt
ICSE10b.ppt
Ptidej Team
 

Similar to Implementing Generate-Test-and-Aggregate Algorithms on Hadoop (20)

Automated Machine Learning via Sequential Uniform Designs
Automated Machine Learning via Sequential Uniform DesignsAutomated Machine Learning via Sequential Uniform Designs
Automated Machine Learning via Sequential Uniform Designs
 
A Generate-Test-Aggregate Parallel Programming Library on Spark
A Generate-Test-Aggregate Parallel Programming Library on SparkA Generate-Test-Aggregate Parallel Programming Library on Spark
A Generate-Test-Aggregate Parallel Programming Library on Spark
 
IJCSI-2015-12-2-10138 (1) (2)
IJCSI-2015-12-2-10138 (1) (2)IJCSI-2015-12-2-10138 (1) (2)
IJCSI-2015-12-2-10138 (1) (2)
 
COMPARISON BETWEEN THE GENETIC ALGORITHMS OPTIMIZATION AND PARTICLE SWARM OPT...
COMPARISON BETWEEN THE GENETIC ALGORITHMS OPTIMIZATION AND PARTICLE SWARM OPT...COMPARISON BETWEEN THE GENETIC ALGORITHMS OPTIMIZATION AND PARTICLE SWARM OPT...
COMPARISON BETWEEN THE GENETIC ALGORITHMS OPTIMIZATION AND PARTICLE SWARM OPT...
 
Comparison between the genetic algorithms optimization and particle swarm opt...
Comparison between the genetic algorithms optimization and particle swarm opt...Comparison between the genetic algorithms optimization and particle swarm opt...
Comparison between the genetic algorithms optimization and particle swarm opt...
 
MULTIPROCESSOR SCHEDULING AND PERFORMANCE EVALUATION USING ELITIST NON DOMINA...
MULTIPROCESSOR SCHEDULING AND PERFORMANCE EVALUATION USING ELITIST NON DOMINA...MULTIPROCESSOR SCHEDULING AND PERFORMANCE EVALUATION USING ELITIST NON DOMINA...
MULTIPROCESSOR SCHEDULING AND PERFORMANCE EVALUATION USING ELITIST NON DOMINA...
 
safe and efficient off policy reinforcement learning
safe and efficient off policy reinforcement learningsafe and efficient off policy reinforcement learning
safe and efficient off policy reinforcement learning
 
Performance Analysis of Genetic Algorithm as a Stochastic Optimization Tool i...
Performance Analysis of Genetic Algorithm as a Stochastic Optimization Tool i...Performance Analysis of Genetic Algorithm as a Stochastic Optimization Tool i...
Performance Analysis of Genetic Algorithm as a Stochastic Optimization Tool i...
 
Chap 8. Optimization for training deep models
Chap 8. Optimization for training deep modelsChap 8. Optimization for training deep models
Chap 8. Optimization for training deep models
 
Boetticher Presentation Promise 2008v2
Boetticher Presentation Promise 2008v2Boetticher Presentation Promise 2008v2
Boetticher Presentation Promise 2008v2
 
Efficient evaluation of flatness error from Coordinate Measurement Data using...
Efficient evaluation of flatness error from Coordinate Measurement Data using...Efficient evaluation of flatness error from Coordinate Measurement Data using...
Efficient evaluation of flatness error from Coordinate Measurement Data using...
 
A Tabu Search Heuristic For The Generalized Assignment Problem
A Tabu Search Heuristic For The Generalized Assignment ProblemA Tabu Search Heuristic For The Generalized Assignment Problem
A Tabu Search Heuristic For The Generalized Assignment Problem
 
The New Hybrid COAW Method for Solving Multi-Objective Problems
The New Hybrid COAW Method for Solving Multi-Objective ProblemsThe New Hybrid COAW Method for Solving Multi-Objective Problems
The New Hybrid COAW Method for Solving Multi-Objective Problems
 
Two-Stage Eagle Strategy with Differential Evolution
Two-Stage Eagle Strategy with Differential EvolutionTwo-Stage Eagle Strategy with Differential Evolution
Two-Stage Eagle Strategy with Differential Evolution
 
Accelerated life testing
Accelerated life testingAccelerated life testing
Accelerated life testing
 
SigOpt_Bayesian_Optimization_Primer
SigOpt_Bayesian_Optimization_PrimerSigOpt_Bayesian_Optimization_Primer
SigOpt_Bayesian_Optimization_Primer
 
International Refereed Journal of Engineering and Science (IRJES)
International Refereed Journal of Engineering and Science (IRJES)International Refereed Journal of Engineering and Science (IRJES)
International Refereed Journal of Engineering and Science (IRJES)
 
130321 zephyrin soh - on the effect of exploration strategies on maintenanc...
130321   zephyrin soh - on the effect of exploration strategies on maintenanc...130321   zephyrin soh - on the effect of exploration strategies on maintenanc...
130321 zephyrin soh - on the effect of exploration strategies on maintenanc...
 
A Homomorphism-based Framework for Systematic Parallel Programming with MapRe...
A Homomorphism-based Framework for Systematic Parallel Programming with MapRe...A Homomorphism-based Framework for Systematic Parallel Programming with MapRe...
A Homomorphism-based Framework for Systematic Parallel Programming with MapRe...
 
ICSE10b.ppt
ICSE10b.pptICSE10b.ppt
ICSE10b.ppt
 

More from Yu Liu

A TPC Benchmark of Hive LLAP and Comparison with Presto
A TPC Benchmark of Hive LLAP and Comparison with PrestoA TPC Benchmark of Hive LLAP and Comparison with Presto
A TPC Benchmark of Hive LLAP and Comparison with Presto
Yu Liu
 
Cloud Era Transactional Processing -- Problems, Strategies and Solutions
Cloud Era Transactional Processing -- Problems, Strategies and SolutionsCloud Era Transactional Processing -- Problems, Strategies and Solutions
Cloud Era Transactional Processing -- Problems, Strategies and Solutions
Yu Liu
 
Introduction to NTCIR 2016 MedNLPDoc
Introduction to NTCIR 2016 MedNLPDocIntroduction to NTCIR 2016 MedNLPDoc
Introduction to NTCIR 2016 MedNLPDoc
Yu Liu
 
高性能データ処理プラットフォーム (Talk on July Tech Festa 2015)
高性能データ処理プラットフォーム (Talk on July Tech Festa 2015)高性能データ処理プラットフォーム (Talk on July Tech Festa 2015)
高性能データ処理プラットフォーム (Talk on July Tech Festa 2015)
Yu Liu
 
Survey on Parallel/Distributed Search Engines
Survey on Parallel/Distributed Search EnginesSurvey on Parallel/Distributed Search Engines
Survey on Parallel/Distributed Search Engines
Yu Liu
 
Paper introduction to Combinatorial Optimization on Graphs of Bounded Treewidth
Paper introduction to Combinatorial Optimization on Graphs of Bounded TreewidthPaper introduction to Combinatorial Optimization on Graphs of Bounded Treewidth
Paper introduction to Combinatorial Optimization on Graphs of Bounded Treewidth
Yu Liu
 
Paper Introduction: Combinatorial Model and Bounds for Target Set Selection
Paper Introduction: Combinatorial Model and Bounds for Target Set SelectionPaper Introduction: Combinatorial Model and Bounds for Target Set Selection
Paper Introduction: Combinatorial Model and Bounds for Target Set Selection
Yu Liu
 
An accumulative computation framework on MapReduce ppl2013
An accumulative computation framework on MapReduce ppl2013An accumulative computation framework on MapReduce ppl2013
An accumulative computation framework on MapReduce ppl2013
Yu Liu
 
An Enhanced MapReduce Model (on BSP)
An Enhanced MapReduce Model (on BSP)An Enhanced MapReduce Model (on BSP)
An Enhanced MapReduce Model (on BSP)
Yu Liu
 
An Introduction of Recent Research on MapReduce (2011)
An Introduction of Recent Research on MapReduce (2011)An Introduction of Recent Research on MapReduce (2011)
An Introduction of Recent Research on MapReduce (2011)
Yu Liu
 
Introduction of A Lightweight Stage-Programming Framework
Introduction of A Lightweight Stage-Programming FrameworkIntroduction of A Lightweight Stage-Programming Framework
Introduction of A Lightweight Stage-Programming Framework
Yu Liu
 
Start From A MapReduce Graph Pattern-recognize Algorithm
Start From A MapReduce Graph Pattern-recognize AlgorithmStart From A MapReduce Graph Pattern-recognize Algorithm
Start From A MapReduce Graph Pattern-recognize Algorithm
Yu Liu
 
Introduction of the Design of A High-level Language over MapReduce -- The Pig...
Introduction of the Design of A High-level Language over MapReduce -- The Pig...Introduction of the Design of A High-level Language over MapReduce -- The Pig...
Introduction of the Design of A High-level Language over MapReduce -- The Pig...
Yu Liu
 
On Extending MapReduce - Survey and Experiments
On Extending MapReduce - Survey and ExperimentsOn Extending MapReduce - Survey and Experiments
On Extending MapReduce - Survey and Experiments
Yu Liu
 
Tree representation in map reduce world
Tree representation  in map reduce worldTree representation  in map reduce world
Tree representation in map reduce world
Yu Liu
 
Introduction to Ultra-succinct representation of ordered trees with applications
Introduction to Ultra-succinct representation of ordered trees with applicationsIntroduction to Ultra-succinct representation of ordered trees with applications
Introduction to Ultra-succinct representation of ordered trees with applications
Yu Liu
 
On Implementation of Neuron Network(Back-propagation)
On Implementation of Neuron Network(Back-propagation)On Implementation of Neuron Network(Back-propagation)
On Implementation of Neuron Network(Back-propagation)
Yu Liu
 
ScrewDriver Rebirth: Generate-Test-and-Aggregate Framework on Hadoop
ScrewDriver Rebirth: Generate-Test-and-Aggregate Framework on HadoopScrewDriver Rebirth: Generate-Test-and-Aggregate Framework on Hadoop
ScrewDriver Rebirth: Generate-Test-and-Aggregate Framework on Hadoop
Yu Liu
 
A Homomorphism-based MapReduce Framework for Systematic Parallel Programming
A Homomorphism-based MapReduce Framework for Systematic Parallel ProgrammingA Homomorphism-based MapReduce Framework for Systematic Parallel Programming
A Homomorphism-based MapReduce Framework for Systematic Parallel Programming
Yu Liu
 
Towards Systematic Parallel Programming over MapReduce
Towards Systematic Parallel Programming over MapReduceTowards Systematic Parallel Programming over MapReduce
Towards Systematic Parallel Programming over MapReduce
Yu Liu
 

More from Yu Liu (20)

A TPC Benchmark of Hive LLAP and Comparison with Presto
A TPC Benchmark of Hive LLAP and Comparison with PrestoA TPC Benchmark of Hive LLAP and Comparison with Presto
A TPC Benchmark of Hive LLAP and Comparison with Presto
 
Cloud Era Transactional Processing -- Problems, Strategies and Solutions
Cloud Era Transactional Processing -- Problems, Strategies and SolutionsCloud Era Transactional Processing -- Problems, Strategies and Solutions
Cloud Era Transactional Processing -- Problems, Strategies and Solutions
 
Introduction to NTCIR 2016 MedNLPDoc
Introduction to NTCIR 2016 MedNLPDocIntroduction to NTCIR 2016 MedNLPDoc
Introduction to NTCIR 2016 MedNLPDoc
 
高性能データ処理プラットフォーム (Talk on July Tech Festa 2015)
高性能データ処理プラットフォーム (Talk on July Tech Festa 2015)高性能データ処理プラットフォーム (Talk on July Tech Festa 2015)
高性能データ処理プラットフォーム (Talk on July Tech Festa 2015)
 
Survey on Parallel/Distributed Search Engines
Survey on Parallel/Distributed Search EnginesSurvey on Parallel/Distributed Search Engines
Survey on Parallel/Distributed Search Engines
 
Paper introduction to Combinatorial Optimization on Graphs of Bounded Treewidth
Paper introduction to Combinatorial Optimization on Graphs of Bounded TreewidthPaper introduction to Combinatorial Optimization on Graphs of Bounded Treewidth
Paper introduction to Combinatorial Optimization on Graphs of Bounded Treewidth
 
Paper Introduction: Combinatorial Model and Bounds for Target Set Selection
Paper Introduction: Combinatorial Model and Bounds for Target Set SelectionPaper Introduction: Combinatorial Model and Bounds for Target Set Selection
Paper Introduction: Combinatorial Model and Bounds for Target Set Selection
 
An accumulative computation framework on MapReduce ppl2013
An accumulative computation framework on MapReduce ppl2013An accumulative computation framework on MapReduce ppl2013
An accumulative computation framework on MapReduce ppl2013
 
An Enhanced MapReduce Model (on BSP)
An Enhanced MapReduce Model (on BSP)An Enhanced MapReduce Model (on BSP)
An Enhanced MapReduce Model (on BSP)
 
An Introduction of Recent Research on MapReduce (2011)
An Introduction of Recent Research on MapReduce (2011)An Introduction of Recent Research on MapReduce (2011)
An Introduction of Recent Research on MapReduce (2011)
 
Introduction of A Lightweight Stage-Programming Framework
Introduction of A Lightweight Stage-Programming FrameworkIntroduction of A Lightweight Stage-Programming Framework
Introduction of A Lightweight Stage-Programming Framework
 
Start From A MapReduce Graph Pattern-recognize Algorithm
Start From A MapReduce Graph Pattern-recognize AlgorithmStart From A MapReduce Graph Pattern-recognize Algorithm
Start From A MapReduce Graph Pattern-recognize Algorithm
 
Introduction of the Design of A High-level Language over MapReduce -- The Pig...
Introduction of the Design of A High-level Language over MapReduce -- The Pig...Introduction of the Design of A High-level Language over MapReduce -- The Pig...
Introduction of the Design of A High-level Language over MapReduce -- The Pig...
 
On Extending MapReduce - Survey and Experiments
On Extending MapReduce - Survey and ExperimentsOn Extending MapReduce - Survey and Experiments
On Extending MapReduce - Survey and Experiments
 
Tree representation in map reduce world
Tree representation  in map reduce worldTree representation  in map reduce world
Tree representation in map reduce world
 
Introduction to Ultra-succinct representation of ordered trees with applications
Introduction to Ultra-succinct representation of ordered trees with applicationsIntroduction to Ultra-succinct representation of ordered trees with applications
Introduction to Ultra-succinct representation of ordered trees with applications
 
On Implementation of Neuron Network(Back-propagation)
On Implementation of Neuron Network(Back-propagation)On Implementation of Neuron Network(Back-propagation)
On Implementation of Neuron Network(Back-propagation)
 
ScrewDriver Rebirth: Generate-Test-and-Aggregate Framework on Hadoop
ScrewDriver Rebirth: Generate-Test-and-Aggregate Framework on HadoopScrewDriver Rebirth: Generate-Test-and-Aggregate Framework on Hadoop
ScrewDriver Rebirth: Generate-Test-and-Aggregate Framework on Hadoop
 
A Homomorphism-based MapReduce Framework for Systematic Parallel Programming
A Homomorphism-based MapReduce Framework for Systematic Parallel ProgrammingA Homomorphism-based MapReduce Framework for Systematic Parallel Programming
A Homomorphism-based MapReduce Framework for Systematic Parallel Programming
 
Towards Systematic Parallel Programming over MapReduce
Towards Systematic Parallel Programming over MapReduceTowards Systematic Parallel Programming over MapReduce
Towards Systematic Parallel Programming over MapReduce
 

Recently uploaded

4th Modern Marketing Reckoner by MMA Global India & Group M: 60+ experts on W...
4th Modern Marketing Reckoner by MMA Global India & Group M: 60+ experts on W...4th Modern Marketing Reckoner by MMA Global India & Group M: 60+ experts on W...
4th Modern Marketing Reckoner by MMA Global India & Group M: 60+ experts on W...
Social Samosa
 
Intelligence supported media monitoring in veterinary medicine
Intelligence supported media monitoring in veterinary medicineIntelligence supported media monitoring in veterinary medicine
Intelligence supported media monitoring in veterinary medicine
AndrzejJarynowski
 
一比一原版(Unimelb毕业证书)墨尔本大学毕业证如何办理
一比一原版(Unimelb毕业证书)墨尔本大学毕业证如何办理一比一原版(Unimelb毕业证书)墨尔本大学毕业证如何办理
一比一原版(Unimelb毕业证书)墨尔本大学毕业证如何办理
xclpvhuk
 
Open Source Contributions to Postgres: The Basics POSETTE 2024
Open Source Contributions to Postgres: The Basics POSETTE 2024Open Source Contributions to Postgres: The Basics POSETTE 2024
Open Source Contributions to Postgres: The Basics POSETTE 2024
ElizabethGarrettChri
 
End-to-end pipeline agility - Berlin Buzzwords 2024
End-to-end pipeline agility - Berlin Buzzwords 2024End-to-end pipeline agility - Berlin Buzzwords 2024
End-to-end pipeline agility - Berlin Buzzwords 2024
Lars Albertsson
 
The Ipsos - AI - Monitor 2024 Report.pdf
The  Ipsos - AI - Monitor 2024 Report.pdfThe  Ipsos - AI - Monitor 2024 Report.pdf
The Ipsos - AI - Monitor 2024 Report.pdf
Social Samosa
 
Orchestrating the Future: Navigating Today's Data Workflow Challenges with Ai...
Orchestrating the Future: Navigating Today's Data Workflow Challenges with Ai...Orchestrating the Future: Navigating Today's Data Workflow Challenges with Ai...
Orchestrating the Future: Navigating Today's Data Workflow Challenges with Ai...
Kaxil Naik
 
Build applications with generative AI on Google Cloud
Build applications with generative AI on Google CloudBuild applications with generative AI on Google Cloud
Build applications with generative AI on Google Cloud
Márton Kodok
 
一比一原版(harvard毕业证书)哈佛大学毕业证如何办理
一比一原版(harvard毕业证书)哈佛大学毕业证如何办理一比一原版(harvard毕业证书)哈佛大学毕业证如何办理
一比一原版(harvard毕业证书)哈佛大学毕业证如何办理
taqyea
 
DSSML24_tspann_CodelessGenerativeAIPipelines
DSSML24_tspann_CodelessGenerativeAIPipelinesDSSML24_tspann_CodelessGenerativeAIPipelines
DSSML24_tspann_CodelessGenerativeAIPipelines
Timothy Spann
 
A presentation that explain the Power BI Licensing
A presentation that explain the Power BI LicensingA presentation that explain the Power BI Licensing
A presentation that explain the Power BI Licensing
AlessioFois2
 
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...
Aggregage
 
06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM
06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM
06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM
Timothy Spann
 
一比一原版(UMN文凭证书)明尼苏达大学毕业证如何办理
一比一原版(UMN文凭证书)明尼苏达大学毕业证如何办理一比一原版(UMN文凭证书)明尼苏达大学毕业证如何办理
一比一原版(UMN文凭证书)明尼苏达大学毕业证如何办理
nyfuhyz
 
一比一原版(UCSF文凭证书)旧金山分校毕业证如何办理
一比一原版(UCSF文凭证书)旧金山分校毕业证如何办理一比一原版(UCSF文凭证书)旧金山分校毕业证如何办理
一比一原版(UCSF文凭证书)旧金山分校毕业证如何办理
nuttdpt
 
办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样
办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样
办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样
apvysm8
 
原版一比一多伦多大学毕业证(UofT毕业证书)如何办理
原版一比一多伦多大学毕业证(UofT毕业证书)如何办理原版一比一多伦多大学毕业证(UofT毕业证书)如何办理
原版一比一多伦多大学毕业证(UofT毕业证书)如何办理
mkkikqvo
 
Experts live - Improving user adoption with AI
Experts live - Improving user adoption with AIExperts live - Improving user adoption with AI
Experts live - Improving user adoption with AI
jitskeb
 
一比一原版(UCSB文凭证书)圣芭芭拉分校毕业证如何办理
一比一原版(UCSB文凭证书)圣芭芭拉分校毕业证如何办理一比一原版(UCSB文凭证书)圣芭芭拉分校毕业证如何办理
一比一原版(UCSB文凭证书)圣芭芭拉分校毕业证如何办理
nuttdpt
 
原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样
原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样
原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样
ihavuls
 

Recently uploaded (20)

4th Modern Marketing Reckoner by MMA Global India & Group M: 60+ experts on W...
4th Modern Marketing Reckoner by MMA Global India & Group M: 60+ experts on W...4th Modern Marketing Reckoner by MMA Global India & Group M: 60+ experts on W...
4th Modern Marketing Reckoner by MMA Global India & Group M: 60+ experts on W...
 
Intelligence supported media monitoring in veterinary medicine
Intelligence supported media monitoring in veterinary medicineIntelligence supported media monitoring in veterinary medicine
Intelligence supported media monitoring in veterinary medicine
 
一比一原版(Unimelb毕业证书)墨尔本大学毕业证如何办理
一比一原版(Unimelb毕业证书)墨尔本大学毕业证如何办理一比一原版(Unimelb毕业证书)墨尔本大学毕业证如何办理
一比一原版(Unimelb毕业证书)墨尔本大学毕业证如何办理
 
Open Source Contributions to Postgres: The Basics POSETTE 2024
Open Source Contributions to Postgres: The Basics POSETTE 2024Open Source Contributions to Postgres: The Basics POSETTE 2024
Open Source Contributions to Postgres: The Basics POSETTE 2024
 
End-to-end pipeline agility - Berlin Buzzwords 2024
End-to-end pipeline agility - Berlin Buzzwords 2024End-to-end pipeline agility - Berlin Buzzwords 2024
End-to-end pipeline agility - Berlin Buzzwords 2024
 
The Ipsos - AI - Monitor 2024 Report.pdf
The  Ipsos - AI - Monitor 2024 Report.pdfThe  Ipsos - AI - Monitor 2024 Report.pdf
The Ipsos - AI - Monitor 2024 Report.pdf
 
Orchestrating the Future: Navigating Today's Data Workflow Challenges with Ai...
Orchestrating the Future: Navigating Today's Data Workflow Challenges with Ai...Orchestrating the Future: Navigating Today's Data Workflow Challenges with Ai...
Orchestrating the Future: Navigating Today's Data Workflow Challenges with Ai...
 
Build applications with generative AI on Google Cloud
Build applications with generative AI on Google CloudBuild applications with generative AI on Google Cloud
Build applications with generative AI on Google Cloud
 
一比一原版(harvard毕业证书)哈佛大学毕业证如何办理
一比一原版(harvard毕业证书)哈佛大学毕业证如何办理一比一原版(harvard毕业证书)哈佛大学毕业证如何办理
一比一原版(harvard毕业证书)哈佛大学毕业证如何办理
 
DSSML24_tspann_CodelessGenerativeAIPipelines
DSSML24_tspann_CodelessGenerativeAIPipelinesDSSML24_tspann_CodelessGenerativeAIPipelines
DSSML24_tspann_CodelessGenerativeAIPipelines
 
A presentation that explain the Power BI Licensing
A presentation that explain the Power BI LicensingA presentation that explain the Power BI Licensing
A presentation that explain the Power BI Licensing
 
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...
 
06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM
06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM
06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM
 
一比一原版(UMN文凭证书)明尼苏达大学毕业证如何办理
一比一原版(UMN文凭证书)明尼苏达大学毕业证如何办理一比一原版(UMN文凭证书)明尼苏达大学毕业证如何办理
一比一原版(UMN文凭证书)明尼苏达大学毕业证如何办理
 
一比一原版(UCSF文凭证书)旧金山分校毕业证如何办理
一比一原版(UCSF文凭证书)旧金山分校毕业证如何办理一比一原版(UCSF文凭证书)旧金山分校毕业证如何办理
一比一原版(UCSF文凭证书)旧金山分校毕业证如何办理
 
办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样
办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样
办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样
 
原版一比一多伦多大学毕业证(UofT毕业证书)如何办理
原版一比一多伦多大学毕业证(UofT毕业证书)如何办理原版一比一多伦多大学毕业证(UofT毕业证书)如何办理
原版一比一多伦多大学毕业证(UofT毕业证书)如何办理
 
Experts live - Improving user adoption with AI
Experts live - Improving user adoption with AIExperts live - Improving user adoption with AI
Experts live - Improving user adoption with AI
 
一比一原版(UCSB文凭证书)圣芭芭拉分校毕业证如何办理
一比一原版(UCSB文凭证书)圣芭芭拉分校毕业证如何办理一比一原版(UCSB文凭证书)圣芭芭拉分校毕业证如何办理
一比一原版(UCSB文凭证书)圣芭芭拉分校毕业证如何办理
 
原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样
原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样
原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样
 

Implementing Generate-Test-and-Aggregate Algorithms on Hadoop

  • 1. Background Motivation and Objective Design and implementation Performance test Conclusion and future work Implementing Generate-Test-and-Aggregate Algorithms on Hadoop Yu Liu1, Sebastian Fischer2, Kento Emoto3, and Zhenjiang Hu4 1The Graduate University for Advanced Studies 2,4National Institute of Informatics 3University of Tokyo September 28, 2011 Yu Liu1 , Sebastian Fischer2 , Kento Emoto3 , and Zhenjiang Hu4 Implementing Generate-Test-and-Aggregate Algorithms on Hadoo
  • 2. Background Motivation and Objective Design and implementation Performance test Conclusion and future work MapReduce GTA algorithm Parallelization of GTA algorithm MapReduce Computation in three phases: map, shuffle and reduce Yu Liu1 , Sebastian Fischer2 , Kento Emoto3 , and Zhenjiang Hu4 Implementing Generate-Test-and-Aggregate Algorithms on Hadoo
  • 3. Background Motivation and Objective Design and implementation Performance test Conclusion and future work MapReduce GTA algorithm Parallelization of GTA algorithm Programming with MapReduce Programmers need to implement the following classes (Hadoop) Yu Liu1 , Sebastian Fischer2 , Kento Emoto3 , and Zhenjiang Hu4 Implementing Generate-Test-and-Aggregate Algorithms on Hadoo
  • 4. Background Motivation and Objective Design and implementation Performance test Conclusion and future work MapReduce GTA algorithm Parallelization of GTA algorithm Programming with MapReduce The main difficulties of MapReduce Programming : Nontrivial problems are usually difficult to be computed in a divide-and-conquer fashion Efficiency of parallel algorithms is difficult to be obtained Yu Liu1 , Sebastian Fischer2 , Kento Emoto3 , and Zhenjiang Hu4 Implementing Generate-Test-and-Aggregate Algorithms on Hadoo
  • 5. Background Motivation and Objective Design and implementation Performance test Conclusion and future work MapReduce GTA algorithm Parallelization of GTA algorithm Generate Test and Aggregate Algorithm The Generate-Test-and-Aggregate (GTA for short) algorithm consists of generate can generate all possible solution candidates. test filters the intermediate data. aggregate computes a summary of valid intermediate data. Yu Liu1 , Sebastian Fischer2 , Kento Emoto3 , and Zhenjiang Hu4 Implementing Generate-Test-and-Aggregate Algorithms on Hadoo
  • 6. Background Motivation and Objective Design and implementation Performance test Conclusion and future work MapReduce GTA algorithm Parallelization of GTA algorithm Generate Test and Aggregate Algorithm The Generate-Test-and-Aggregate (GTA for short) algorithm consists of generate can generate all possible solution candidates. test filters the intermediate data. aggregate computes a summary of valid intermediate data. GTA is a very useful and common strategy for a large class of problems Yu Liu1 , Sebastian Fischer2 , Kento Emoto3 , and Zhenjiang Hu4 Implementing Generate-Test-and-Aggregate Algorithms on Hadoo
  • 7. Background Motivation and Objective Design and implementation Performance test Conclusion and future work MapReduce GTA algorithm Parallelization of GTA algorithm An Example: Knapsack Problem Fill a knapsack with items, each of certain value and weight, such that the total value of packed items is maximal while adhering to a weight restriction of the knapsack. picture from Wikipedia Yu Liu1 , Sebastian Fischer2 , Kento Emoto3 , and Zhenjiang Hu4 Implementing Generate-Test-and-Aggregate Algorithms on Hadoo
  • 8. Background Motivation and Objective Design and implementation Performance test Conclusion and future work MapReduce GTA algorithm Parallelization of GTA algorithm An Example: Knapsack Problem A knapsack program (GTA algorithm): knapsack = maxvalue ◦ filter ◦ sublists Yu Liu1 , Sebastian Fischer2 , Kento Emoto3 , and Zhenjiang Hu4 Implementing Generate-Test-and-Aggregate Algorithms on Hadoo
  • 9. Background Motivation and Objective Design and implementation Performance test Conclusion and future work MapReduce GTA algorithm Parallelization of GTA algorithm An Example: Knapsack Problem A knapsack program (GTA algorithm): knapsack = maxvalue ◦ filter ◦ sublists E.g, there are 3 items: (1kg, $1), (1kg, $2), (2kg, $2) sublists [(1kg, $1), (1kg, $2), (2kg, $2)] = [ ], [(1kg, $1)], [(1kg, $1), (1kg, $2)], [(1kg, $1), (1kg, $2), (2kg, $2)], [(1kg, $1), (2kg, $2)], [(1kg, $2)], [(1kg, $2), (2kg, $2)], [(2kg, $2)] Yu Liu1 , Sebastian Fischer2 , Kento Emoto3 , and Zhenjiang Hu4 Implementing Generate-Test-and-Aggregate Algorithms on Hadoo
  • 10. Background Motivation and Objective Design and implementation Performance test Conclusion and future work MapReduce GTA algorithm Parallelization of GTA algorithm An Example: Knapsack Problem A knapsack program (GTA algorithm): knapsack = maxvalue ◦ filter ◦ sublists Spouse the capacity of knapsack is 2 kg filter [ ], [(1kg, $1)], [(1kg, $1), (1kg, $2)], [(1kg, $1), (1kg, $2), (2kg, $2)], [(1kg, $1), (2kg, $2)], [(1kg, $2)], [(1kg, $2), (2kg, $2)], [(2kg, $2)] = [ ], [(1kg, $1)], [(1kg, $1), (1kg, $2)], [(2kg, $2)], [(1kg, $2)] Yu Liu1 , Sebastian Fischer2 , Kento Emoto3 , and Zhenjiang Hu4 Implementing Generate-Test-and-Aggregate Algorithms on Hadoo
  • 11. Background Motivation and Objective Design and implementation Performance test Conclusion and future work MapReduce GTA algorithm Parallelization of GTA algorithm An Example: Knapsack Problem A knapsack program (GTA algorithm): knapsack = maxvalue ◦ filter ◦ sublists maxvalue [ ], [(1kg, $1)], [(1kg, $1), (1kg, $2)], [(2kg, $2)], [(1kg, $2)] = $3 Yu Liu1 , Sebastian Fischer2 , Kento Emoto3 , and Zhenjiang Hu4 Implementing Generate-Test-and-Aggregate Algorithms on Hadoo
  • 12. Background Motivation and Objective Design and implementation Performance test Conclusion and future work MapReduce GTA algorithm Parallelization of GTA algorithm An Example: Knapsack Problem A knapsack program (GTA algorithm): knapsack = maxvalue ◦ filter ◦ sublists This program is simple but inefficient because it generates exponential intermediate data (2n). Yu Liu1 , Sebastian Fischer2 , Kento Emoto3 , and Zhenjiang Hu4 Implementing Generate-Test-and-Aggregate Algorithms on Hadoo
  • 13. Background Motivation and Objective Design and implementation Performance test Conclusion and future work MapReduce GTA algorithm Parallelization of GTA algorithm Theorems of Gernerating Efficient Parallel GTA Programs Efficient parallel programs can be derived from users’ naive but correct programs in terms of a generate, a test, and an aggregate functions [Emoto et. al., 2011] aggregate ◦ test ◦ generate ⇒ list homomorphism List homomorphisms is a class of recursive functions which match very well with the divide-and-conquer paradigm [Bird, 87; Cole, 95]. Yu Liu1 , Sebastian Fischer2 , Kento Emoto3 , and Zhenjiang Hu4 Implementing Generate-Test-and-Aggregate Algorithms on Hadoo
  • 14. Background Motivation and Objective Design and implementation Performance test Conclusion and future work MapReduce GTA algorithm Parallelization of GTA algorithm The Emoto’s theorem is under the following assumptions: aggregate is a semiring homomorphism. test is a list homomorphism. generate is a polymorphism over semiring structures. Yu Liu1 , Sebastian Fischer2 , Kento Emoto3 , and Zhenjiang Hu4 Implementing Generate-Test-and-Aggregate Algorithms on Hadoo
  • 15. Background Motivation and Objective Design and implementation Performance test Conclusion and future work Motivation and Objective The Emoto’s fusion theorem shows us a possible way to systematically implement efficient parallel programs with GTA algorithm Yu Liu1 , Sebastian Fischer2 , Kento Emoto3 , and Zhenjiang Hu4 Implementing Generate-Test-and-Aggregate Algorithms on Hadoo
  • 16. Background Motivation and Objective Design and implementation Performance test Conclusion and future work Motivation and Objective The Emoto’s fusion theorem shows us a possible way to systematically implement efficient parallel programs with GTA algorithm We need to evaluate this approach by implementing a practical library, which should have easy-to-use programming interface help users design GTA algorithms be able to generate efficient parallel programs on MapReduce (Hadoop) Yu Liu1 , Sebastian Fischer2 , Kento Emoto3 , and Zhenjiang Hu4 Implementing Generate-Test-and-Aggregate Algorithms on Hadoo
  • 17. Background Motivation and Objective Design and implementation Performance test Conclusion and future work System Overview Yu Liu1 , Sebastian Fischer2 , Kento Emoto3 , and Zhenjiang Hu4 Implementing Generate-Test-and-Aggregate Algorithms on Hadoo
  • 18. Implementation on Hadoop We implement the following classes:
  • 19. Background Motivation and Objective Design and implementation Performance test Conclusion and future work Implementation on Hadoop MapReducer is an Interface of list homomorphism h[ ] = id⊕ h[a] = f a h(x ++ y) = h x ⊕ h y 1 public interface MapReducer<Elem , Val , Res> { 2 public Val identity () ; 3 public Val element ( Elem elem ) ; 4 public Val combine ( Val left , Val right ) ; 5 public Res postprocess ( Val val ) ; 6 } Yu Liu1 , Sebastian Fischer2 , Kento Emoto3 , and Zhenjiang Hu4 Implementing Generate-Test-and-Aggregate Algorithms on Hadoo
  • 20. Background Motivation and Objective Design and implementation Performance test Conclusion and future work Implementation on Hadoop MapReducer is an Interface of list homomorphism Aggregator defines a semiring homomorphism (A, ⊕, ⊗) → (S, ⊕ , ⊗ ) 1 public interface Aggregator<A ,S> { 2 public S zero () ; 3 public S one () ; 4 public S singleton ( A a ) ; 5 public S plus ( S left , S right ) ; 6 public S times ( S left , S right ) ; 7 } Yu Liu1 , Sebastian Fischer2 , Kento Emoto3 , and Zhenjiang Hu4 Implementing Generate-Test-and-Aggregate Algorithms on Hadoo
  • 21. Background Motivation and Objective Design and implementation Performance test Conclusion and future work Implementation on Hadoop MapReducer is an Interface of list homomorphism Aggregator defines a semiring homomorphism Test is almost list homomorphism, it inherits MapReducer 1 public interface Test<Elem , Key> extends MapReducer<Elem , ← Key , Boolean> {} Yu Liu1 , Sebastian Fischer2 , Kento Emoto3 , and Zhenjiang Hu4 Implementing Generate-Test-and-Aggregate Algorithms on Hadoo
  • 22. Background Motivation and Objective Design and implementation Performance test Conclusion and future work Implementation on Hadoop MapReducer is an Interface of list homomorphism Aggregator defines a semiring homomorphism Test inherits MapReducer Generator implements a MapReducer polymorphic over semiring: Constructor filter embedding: embed function return a new generator 1 public abstract class Generator<Elem , Single , Val , Res> 2 implements MapReducer<Elem , Val , Res> { 3 //The c o n t r a c t o r takes an i n s t a n c e of Aggregator 4 public Generator ( Aggregator< Single , Val> aggregator ) { . . . } 5 6 // take an i n s t a n c e of Test and r e t u r n a new i n s t a n c e of Generator 7 public <Key> Generator<Elem , Single , WritableMap<Key , Val>,Res> 8 embed ( final Test<Single , Key> test ) { 9 final Generator<Elem , Single , Val , Res> base = this ; 10 return new Generator<Elem , Single , WritableMap<Key , Val>,Res> 11 ( new Aggregator<Single , WritableMap<Key , Val>>(){ . . . } 12 } 13 public Val process ( List<Elem> list ) { . . . } 14 . . . 15 } Yu Liu1 , Sebastian Fischer2 , Kento Emoto3 , and Zhenjiang Hu4 Implementing Generate-Test-and-Aggregate Algorithms on Hadoo
  • 23. Background Motivation and Objective Design and implementation Performance test Conclusion and future work Implementation on Hadoop 1 Users need to make their own Generator, Test, and Aggregator by extending/implementing the library provided ones1 2 An instance of Generator will be created at run-time on each working-node, which is also an efficient list homomorphism 3 The instance list homomorphism can be executed by Hadoop in parallel 1 Our library provides commonly used Generators and Aggregators. Yu Liu1 , Sebastian Fischer2 , Kento Emoto3 , and Zhenjiang Hu4 Implementing Generate-Test-and-Aggregate Algorithms on Hadoo
  • 24. Background Motivation and Objective Design and implementation Performance test Conclusion and future work Java Codes Let’s have a look at the actual implementation of GTA Knapsack... Yu Liu1 , Sebastian Fischer2 , Kento Emoto3 , and Zhenjiang Hu4 Implementing Generate-Test-and-Aggregate Algorithms on Hadoo
  • 25. Background Motivation and Objective Design and implementation Performance test Conclusion and future work Performance Evaluation Environment: hardware We configured clusters with 2, 4, 8, 16, and 32 nodes (virtual machines). Each computing/data node has one CPU (VM, Xeon E5530@2.4GHz, 1 core), 3 GB memory. Test data 102 × 220 (≈ 108) knapsack items (3.2GB) Each item’s weight is between 0 to 10 and the capacity of the knapsack is 100. Yu Liu1 , Sebastian Fischer2 , Kento Emoto3 , and Zhenjiang Hu4 Implementing Generate-Test-and-Aggregate Algorithms on Hadoo
  • 26. Background Motivation and Objective Design and implementation Performance test Conclusion and future work Evaluation on Hadoop The Knapsack program scales well when increasing nodes of cluster Yu Liu1 , Sebastian Fischer2 , Kento Emoto3 , and Zhenjiang Hu4 Implementing Generate-Test-and-Aggregate Algorithms on Hadoo
  • 27. Background Motivation and Objective Design and implementation Performance test Conclusion and future work Conclusion The implementation of GTA library on Hadoop can hide the technical details of MapReduce(Hadoop) automatically do parallelization and optimization generate MapReduce programs which have good scalability make coding, testing and code-reusing much simpler Yu Liu1 , Sebastian Fischer2 , Kento Emoto3 , and Zhenjiang Hu4 Implementing Generate-Test-and-Aggregate Algorithms on Hadoo
  • 28. Background Motivation and Objective Design and implementation Performance test Conclusion and future work Future Work Optimization of current framework to gain better performance Extension of current framework Other approaches of systematic parallel programming Yu Liu1 , Sebastian Fischer2 , Kento Emoto3 , and Zhenjiang Hu4 Implementing Generate-Test-and-Aggregate Algorithms on Hadoo
  • 29. Background Motivation and Objective Design and implementation Performance test Conclusion and future work Thanks Questions? The project is hosted on http://screwdriver.googlecode.com Yu Liu1 , Sebastian Fischer2 , Kento Emoto3 , and Zhenjiang Hu4 Implementing Generate-Test-and-Aggregate Algorithms on Hadoo
  • 30. Background Motivation and Objective Design and implementation Performance test Conclusion and future work Appendix: The Computation on Semiring Definition (Semiring) Given a set S and two binary operations ⊕ and ⊗, the triple (S, ⊕, ⊗) is called a semiring if and only if (S, ⊕) is a commutative monoid with identity element id⊕ (S, ⊗) is a monoid with identity element id⊗ ⊗ is associative and distributes over ⊕ id⊕ is a zero of ⊗: id⊕ ⊗ a = a ⊗ id⊕ = id⊕ (Int, +, ×) is a semiring, (PositiveInt, +, max) is another semiring Definition (Semiring homomorphism) Given two semirings (S, ⊕, ⊗) and (S , ⊕ , ⊗ ), a function hom : S → S is a semiring homomorphism from (S, ⊕, ⊗) to (S , ⊕ , ⊗ ), iff it is a monoid homomorphism from (S, ⊕) to (S , ⊕ ) and also a monoid homomorphism from (S, ⊗) to (S , ⊗ ). Yu Liu1 , Sebastian Fischer2 , Kento Emoto3 , and Zhenjiang Hu4 Implementing Generate-Test-and-Aggregate Algorithms on Hadoo
  • 31. Background Motivation and Objective Design and implementation Performance test Conclusion and future work Theorem (Filter-Embedding Fusion) Given a set A, a finite monoid (M, ), a monoid homomorphism hom from ([A], ++ ) to (M, ), a semiring (S, ⊕, ⊗), a semiring homomorphism aggregate from ( [A] , ×++ ) to (S, ⊕, ⊗), a function ok : M → Bool and a polymorphic semiring generator generate, the following equation holds: aggregate ◦ filter(ok ◦ hom) ◦ generate ,x++ (λx → [x] ) = postprocessM ok ◦ generate⊕M ,⊗M (λx → aggregateM [x] ) The result of fusion is an efficient algorithm in form of a list homomorphism. Yu Liu1 , Sebastian Fischer2 , Kento Emoto3 , and Zhenjiang Hu4 Implementing Generate-Test-and-Aggregate Algorithms on Hadoo
  • 32. Background Motivation and Objective Design and implementation Performance test Conclusion and future work List Homomorphism List Homomorphism [Bird, 87; Cole,95] is a class of recursive functions. Definition of List Homomorphism If there is an associative operator , such that for any list x and list y h (x ++ y) = h(x) h(y). Where ++ is the list concatenation and h [a] = f a, h(x) id = h(x), id is an identity element of . Yu Liu1 , Sebastian Fischer2 , Kento Emoto3 , and Zhenjiang Hu4 Implementing Generate-Test-and-Aggregate Algorithms on Hadoo
  • 33. Background Motivation and Objective Design and implementation Performance test Conclusion and future work List Homomorphism List Homomorphism [Bird, 87; Cole,95] is a class of recursive functions. Definition of List Homomorphism If there is an associative operator , such that for any list x and list y h (x ++ y) = h(x) h(y). Where ++ is the list concatenation and h [a] = f a, h(x) id = h(x), id is an identity element of . Instance of a list homomorphism sum [a] = a sum (x ++ y) = sum x + sum y. Yu Liu1 , Sebastian Fischer2 , Kento Emoto3 , and Zhenjiang Hu4 Implementing Generate-Test-and-Aggregate Algorithms on Hadoo
  • 34. Background Motivation and Objective Design and implementation Performance test Conclusion and future work List Homomorphism List Homomorphism [Bird, 87; Cole,95] is a class of recursive functions. Definition of List Homomorphism If there is an associative operator , such that for any list x and list y h (x ++ y) = h(x) h(y). Where ++ is the list concatenation and h [a] = f a, h(x) id = h(x), id is an identity element of . A list homomorphism can be automatically parallelized by MapReduce [Yu et. al., EuroPar11]. Yu Liu1 , Sebastian Fischer2 , Kento Emoto3 , and Zhenjiang Hu4 Implementing Generate-Test-and-Aggregate Algorithms on Hadoo
  • 35. Background Motivation and Objective Design and implementation Performance test Conclusion and future work Evaluation on Hadoop We test 3.2GB data on {2 , 4, 8, 16, 32} nodes clusters and 32 GB data on {32, 64} nodes clusters 2 nodes 4 nodes 8 nodes 16 nodes 32 nodes 64 nodes time(sec.) 1602 882 482 317 961 511 speedup – × 1.82 × 1.83 × 1.52 – × 1.88 Yu Liu1 , Sebastian Fischer2 , Kento Emoto3 , and Zhenjiang Hu4 Implementing Generate-Test-and-Aggregate Algorithms on Hadoo