SlideShare a Scribd company logo
1 of 28
Download to read offline
A Generate-Test-Aggregate
Parallel Programming Library
Yu Liu1, Kento Emoto2, Zhenjiang Hu3
1The Graduate University for Advanced Studies
2The University of Tokyo
3National Institute of Informatics
PPoPP PMAM 2013
Systematic Parallel Programming for MapReduce
Outline
Introduction to GTA
The GTA library
 Implementation strategy
 Programming interface
 Automatic parallelization and optimization
Applications and evaluations
Conclusions
Outline
Introduction to GTA
The GTA library
 Implementation strategy
 Programming interface
 Automatic parallelization and optimization
Applications and evaluations
Conclusions
The GTA Programming Methodology
 Simple programming pattern
1. Generate all possible solution candidates;
2. Test and filter candidates;
3. Aggregate the valid candidates.
 Expressive and code efficient
 Covers a large class of problems
 Automatic optimization and parallelization
~ Kento Emoto, et.al., [ESOP’12]
An Example: The Knapsack Problem
Writing a parallel (MapReduce) program for the
knapsack problem is not easy.
Picture from Wikipedia
input: [ (1 $, 2 Kg), (2 $, 6 Kg), (3 $, 10 Kg) ]
weight limitation =15
generate:
[ [ ], [ (1$, 2 Kg) ], [ (2$, 6 Kg) ], [ (3 $, 10 Kg) ], [(1$, 2 Kg) , (2$,
6 Kg) ], [1$, 2 Kg) , (3 $, 10 Kg) ], [(2$, 6 Kg) , (3 $, 10 Kg) ],
[(1$, 2 Kg) , (2$, 6 Kg) , (3 $, 10 Kg) ] ]
test: [true, true, true, true, true, false, false]
filter: [ [ ], [ (1$, 2 Kg) ], [ (2$, 6 Kg) ], [ (3 $, 10 Kg) ],
[(1$, 2 Kg) , (2$, 6 Kg) ], [1$, 2 Kg) , (3 $, 10 Kg) ] ]
aggregate: 0$, 1$, 2 $, 3$, 3$, 4$
Naively implementing Knapsack is inefficient (O(2n)).
Input (length) Time (ms)
8 30
12 86
16 97
20 2829
24
java.lang.OutOfMemoryError: Java heap
space
performance of the naïve Knapsack program
The GTA fusion theorem is introduced for resolve
efficiency problem
GTA Fusion
mapReduce
able
predicates
generator
aggregator
map ( mapReduceable.f ) .
reduce ( mapReduceable.combine )
MapReduce
Definitions of G,T,A
Class Name Algebraic Structure
Generator polymorphic semiring
generator
Predicate almost list
homomorphism
Aggregator semiring homomorphism
Ref: K.Emoto [ESOP’12]
Main Contributions
The implementation of a GTA library
 A simple and statically typed GTA-DSL is
implemented
 Algebraic structures and
computations/transformations of them are
implemented
Evaluation of GTA methodology
Outline
GTA programming methodology
The GTA library
 Implementation strategy
 Programming interface
 Automatic parallelization and optimization
Applications and evaluations
Conclusions
Object-oriented Functional Style
We defined the basic
algebraic structures.
Relations/transformations
of the algebras are well
typed
Examples
Outline
GTA programming methodology
The GTA library
 Implementation strategy
 Programming interface
 Automatic parallelization and optimization
Applications and evaluations
Conclusions
The users write GTA expressions like:
generate(g:GEN) filter(t:Predicate)* aggregate(a:Aggregator)
G‧T‧A Programming DSL
GEN, Aggregator, Predicate are Scala traits defined in the GTA library
Outline
GTA programming methodology
The GTA library
 Implementation strategy
 Programming interface
 Automatic parallelization and optimization
Applications and evaluations
Conclusions
GTA-fusion
G+A+T 𝑀𝑎𝑝𝑅𝑒𝑑𝑢𝑐𝑒𝑎𝑏𝑙𝑒[𝑓,⊕]
Input x1, x2, x3, … , xn
MAP
REDUCE
table1 tablen
f f f f
…
table1 tablentable2 ⊕ ⊕⊕ …
[EuroPar’11]
Implementation of GTA
Fusion/Optimization
The main difficulties:
How to define a polymorphic generator
How to define a predicate for test
How to define intermediate data structures
and other algebraic structures
Outline
GTA programming methodology
The GTA library
 Implementation strategy
 Programming interface
 Automatic parallelization and optimization
Applications and evaluations
Conclusions
More Examples
More examples in the paper and source package:
 Extended Knapsack problems
 The maximum-segments-sum problem
 Finding the most possible sequence (viterbi algorithm)
More information on: https://bitbucket.org/inii/gtalib
G‧T‧A Building Blocks
Our library provides commonly used G·T·A building
blocks and users can also implement their own G,T,As.
Performance Evaluations
Evaluations on EdubaseCluster (Cloud)
– Up to 32 VM nodes, each has 3GB RAM, 1 single
core CPU
– Executed on Spark – an in-memory MR cluster
Execution Time (Knapsack)
203.63
92.83 64.64 47.76 37.06 29.78 25.17 23.25
1727.973
679.305
637.33
471.2
362.36
287.08
234.25 223.44
0
200
400
600
800
1000
1200
1400
1600
1800
4 8 12 16 20 24 28 32
Time(second)
Number of VM nodes
1.00E+07 items
1.00E+08 items
Linear Speedup
0
1
2
3
4
5
6
7
8
9
4 8 12 16 20 24 28 32
speedup
number of VM
Knapsack
ViterbiAlg
MSS
Outline
GTA programming methodology
The GTA library
 Implementation strategy
 Programming interface
 Automatic parallelization and optimization
Applications and evaluations
Conclusions
Conclusions
We show GTA can be efficiently implemented
GTA-DSL can simplify parallel programming
 Simple programming model
 Good code efficiency
GTA-DSL is architecture independent
Future Works
Enrich the library by more building blocks in
terms of G, T, A
GTA-DSL can be extended to processing more
complex data structures such as tree/graph
Q&A
Thank you very much!

More Related Content

What's hot

強化学習の分散アーキテクチャ変遷
強化学習の分散アーキテクチャ変遷強化学習の分散アーキテクチャ変遷
強化学習の分散アーキテクチャ変遷Eiji Sekiya
 
Exploring Optimization in Vowpal Wabbit
Exploring Optimization in Vowpal WabbitExploring Optimization in Vowpal Wabbit
Exploring Optimization in Vowpal WabbitShiladitya Sen
 
pMatlab on BlueGene
pMatlab on BlueGenepMatlab on BlueGene
pMatlab on BlueGenevsachde
 
PyTorch 튜토리얼 (Touch to PyTorch)
PyTorch 튜토리얼 (Touch to PyTorch)PyTorch 튜토리얼 (Touch to PyTorch)
PyTorch 튜토리얼 (Touch to PyTorch)Hansol Kang
 
A Highly Parallel Semi-Dataflow FPGA Architecture for Large-Scale N-Body Simu...
A Highly Parallel Semi-Dataflow FPGA Architecture for Large-Scale N-Body Simu...A Highly Parallel Semi-Dataflow FPGA Architecture for Large-Scale N-Body Simu...
A Highly Parallel Semi-Dataflow FPGA Architecture for Large-Scale N-Body Simu...NECST Lab @ Politecnico di Milano
 
virtualization
virtualizationvirtualization
virtualizationAvi Nash
 
Graph 500 DISLIB powered optimized version
Graph 500 DISLIB powered optimized versionGraph 500 DISLIB powered optimized version
Graph 500 DISLIB powered optimized versionAnton Korzh
 
Towards quantum machine learning calogero zarbo - meet up
Towards quantum machine learning  calogero zarbo - meet upTowards quantum machine learning  calogero zarbo - meet up
Towards quantum machine learning calogero zarbo - meet upDeep Learning Italia
 
Observations on dag scheduling and dynamic load-balancing using genetic algor...
Observations on dag scheduling and dynamic load-balancing using genetic algor...Observations on dag scheduling and dynamic load-balancing using genetic algor...
Observations on dag scheduling and dynamic load-balancing using genetic algor...Rahul Jain
 
Kobeworkshop pubchemqc project
Kobeworkshop pubchemqc projectKobeworkshop pubchemqc project
Kobeworkshop pubchemqc projectMaho Nakata
 
Large data with Scikit-learn - Boston Data Mining Meetup - Alex Perrier
Large data with Scikit-learn - Boston Data Mining Meetup  - Alex PerrierLarge data with Scikit-learn - Boston Data Mining Meetup  - Alex Perrier
Large data with Scikit-learn - Boston Data Mining Meetup - Alex PerrierAlexis Perrier
 
Targeting GPUs using OpenMP Directives on Summit with GenASiS: A Simple and...
Targeting GPUs using OpenMP  Directives on Summit with  GenASiS: A Simple and...Targeting GPUs using OpenMP  Directives on Summit with  GenASiS: A Simple and...
Targeting GPUs using OpenMP Directives on Summit with GenASiS: A Simple and...Ganesan Narayanasamy
 
EuroPython 2017 - PyData - Deep Learning your Broadband Network @ HOME
EuroPython 2017 - PyData - Deep Learning your Broadband Network @ HOMEEuroPython 2017 - PyData - Deep Learning your Broadband Network @ HOME
EuroPython 2017 - PyData - Deep Learning your Broadband Network @ HOMEHONGJOO LEE
 
Kaggle Lyft Motion Prediction for Autonomous Vehicles 4th Place Solution
Kaggle Lyft Motion Prediction for Autonomous Vehicles 4th Place SolutionKaggle Lyft Motion Prediction for Autonomous Vehicles 4th Place Solution
Kaggle Lyft Motion Prediction for Autonomous Vehicles 4th Place SolutionPreferred Networks
 
Achitecture Aware Algorithms and Software for Peta and Exascale
Achitecture Aware Algorithms and Software for Peta and ExascaleAchitecture Aware Algorithms and Software for Peta and Exascale
Achitecture Aware Algorithms and Software for Peta and Exascaleinside-BigData.com
 
Sleep Period Optimization Model For Layered Video Service Delivery Over eMBMS...
Sleep Period Optimization Model For Layered Video Service Delivery Over eMBMS...Sleep Period Optimization Model For Layered Video Service Delivery Over eMBMS...
Sleep Period Optimization Model For Layered Video Service Delivery Over eMBMS...Andrea Tassi
 
Early Application experiences on Summit
Early Application experiences on Summit Early Application experiences on Summit
Early Application experiences on Summit Ganesan Narayanasamy
 

What's hot (20)

強化学習の分散アーキテクチャ変遷
強化学習の分散アーキテクチャ変遷強化学習の分散アーキテクチャ変遷
強化学習の分散アーキテクチャ変遷
 
Exploring Optimization in Vowpal Wabbit
Exploring Optimization in Vowpal WabbitExploring Optimization in Vowpal Wabbit
Exploring Optimization in Vowpal Wabbit
 
pMatlab on BlueGene
pMatlab on BlueGenepMatlab on BlueGene
pMatlab on BlueGene
 
PyTorch 튜토리얼 (Touch to PyTorch)
PyTorch 튜토리얼 (Touch to PyTorch)PyTorch 튜토리얼 (Touch to PyTorch)
PyTorch 튜토리얼 (Touch to PyTorch)
 
A Highly Parallel Semi-Dataflow FPGA Architecture for Large-Scale N-Body Simu...
A Highly Parallel Semi-Dataflow FPGA Architecture for Large-Scale N-Body Simu...A Highly Parallel Semi-Dataflow FPGA Architecture for Large-Scale N-Body Simu...
A Highly Parallel Semi-Dataflow FPGA Architecture for Large-Scale N-Body Simu...
 
cnsm2011_slide
cnsm2011_slidecnsm2011_slide
cnsm2011_slide
 
virtualization
virtualizationvirtualization
virtualization
 
Graph 500 DISLIB powered optimized version
Graph 500 DISLIB powered optimized versionGraph 500 DISLIB powered optimized version
Graph 500 DISLIB powered optimized version
 
Towards quantum machine learning calogero zarbo - meet up
Towards quantum machine learning  calogero zarbo - meet upTowards quantum machine learning  calogero zarbo - meet up
Towards quantum machine learning calogero zarbo - meet up
 
Observations on dag scheduling and dynamic load-balancing using genetic algor...
Observations on dag scheduling and dynamic load-balancing using genetic algor...Observations on dag scheduling and dynamic load-balancing using genetic algor...
Observations on dag scheduling and dynamic load-balancing using genetic algor...
 
Kobeworkshop pubchemqc project
Kobeworkshop pubchemqc projectKobeworkshop pubchemqc project
Kobeworkshop pubchemqc project
 
Europy17_dibernardo
Europy17_dibernardoEuropy17_dibernardo
Europy17_dibernardo
 
Large data with Scikit-learn - Boston Data Mining Meetup - Alex Perrier
Large data with Scikit-learn - Boston Data Mining Meetup  - Alex PerrierLarge data with Scikit-learn - Boston Data Mining Meetup  - Alex Perrier
Large data with Scikit-learn - Boston Data Mining Meetup - Alex Perrier
 
Targeting GPUs using OpenMP Directives on Summit with GenASiS: A Simple and...
Targeting GPUs using OpenMP  Directives on Summit with  GenASiS: A Simple and...Targeting GPUs using OpenMP  Directives on Summit with  GenASiS: A Simple and...
Targeting GPUs using OpenMP Directives on Summit with GenASiS: A Simple and...
 
EuroPython 2017 - PyData - Deep Learning your Broadband Network @ HOME
EuroPython 2017 - PyData - Deep Learning your Broadband Network @ HOMEEuroPython 2017 - PyData - Deep Learning your Broadband Network @ HOME
EuroPython 2017 - PyData - Deep Learning your Broadband Network @ HOME
 
Kaggle Lyft Motion Prediction for Autonomous Vehicles 4th Place Solution
Kaggle Lyft Motion Prediction for Autonomous Vehicles 4th Place SolutionKaggle Lyft Motion Prediction for Autonomous Vehicles 4th Place Solution
Kaggle Lyft Motion Prediction for Autonomous Vehicles 4th Place Solution
 
Achitecture Aware Algorithms and Software for Peta and Exascale
Achitecture Aware Algorithms and Software for Peta and ExascaleAchitecture Aware Algorithms and Software for Peta and Exascale
Achitecture Aware Algorithms and Software for Peta and Exascale
 
Introduction to glpk
Introduction to glpkIntroduction to glpk
Introduction to glpk
 
Sleep Period Optimization Model For Layered Video Service Delivery Over eMBMS...
Sleep Period Optimization Model For Layered Video Service Delivery Over eMBMS...Sleep Period Optimization Model For Layered Video Service Delivery Over eMBMS...
Sleep Period Optimization Model For Layered Video Service Delivery Over eMBMS...
 
Early Application experiences on Summit
Early Application experiences on Summit Early Application experiences on Summit
Early Application experiences on Summit
 

Viewers also liked

Fineness of fine aggregates perfect (1)
Fineness of fine aggregates perfect  (1)Fineness of fine aggregates perfect  (1)
Fineness of fine aggregates perfect (1)Dave Madhav
 
Use of waste plastic in road construction
Use of waste plastic in road construction Use of waste plastic in road construction
Use of waste plastic in road construction rajatsikarwar
 
Analysis of properties of plastic coated aggregate for construction ...
Analysis  of  properties of   plastic  coated  aggregate  for   construction ...Analysis  of  properties of   plastic  coated  aggregate  for   construction ...
Analysis of properties of plastic coated aggregate for construction ...Daanish Zama
 
Design and construction of highway (flexible pavement
Design and construction of highway (flexible pavementDesign and construction of highway (flexible pavement
Design and construction of highway (flexible pavementsuraj vishwakarma
 
Construction of flexible pavements
Construction of flexible pavementsConstruction of flexible pavements
Construction of flexible pavementspradip dangar
 
POLYMER MODIFIED BITUMEN
POLYMER MODIFIED BITUMENPOLYMER MODIFIED BITUMEN
POLYMER MODIFIED BITUMENNitesh Jha
 
Utilisation of-waste-plastic-in-bituminous-mixes-for-road-construction
Utilisation of-waste-plastic-in-bituminous-mixes-for-road-constructionUtilisation of-waste-plastic-in-bituminous-mixes-for-road-construction
Utilisation of-waste-plastic-in-bituminous-mixes-for-road-constructionAHMED NADIM JILANI
 

Viewers also liked (10)

Fineness of fine aggregates perfect (1)
Fineness of fine aggregates perfect  (1)Fineness of fine aggregates perfect  (1)
Fineness of fine aggregates perfect (1)
 
Use of waste plastic in road construction
Use of waste plastic in road construction Use of waste plastic in road construction
Use of waste plastic in road construction
 
Analysis of properties of plastic coated aggregate for construction ...
Analysis  of  properties of   plastic  coated  aggregate  for   construction ...Analysis  of  properties of   plastic  coated  aggregate  for   construction ...
Analysis of properties of plastic coated aggregate for construction ...
 
Sitaram1
Sitaram1Sitaram1
Sitaram1
 
Design and construction of highway (flexible pavement
Design and construction of highway (flexible pavementDesign and construction of highway (flexible pavement
Design and construction of highway (flexible pavement
 
Construction of flexible pavements
Construction of flexible pavementsConstruction of flexible pavements
Construction of flexible pavements
 
POLYMER MODIFIED BITUMEN
POLYMER MODIFIED BITUMENPOLYMER MODIFIED BITUMEN
POLYMER MODIFIED BITUMEN
 
Utilisation of-waste-plastic-in-bituminous-mixes-for-road-construction
Utilisation of-waste-plastic-in-bituminous-mixes-for-road-constructionUtilisation of-waste-plastic-in-bituminous-mixes-for-road-construction
Utilisation of-waste-plastic-in-bituminous-mixes-for-road-construction
 
Concrete
ConcreteConcrete
Concrete
 
Highway Materials Unit-III
Highway Materials  Unit-IIIHighway Materials  Unit-III
Highway Materials Unit-III
 

Similar to A Generate-Test-Aggregate Parallel Programming Library on Spark

Implementing Generate-Test-and-Aggregate Algorithms on Hadoop
Implementing Generate-Test-and-Aggregate Algorithms on HadoopImplementing Generate-Test-and-Aggregate Algorithms on Hadoop
Implementing Generate-Test-and-Aggregate Algorithms on HadoopYu Liu
 
Automated Machine Learning via Sequential Uniform Designs
Automated Machine Learning via Sequential Uniform DesignsAutomated Machine Learning via Sequential Uniform Designs
Automated Machine Learning via Sequential Uniform DesignsAijun Zhang
 
Fault-tolerant topology and routing synthesis for IEEE time-sensitive network...
Fault-tolerant topology and routing synthesis for IEEE time-sensitive network...Fault-tolerant topology and routing synthesis for IEEE time-sensitive network...
Fault-tolerant topology and routing synthesis for IEEE time-sensitive network...Voica Gavrilut
 
this-is-garbage-talk-2022.pptx
this-is-garbage-talk-2022.pptxthis-is-garbage-talk-2022.pptx
this-is-garbage-talk-2022.pptxTier1 app
 
L Fu - Dao: a novel programming language for bioinformatics
L Fu - Dao: a novel programming language for bioinformaticsL Fu - Dao: a novel programming language for bioinformatics
L Fu - Dao: a novel programming language for bioinformaticsJan Aerts
 
My mapreduce1 presentation
My mapreduce1 presentationMy mapreduce1 presentation
My mapreduce1 presentationNoha Elprince
 
Stack Hybridization: A Mechanism for Bridging Two Compilation Strategies in a...
Stack Hybridization: A Mechanism for Bridging Two Compilation Strategies in a...Stack Hybridization: A Mechanism for Bridging Two Compilation Strategies in a...
Stack Hybridization: A Mechanism for Bridging Two Compilation Strategies in a...Yusuke Izawa
 
Scalable and Adaptive Graph Querying with MapReduce
Scalable and Adaptive Graph Querying with MapReduceScalable and Adaptive Graph Querying with MapReduce
Scalable and Adaptive Graph Querying with MapReduceKyong-Ha Lee
 
Tutorial-on-DNN-09A-Co-design-Sparsity.pdf
Tutorial-on-DNN-09A-Co-design-Sparsity.pdfTutorial-on-DNN-09A-Co-design-Sparsity.pdf
Tutorial-on-DNN-09A-Co-design-Sparsity.pdfDuy-Hieu Bui
 
State of Java Elasticity. Tuning Java Efficiency - GIDS.JAVA LIVE 2020
State of Java Elasticity. Tuning Java Efficiency - GIDS.JAVA LIVE 2020State of Java Elasticity. Tuning Java Efficiency - GIDS.JAVA LIVE 2020
State of Java Elasticity. Tuning Java Efficiency - GIDS.JAVA LIVE 2020Jelastic Multi-Cloud PaaS
 
Gpu workshop cluster universe: scripting cuda
Gpu workshop cluster universe: scripting cudaGpu workshop cluster universe: scripting cuda
Gpu workshop cluster universe: scripting cudaFerdinand Jamitzky
 
Low Power High-Performance Computing on the BeagleBoard Platform
Low Power High-Performance Computing on the BeagleBoard PlatformLow Power High-Performance Computing on the BeagleBoard Platform
Low Power High-Performance Computing on the BeagleBoard Platforma3labdsp
 
Orthogonal Matching Pursuit in 2D for Java with GPGPU Prospectives
Orthogonal Matching Pursuit in 2D for Java with GPGPU ProspectivesOrthogonal Matching Pursuit in 2D for Java with GPGPU Prospectives
Orthogonal Matching Pursuit in 2D for Java with GPGPU ProspectivesMatt Simons
 
Java 5 6 Generics, Concurrency, Garbage Collection, Tuning
Java 5 6 Generics, Concurrency, Garbage Collection, TuningJava 5 6 Generics, Concurrency, Garbage Collection, Tuning
Java 5 6 Generics, Concurrency, Garbage Collection, TuningCarol McDonald
 
Javascript TDD with Jasmine, Karma, and Gulp
Javascript TDD with Jasmine, Karma, and GulpJavascript TDD with Jasmine, Karma, and Gulp
Javascript TDD with Jasmine, Karma, and GulpAll Things Open
 
What is new with JavaScript in Gnome: The 2021 edition
What is new with JavaScript in Gnome: The 2021 editionWhat is new with JavaScript in Gnome: The 2021 edition
What is new with JavaScript in Gnome: The 2021 editionIgalia
 
Overlapping community detection in Large-Scale Networks using BigCLAM model b...
Overlapping community detection in Large-Scale Networks using BigCLAM model b...Overlapping community detection in Large-Scale Networks using BigCLAM model b...
Overlapping community detection in Large-Scale Networks using BigCLAM model b...Thang Nguyen
 
Scarab: SAT-based Constraint Programming System in Scala / Scala上で実現された制約プログラ...
Scarab: SAT-based Constraint Programming System in Scala / Scala上で実現された制約プログラ...Scarab: SAT-based Constraint Programming System in Scala / Scala上で実現された制約プログラ...
Scarab: SAT-based Constraint Programming System in Scala / Scala上で実現された制約プログラ...scalaconfjp
 

Similar to A Generate-Test-Aggregate Parallel Programming Library on Spark (20)

Implementing Generate-Test-and-Aggregate Algorithms on Hadoop
Implementing Generate-Test-and-Aggregate Algorithms on HadoopImplementing Generate-Test-and-Aggregate Algorithms on Hadoop
Implementing Generate-Test-and-Aggregate Algorithms on Hadoop
 
Automated Machine Learning via Sequential Uniform Designs
Automated Machine Learning via Sequential Uniform DesignsAutomated Machine Learning via Sequential Uniform Designs
Automated Machine Learning via Sequential Uniform Designs
 
Fault-tolerant topology and routing synthesis for IEEE time-sensitive network...
Fault-tolerant topology and routing synthesis for IEEE time-sensitive network...Fault-tolerant topology and routing synthesis for IEEE time-sensitive network...
Fault-tolerant topology and routing synthesis for IEEE time-sensitive network...
 
this-is-garbage-talk-2022.pptx
this-is-garbage-talk-2022.pptxthis-is-garbage-talk-2022.pptx
this-is-garbage-talk-2022.pptx
 
L Fu - Dao: a novel programming language for bioinformatics
L Fu - Dao: a novel programming language for bioinformaticsL Fu - Dao: a novel programming language for bioinformatics
L Fu - Dao: a novel programming language for bioinformatics
 
My mapreduce1 presentation
My mapreduce1 presentationMy mapreduce1 presentation
My mapreduce1 presentation
 
Stack Hybridization: A Mechanism for Bridging Two Compilation Strategies in a...
Stack Hybridization: A Mechanism for Bridging Two Compilation Strategies in a...Stack Hybridization: A Mechanism for Bridging Two Compilation Strategies in a...
Stack Hybridization: A Mechanism for Bridging Two Compilation Strategies in a...
 
Scalable and Adaptive Graph Querying with MapReduce
Scalable and Adaptive Graph Querying with MapReduceScalable and Adaptive Graph Querying with MapReduce
Scalable and Adaptive Graph Querying with MapReduce
 
Tutorial-on-DNN-09A-Co-design-Sparsity.pdf
Tutorial-on-DNN-09A-Co-design-Sparsity.pdfTutorial-on-DNN-09A-Co-design-Sparsity.pdf
Tutorial-on-DNN-09A-Co-design-Sparsity.pdf
 
State of Java Elasticity. Tuning Java Efficiency - GIDS.JAVA LIVE 2020
State of Java Elasticity. Tuning Java Efficiency - GIDS.JAVA LIVE 2020State of Java Elasticity. Tuning Java Efficiency - GIDS.JAVA LIVE 2020
State of Java Elasticity. Tuning Java Efficiency - GIDS.JAVA LIVE 2020
 
Gpu workshop cluster universe: scripting cuda
Gpu workshop cluster universe: scripting cudaGpu workshop cluster universe: scripting cuda
Gpu workshop cluster universe: scripting cuda
 
Low Power High-Performance Computing on the BeagleBoard Platform
Low Power High-Performance Computing on the BeagleBoard PlatformLow Power High-Performance Computing on the BeagleBoard Platform
Low Power High-Performance Computing on the BeagleBoard Platform
 
Orthogonal Matching Pursuit in 2D for Java with GPGPU Prospectives
Orthogonal Matching Pursuit in 2D for Java with GPGPU ProspectivesOrthogonal Matching Pursuit in 2D for Java with GPGPU Prospectives
Orthogonal Matching Pursuit in 2D for Java with GPGPU Prospectives
 
Collections forceawakens
Collections forceawakensCollections forceawakens
Collections forceawakens
 
Java 5 6 Generics, Concurrency, Garbage Collection, Tuning
Java 5 6 Generics, Concurrency, Garbage Collection, TuningJava 5 6 Generics, Concurrency, Garbage Collection, Tuning
Java 5 6 Generics, Concurrency, Garbage Collection, Tuning
 
Js on-microcontrollers
Js on-microcontrollersJs on-microcontrollers
Js on-microcontrollers
 
Javascript TDD with Jasmine, Karma, and Gulp
Javascript TDD with Jasmine, Karma, and GulpJavascript TDD with Jasmine, Karma, and Gulp
Javascript TDD with Jasmine, Karma, and Gulp
 
What is new with JavaScript in Gnome: The 2021 edition
What is new with JavaScript in Gnome: The 2021 editionWhat is new with JavaScript in Gnome: The 2021 edition
What is new with JavaScript in Gnome: The 2021 edition
 
Overlapping community detection in Large-Scale Networks using BigCLAM model b...
Overlapping community detection in Large-Scale Networks using BigCLAM model b...Overlapping community detection in Large-Scale Networks using BigCLAM model b...
Overlapping community detection in Large-Scale Networks using BigCLAM model b...
 
Scarab: SAT-based Constraint Programming System in Scala / Scala上で実現された制約プログラ...
Scarab: SAT-based Constraint Programming System in Scala / Scala上で実現された制約プログラ...Scarab: SAT-based Constraint Programming System in Scala / Scala上で実現された制約プログラ...
Scarab: SAT-based Constraint Programming System in Scala / Scala上で実現された制約プログラ...
 

More from Yu Liu

A TPC Benchmark of Hive LLAP and Comparison with Presto
A TPC Benchmark of Hive LLAP and Comparison with PrestoA TPC Benchmark of Hive LLAP and Comparison with Presto
A TPC Benchmark of Hive LLAP and Comparison with PrestoYu Liu
 
Cloud Era Transactional Processing -- Problems, Strategies and Solutions
Cloud Era Transactional Processing -- Problems, Strategies and SolutionsCloud Era Transactional Processing -- Problems, Strategies and Solutions
Cloud Era Transactional Processing -- Problems, Strategies and SolutionsYu Liu
 
Introduction to NTCIR 2016 MedNLPDoc
Introduction to NTCIR 2016 MedNLPDocIntroduction to NTCIR 2016 MedNLPDoc
Introduction to NTCIR 2016 MedNLPDocYu Liu
 
高性能データ処理プラットフォーム (Talk on July Tech Festa 2015)
高性能データ処理プラットフォーム (Talk on July Tech Festa 2015)高性能データ処理プラットフォーム (Talk on July Tech Festa 2015)
高性能データ処理プラットフォーム (Talk on July Tech Festa 2015)Yu Liu
 
Survey on Parallel/Distributed Search Engines
Survey on Parallel/Distributed Search EnginesSurvey on Parallel/Distributed Search Engines
Survey on Parallel/Distributed Search EnginesYu Liu
 
Paper introduction to Combinatorial Optimization on Graphs of Bounded Treewidth
Paper introduction to Combinatorial Optimization on Graphs of Bounded TreewidthPaper introduction to Combinatorial Optimization on Graphs of Bounded Treewidth
Paper introduction to Combinatorial Optimization on Graphs of Bounded TreewidthYu Liu
 
Paper Introduction: Combinatorial Model and Bounds for Target Set Selection
Paper Introduction: Combinatorial Model and Bounds for Target Set SelectionPaper Introduction: Combinatorial Model and Bounds for Target Set Selection
Paper Introduction: Combinatorial Model and Bounds for Target Set SelectionYu Liu
 
An accumulative computation framework on MapReduce ppl2013
An accumulative computation framework on MapReduce ppl2013An accumulative computation framework on MapReduce ppl2013
An accumulative computation framework on MapReduce ppl2013Yu Liu
 
An Enhanced MapReduce Model (on BSP)
An Enhanced MapReduce Model (on BSP)An Enhanced MapReduce Model (on BSP)
An Enhanced MapReduce Model (on BSP)Yu Liu
 
A Homomorphism-based Framework for Systematic Parallel Programming with MapRe...
A Homomorphism-based Framework for Systematic Parallel Programming with MapRe...A Homomorphism-based Framework for Systematic Parallel Programming with MapRe...
A Homomorphism-based Framework for Systematic Parallel Programming with MapRe...Yu Liu
 
An Introduction of Recent Research on MapReduce (2011)
An Introduction of Recent Research on MapReduce (2011)An Introduction of Recent Research on MapReduce (2011)
An Introduction of Recent Research on MapReduce (2011)Yu Liu
 
Introduction of A Lightweight Stage-Programming Framework
Introduction of A Lightweight Stage-Programming FrameworkIntroduction of A Lightweight Stage-Programming Framework
Introduction of A Lightweight Stage-Programming FrameworkYu Liu
 
Start From A MapReduce Graph Pattern-recognize Algorithm
Start From A MapReduce Graph Pattern-recognize AlgorithmStart From A MapReduce Graph Pattern-recognize Algorithm
Start From A MapReduce Graph Pattern-recognize AlgorithmYu Liu
 
Introduction of the Design of A High-level Language over MapReduce -- The Pig...
Introduction of the Design of A High-level Language over MapReduce -- The Pig...Introduction of the Design of A High-level Language over MapReduce -- The Pig...
Introduction of the Design of A High-level Language over MapReduce -- The Pig...Yu Liu
 
On Extending MapReduce - Survey and Experiments
On Extending MapReduce - Survey and ExperimentsOn Extending MapReduce - Survey and Experiments
On Extending MapReduce - Survey and ExperimentsYu Liu
 
Tree representation in map reduce world
Tree representation  in map reduce worldTree representation  in map reduce world
Tree representation in map reduce worldYu Liu
 
Introduction to Ultra-succinct representation of ordered trees with applications
Introduction to Ultra-succinct representation of ordered trees with applicationsIntroduction to Ultra-succinct representation of ordered trees with applications
Introduction to Ultra-succinct representation of ordered trees with applicationsYu Liu
 
On Implementation of Neuron Network(Back-propagation)
On Implementation of Neuron Network(Back-propagation)On Implementation of Neuron Network(Back-propagation)
On Implementation of Neuron Network(Back-propagation)Yu Liu
 
ScrewDriver Rebirth: Generate-Test-and-Aggregate Framework on Hadoop
ScrewDriver Rebirth: Generate-Test-and-Aggregate Framework on HadoopScrewDriver Rebirth: Generate-Test-and-Aggregate Framework on Hadoop
ScrewDriver Rebirth: Generate-Test-and-Aggregate Framework on HadoopYu Liu
 
A Homomorphism-based MapReduce Framework for Systematic Parallel Programming
A Homomorphism-based MapReduce Framework for Systematic Parallel ProgrammingA Homomorphism-based MapReduce Framework for Systematic Parallel Programming
A Homomorphism-based MapReduce Framework for Systematic Parallel ProgrammingYu Liu
 

More from Yu Liu (20)

A TPC Benchmark of Hive LLAP and Comparison with Presto
A TPC Benchmark of Hive LLAP and Comparison with PrestoA TPC Benchmark of Hive LLAP and Comparison with Presto
A TPC Benchmark of Hive LLAP and Comparison with Presto
 
Cloud Era Transactional Processing -- Problems, Strategies and Solutions
Cloud Era Transactional Processing -- Problems, Strategies and SolutionsCloud Era Transactional Processing -- Problems, Strategies and Solutions
Cloud Era Transactional Processing -- Problems, Strategies and Solutions
 
Introduction to NTCIR 2016 MedNLPDoc
Introduction to NTCIR 2016 MedNLPDocIntroduction to NTCIR 2016 MedNLPDoc
Introduction to NTCIR 2016 MedNLPDoc
 
高性能データ処理プラットフォーム (Talk on July Tech Festa 2015)
高性能データ処理プラットフォーム (Talk on July Tech Festa 2015)高性能データ処理プラットフォーム (Talk on July Tech Festa 2015)
高性能データ処理プラットフォーム (Talk on July Tech Festa 2015)
 
Survey on Parallel/Distributed Search Engines
Survey on Parallel/Distributed Search EnginesSurvey on Parallel/Distributed Search Engines
Survey on Parallel/Distributed Search Engines
 
Paper introduction to Combinatorial Optimization on Graphs of Bounded Treewidth
Paper introduction to Combinatorial Optimization on Graphs of Bounded TreewidthPaper introduction to Combinatorial Optimization on Graphs of Bounded Treewidth
Paper introduction to Combinatorial Optimization on Graphs of Bounded Treewidth
 
Paper Introduction: Combinatorial Model and Bounds for Target Set Selection
Paper Introduction: Combinatorial Model and Bounds for Target Set SelectionPaper Introduction: Combinatorial Model and Bounds for Target Set Selection
Paper Introduction: Combinatorial Model and Bounds for Target Set Selection
 
An accumulative computation framework on MapReduce ppl2013
An accumulative computation framework on MapReduce ppl2013An accumulative computation framework on MapReduce ppl2013
An accumulative computation framework on MapReduce ppl2013
 
An Enhanced MapReduce Model (on BSP)
An Enhanced MapReduce Model (on BSP)An Enhanced MapReduce Model (on BSP)
An Enhanced MapReduce Model (on BSP)
 
A Homomorphism-based Framework for Systematic Parallel Programming with MapRe...
A Homomorphism-based Framework for Systematic Parallel Programming with MapRe...A Homomorphism-based Framework for Systematic Parallel Programming with MapRe...
A Homomorphism-based Framework for Systematic Parallel Programming with MapRe...
 
An Introduction of Recent Research on MapReduce (2011)
An Introduction of Recent Research on MapReduce (2011)An Introduction of Recent Research on MapReduce (2011)
An Introduction of Recent Research on MapReduce (2011)
 
Introduction of A Lightweight Stage-Programming Framework
Introduction of A Lightweight Stage-Programming FrameworkIntroduction of A Lightweight Stage-Programming Framework
Introduction of A Lightweight Stage-Programming Framework
 
Start From A MapReduce Graph Pattern-recognize Algorithm
Start From A MapReduce Graph Pattern-recognize AlgorithmStart From A MapReduce Graph Pattern-recognize Algorithm
Start From A MapReduce Graph Pattern-recognize Algorithm
 
Introduction of the Design of A High-level Language over MapReduce -- The Pig...
Introduction of the Design of A High-level Language over MapReduce -- The Pig...Introduction of the Design of A High-level Language over MapReduce -- The Pig...
Introduction of the Design of A High-level Language over MapReduce -- The Pig...
 
On Extending MapReduce - Survey and Experiments
On Extending MapReduce - Survey and ExperimentsOn Extending MapReduce - Survey and Experiments
On Extending MapReduce - Survey and Experiments
 
Tree representation in map reduce world
Tree representation  in map reduce worldTree representation  in map reduce world
Tree representation in map reduce world
 
Introduction to Ultra-succinct representation of ordered trees with applications
Introduction to Ultra-succinct representation of ordered trees with applicationsIntroduction to Ultra-succinct representation of ordered trees with applications
Introduction to Ultra-succinct representation of ordered trees with applications
 
On Implementation of Neuron Network(Back-propagation)
On Implementation of Neuron Network(Back-propagation)On Implementation of Neuron Network(Back-propagation)
On Implementation of Neuron Network(Back-propagation)
 
ScrewDriver Rebirth: Generate-Test-and-Aggregate Framework on Hadoop
ScrewDriver Rebirth: Generate-Test-and-Aggregate Framework on HadoopScrewDriver Rebirth: Generate-Test-and-Aggregate Framework on Hadoop
ScrewDriver Rebirth: Generate-Test-and-Aggregate Framework on Hadoop
 
A Homomorphism-based MapReduce Framework for Systematic Parallel Programming
A Homomorphism-based MapReduce Framework for Systematic Parallel ProgrammingA Homomorphism-based MapReduce Framework for Systematic Parallel Programming
A Homomorphism-based MapReduce Framework for Systematic Parallel Programming
 

Recently uploaded

Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Enterprise Knowledge
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphNeo4j
 
Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024Neo4j
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitecturePixlogix Infotech
 
Unlocking the Potential of the Cloud for IBM Power Systems
Unlocking the Potential of the Cloud for IBM Power SystemsUnlocking the Potential of the Cloud for IBM Power Systems
Unlocking the Potential of the Cloud for IBM Power SystemsPrecisely
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
Artificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraArtificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraDeakin University
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksSoftradix Technologies
 

Recently uploaded (20)

Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
 
Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
Unlocking the Potential of the Cloud for IBM Power Systems
Unlocking the Potential of the Cloud for IBM Power SystemsUnlocking the Potential of the Cloud for IBM Power Systems
Unlocking the Potential of the Cloud for IBM Power Systems
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
Artificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraArtificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning era
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other Frameworks
 

A Generate-Test-Aggregate Parallel Programming Library on Spark

  • 1. A Generate-Test-Aggregate Parallel Programming Library Yu Liu1, Kento Emoto2, Zhenjiang Hu3 1The Graduate University for Advanced Studies 2The University of Tokyo 3National Institute of Informatics PPoPP PMAM 2013 Systematic Parallel Programming for MapReduce
  • 2. Outline Introduction to GTA The GTA library  Implementation strategy  Programming interface  Automatic parallelization and optimization Applications and evaluations Conclusions
  • 3. Outline Introduction to GTA The GTA library  Implementation strategy  Programming interface  Automatic parallelization and optimization Applications and evaluations Conclusions
  • 4. The GTA Programming Methodology  Simple programming pattern 1. Generate all possible solution candidates; 2. Test and filter candidates; 3. Aggregate the valid candidates.  Expressive and code efficient  Covers a large class of problems  Automatic optimization and parallelization ~ Kento Emoto, et.al., [ESOP’12]
  • 5. An Example: The Knapsack Problem Writing a parallel (MapReduce) program for the knapsack problem is not easy. Picture from Wikipedia
  • 6. input: [ (1 $, 2 Kg), (2 $, 6 Kg), (3 $, 10 Kg) ] weight limitation =15 generate: [ [ ], [ (1$, 2 Kg) ], [ (2$, 6 Kg) ], [ (3 $, 10 Kg) ], [(1$, 2 Kg) , (2$, 6 Kg) ], [1$, 2 Kg) , (3 $, 10 Kg) ], [(2$, 6 Kg) , (3 $, 10 Kg) ], [(1$, 2 Kg) , (2$, 6 Kg) , (3 $, 10 Kg) ] ] test: [true, true, true, true, true, false, false] filter: [ [ ], [ (1$, 2 Kg) ], [ (2$, 6 Kg) ], [ (3 $, 10 Kg) ], [(1$, 2 Kg) , (2$, 6 Kg) ], [1$, 2 Kg) , (3 $, 10 Kg) ] ] aggregate: 0$, 1$, 2 $, 3$, 3$, 4$
  • 7. Naively implementing Knapsack is inefficient (O(2n)). Input (length) Time (ms) 8 30 12 86 16 97 20 2829 24 java.lang.OutOfMemoryError: Java heap space performance of the naïve Knapsack program The GTA fusion theorem is introduced for resolve efficiency problem
  • 8. GTA Fusion mapReduce able predicates generator aggregator map ( mapReduceable.f ) . reduce ( mapReduceable.combine ) MapReduce
  • 9. Definitions of G,T,A Class Name Algebraic Structure Generator polymorphic semiring generator Predicate almost list homomorphism Aggregator semiring homomorphism Ref: K.Emoto [ESOP’12]
  • 10. Main Contributions The implementation of a GTA library  A simple and statically typed GTA-DSL is implemented  Algebraic structures and computations/transformations of them are implemented Evaluation of GTA methodology
  • 11. Outline GTA programming methodology The GTA library  Implementation strategy  Programming interface  Automatic parallelization and optimization Applications and evaluations Conclusions
  • 12. Object-oriented Functional Style We defined the basic algebraic structures. Relations/transformations of the algebras are well typed
  • 14. Outline GTA programming methodology The GTA library  Implementation strategy  Programming interface  Automatic parallelization and optimization Applications and evaluations Conclusions
  • 15. The users write GTA expressions like: generate(g:GEN) filter(t:Predicate)* aggregate(a:Aggregator) G‧T‧A Programming DSL GEN, Aggregator, Predicate are Scala traits defined in the GTA library
  • 16. Outline GTA programming methodology The GTA library  Implementation strategy  Programming interface  Automatic parallelization and optimization Applications and evaluations Conclusions
  • 17. GTA-fusion G+A+T 𝑀𝑎𝑝𝑅𝑒𝑑𝑢𝑐𝑒𝑎𝑏𝑙𝑒[𝑓,⊕] Input x1, x2, x3, … , xn MAP REDUCE table1 tablen f f f f … table1 tablentable2 ⊕ ⊕⊕ … [EuroPar’11]
  • 18. Implementation of GTA Fusion/Optimization The main difficulties: How to define a polymorphic generator How to define a predicate for test How to define intermediate data structures and other algebraic structures
  • 19. Outline GTA programming methodology The GTA library  Implementation strategy  Programming interface  Automatic parallelization and optimization Applications and evaluations Conclusions
  • 20. More Examples More examples in the paper and source package:  Extended Knapsack problems  The maximum-segments-sum problem  Finding the most possible sequence (viterbi algorithm) More information on: https://bitbucket.org/inii/gtalib
  • 21. G‧T‧A Building Blocks Our library provides commonly used G·T·A building blocks and users can also implement their own G,T,As.
  • 22. Performance Evaluations Evaluations on EdubaseCluster (Cloud) – Up to 32 VM nodes, each has 3GB RAM, 1 single core CPU – Executed on Spark – an in-memory MR cluster
  • 23. Execution Time (Knapsack) 203.63 92.83 64.64 47.76 37.06 29.78 25.17 23.25 1727.973 679.305 637.33 471.2 362.36 287.08 234.25 223.44 0 200 400 600 800 1000 1200 1400 1600 1800 4 8 12 16 20 24 28 32 Time(second) Number of VM nodes 1.00E+07 items 1.00E+08 items
  • 24. Linear Speedup 0 1 2 3 4 5 6 7 8 9 4 8 12 16 20 24 28 32 speedup number of VM Knapsack ViterbiAlg MSS
  • 25. Outline GTA programming methodology The GTA library  Implementation strategy  Programming interface  Automatic parallelization and optimization Applications and evaluations Conclusions
  • 26. Conclusions We show GTA can be efficiently implemented GTA-DSL can simplify parallel programming  Simple programming model  Good code efficiency GTA-DSL is architecture independent
  • 27. Future Works Enrich the library by more building blocks in terms of G, T, A GTA-DSL can be extended to processing more complex data structures such as tree/graph