Background
Systematic MapReduce programming
Implementation and evaluation
Conclusion and future work
Towards Systematic Parallel Programming over MapReduce
Yu Liu¹, Zhenjiang Hu², Kiminori Matsuzaki³
¹ The Graduate University for Advanced Studies, Japan
² National Institute of Informatics, Japan
³ School of Information, Kochi University of Technology, Japan
September 1, 2011
About MapReduce
Processing flow
Computation is mapped to three phases (map, shuffle, and reduce), and the framework parallelizes it automatically.
Programming with MapReduce
Generally, programmers need to implement the following classes¹:
¹ This takes the open-source MapReduce framework Hadoop as an example.
Map and Reduce
map function: map :: (k1, v1) → [(k2, v2)]
reduce function: reduce :: (k2, [v2]) → v3
getPartition function: getPartition :: (k, v) → Int
compare function: compare :: (k1, k2) → {−1, 0, 1}
An Example: even-odd sum of squares
Example
Given a set of integers, compute the sum of squares of odd
integers and sum of squares of even integers, respectively.
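The example can be sketched as a tiny in-memory simulation of the three phases. This is only an illustration in Python, not Hadoop code: `run_mapreduce`, `map_fn`, and `reduce_fn` are hypothetical names, and the key strings "even" and "odd" are an assumed encoding of the two groups.

```python
from collections import defaultdict

def map_fn(_key, value):
    """map :: (k1, v1) -> [(k2, v2)]; key each integer by parity, emit its square."""
    parity = "even" if value % 2 == 0 else "odd"
    return [(parity, value * value)]

def reduce_fn(_key, values):
    """reduce :: (k2, [v2]) -> v3; sum all squares that share one parity."""
    return sum(values)

def run_mapreduce(records, map_fn, reduce_fn):
    # Map phase: every input record is processed independently.
    intermediate = [pair for k, v in records for pair in map_fn(k, v)]
    # Shuffle phase: group the intermediate values by key.
    groups = defaultdict(list)
    for k, v in intermediate:
        groups[k].append(v)
    # Reduce phase: one reduce call per distinct key.
    return {k: reduce_fn(k, vs) for k, vs in groups.items()}

records = list(enumerate([1, 2, 3, 4, 5]))  # (position, integer) pairs
result = run_mapreduce(records, map_fn, reduce_fn)
```

With the input 1..5, the odd group receives 1, 9, and 25, and the even group receives 4 and 16.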
The Parallelism
map phase: fully parallel
reduce phase: by default there is one reduce task per input key, but this can be resolved by redefining the partitioner or by using combiners in the map phase.
This example can be fully parallelized because there is an associative and commutative binary operator + to accumulate the values.
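A rough sketch of the combiner idea, assuming each input split is pre-summed locally before anything is sent to the reduce side. The helper names (`split_into_chunks`, `combine`, `reduce_partials`) are made up for this illustration and are not part of Hadoop.

```python
def split_into_chunks(values, chunk_size):
    """Cut the input into independent splits, as the map phase would see them."""
    return [values[i:i + chunk_size] for i in range(0, len(values), chunk_size)]

def combine(chunk):
    """Map-side combiner: pre-sum the squares per parity for one split."""
    partial = {"even": 0, "odd": 0}
    for v in chunk:
        partial["even" if v % 2 == 0 else "odd"] += v * v
    return partial

def reduce_partials(partials):
    """Reduce side: add up the per-split partial sums."""
    total = {"even": 0, "odd": 0}
    for p in partials:
        for key, value in p.items():
            total[key] += value
    return total

data = list(range(1, 11))
partials = [combine(c) for c in split_into_chunks(data, 3)]
combined = reduce_partials(partials)
# + is associative and commutative, so the arrival order of partials is irrelevant.
reordered = reduce_partials(list(reversed(partials)))
```

The reduce side now handles one small record per split instead of one record per input integer.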
Another Example: maximum prefix sum
Example
Given a list of n integers [a_0, a_1, ..., a_{n−1}], find the maximum among the sums of all its prefixes:
y_0 = a_0
y_1 = a_0 + a_1
y_2 = a_0 + a_1 + a_2
...
y_{n−1} = a_0 + a_1 + a_2 + ... + a_{n−1}
Assume that the list is too large to fit on a single node.
How can the MPS be computed in a divide-and-conquer fashion?
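For reference, a sequential version is a single left-to-right scan. The sketch below assumes that the empty prefix (sum 0) also counts, which matches the `a ↑ 0` in the fold given later in the deck; if only non-empty prefixes were intended, the initial value of `best` would differ.

```python
def mps(xs):
    """Maximum prefix sum by one sequential scan (empty prefix allowed)."""
    best = 0      # sum of the empty prefix (an assumption, see lead-in)
    running = 0   # sum of the prefix read so far
    for x in xs:
        running += x
        best = max(best, running)
    return best

example = mps([3, -4, 9, 2, -6])  # prefix sums are 3, -1, 8, 10, 4
```

The scan carries `running` from one element to the next, and that dependency is exactly what makes a naive split into independent chunks problematic.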
Another Example: maximum prefix sum (2)
How can the MPS problem be solved with MapReduce?
Divide the input list into independent splits...
Sort and shuffle the intermediate data...
Reduce the values...
The problems are:
The order of the input list cannot be changed
There is no binary operator ⊕ such that:
mps [a_0, a_1, ..., a_n] = mps [a_0, ..., a_i] ⊕ mps [a_{i+1}, ..., a_j] ⊕ mps [a_{j+1}, ..., a_n]
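A small counterexample makes the second problem concrete. The sketch assumes the empty prefix counts (sum 0); the lists are chosen only for illustration. Two left parts have the same mps value, yet appending the same right part gives different answers, so no operator that sees only the two mps values can be correct.

```python
def mps(xs):
    """Maximum prefix sum, with the empty prefix (sum 0) allowed."""
    best, running = 0, 0
    for x in xs:
        running += x
        best = max(best, running)
    return best

left_a = [1]
left_b = [1, -5]
right = [3]

same_left_value = mps(left_a) == mps(left_b)  # both left parts give 1
whole_a = mps(left_a + right)                 # prefix sums 1, 4
whole_b = mps(left_b + right)                 # prefix sums 1, -4, -1
```

Any candidate ⊕ would have to map the same pair of inputs (1 and 3) to both results, which is impossible. Carrying extra information, such as the total sum of each part, removes the ambiguity.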
A Myth of MapReduce
Users do not need to think much about parallelism; they just write sequential programs ...
The Gap between Sequential Algorithms and MapReduce
Mapping a sequential algorithm to MapReduce (parallelization) is non-trivial. MapReduce programmers need:
A systematic way of programming
High-level wrappers that remove the heavy coding
Guarantees of reliability, efficiency, and reusability
Calculational Approach for MapReduce
Approach
Programmers express their computation as two sequential programs.
The system generates an intermediate form that fits the divide-and-conquer style, and
automatically implements that intermediate form on MapReduce.
List Homomorphism
List homomorphisms [Bird, 87]² have been well studied in the parallel programming community since the 1990s [Cole, 95]³.
Definition of List Homomorphism
A function h is a list homomorphism if there are a function f and an associative operator ⊕ with identity element id⊕ such that, for any element a and any lists x and y,
h [] = id⊕
h [a] = f a
h (x ++ y) = h(x) ⊕ h(y)
where ++ is list concatenation.
² Bird, An introduction to the theory of lists, 1987
³ Cole, Parallel programming with list homomorphisms, 1995
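The definition translates directly into a divide-and-conquer evaluator. The following is a minimal sketch, assuming Python lists as the list type; `hom` is a hypothetical helper name, and the two instances (sum of squares, length) are standard textbook examples rather than examples taken from the slides.

```python
import operator

def hom(f, op, identity, xs):
    """Evaluate the list homomorphism defined by f and op on xs."""
    if not xs:
        return identity            # h [] = identity of op
    if len(xs) == 1:
        return f(xs[0])            # h [a] = f a
    mid = len(xs) // 2             # any split point gives the same result
    left = hom(f, op, identity, xs[:mid])
    right = hom(f, op, identity, xs[mid:])
    return op(left, right)         # h (x ++ y) = h(x) op h(y)

sum_of_squares = hom(lambda a: a * a, operator.add, 0, [1, 2, 3, 4])
length = hom(lambda a: 1, operator.add, 0, list("abcde"))
```

Because the operator is associative, the two recursive calls are independent and could run on different machines.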
System Overview
Programming Interface Definition
In our framework, the input sequential functions are defined as:
fold and unfold
fold :: [α] → β
unfold :: β → [α]
The restrictions on fold and unfold are:
fold ◦ unfold ◦ fold = fold
There must exist a list homomorphism for fold⁴.
⁴ I.e., fold must be both a leftwards and a rightwards function.
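The first restriction can at least be spot-checked on sample inputs before a pair of functions is handed to the framework. The sketch below is a testing aid only, not a proof, and `satisfies_restriction` is a hypothetical helper that is not part of the framework's API. It uses summation as fold, with an unfold that wraps the sum in a one-element list.

```python
def satisfies_restriction(fold, unfold, samples):
    """Check fold(unfold(fold(xs))) == fold(xs) on each sample list."""
    return all(fold(unfold(fold(xs))) == fold(xs) for xs in samples)

samples = [[], [5], [1, 2, 3], [-4, 7, 0, 2]]

# unfold s = [s] rebuilds a list whose sum is s, so the restriction holds.
good = satisfies_restriction(sum, lambda s: [s], samples)

# unfold s = [s, 1] rebuilds a list whose sum is s + 1, so it fails.
bad = satisfies_restriction(sum, lambda s: [s, 1], samples)
```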
Actual Java Program for MPS
Homomorphism based framework
Performance evaluation and results
Homomorphism based Calculational Approach
Key points of our approach
Express a sequential algorithm in the form of a list homomorphism
Implement the list homomorphism with MapReduce
Users’ input programs can be easily tested and reused
Parallelization is done automatically, so high reliability can be obtained
Combinations or fusions of list homomorphisms are still list homomorphisms, so complex algorithms can be implemented progressively
Derivation of List Homomorphism
By the third homomorphism theorem [Gibbons 96]⁵ and its corollaries [Morita, PLDI07]⁶, we can obtain a list homomorphism from a pair of functions:
List homomorphism ([f, ⊕]) defined by a pair of functions
If there is a list homomorphism for fold and fold ◦ unfold ◦ fold = fold, then we have:
f a = fold([a])
a ⊕ b = fold(unfold a ++ unfold b)
⁵ Gibbons, J.: The third homomorphism theorem
⁶ Morita et al.: Automatic inversion generates divide-and-conquer parallel programs.
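The two equations can be read as a recipe for building f and ⊕ from any suitable fold/unfold pair. Here is a minimal sketch with Python lists; `derive_homomorphism` is a hypothetical name, and summation with `unfold s = [s]` is used only as a simple pair that is assumed to satisfy the restrictions.

```python
from functools import reduce

def derive_homomorphism(fold, unfold):
    """Build f and the binary operator from fold and unfold."""
    def f(a):
        return fold([a])                      # f a = fold([a])
    def op(a, b):
        return fold(unfold(a) + unfold(b))    # a op b = fold(unfold a ++ unfold b)
    return f, op

f, op = derive_homomorphism(sum, lambda s: [s])

values = [4, -1, 6, 2]
# Apply f to every element, then combine the results with the derived operator.
via_homomorphism = reduce(op, [f(a) for a in values])
```

The derived operator calls fold on a short rebuilt list each time, so its cost depends on how long the lists produced by unfold are.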
Implement a List Homomorphism with MapReduce
Implementing list homomorphisms with MapReduce is straightforward; we only need to consider two problems:
1. Representing the input data as a list
2. Intermediate data must be restorable to its original order, because the binary operator ⊕ of a list homomorphism is associative but not necessarily commutative
The Representation of List in Screwdriver
We use indexed data and a sorting mechanism to represent a list.
Set of pairs ⇒ list
A list [a, b, c, d, e] can be represented by {(3, d), (1, b), (2, c), (0, a), (4, e)} and a sortp function.
A preprocessing step can produce such pairs from the original input data.
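A minimal sketch of this representation follows. The name `sortp` comes from the slide, but its body here is an assumption, and `index_input` is a hypothetical stand-in for the preprocessing step.

```python
def index_input(values):
    """Hypothetical preprocessing: attach each element's position to it."""
    return set(enumerate(values))

def sortp(pairs):
    """Order (index, value) pairs by index and return the values as a list."""
    return [value for _index, value in sorted(pairs, key=lambda p: p[0])]

pairs = {(3, "d"), (1, "b"), (2, "c"), (0, "a"), (4, "e")}
restored = sortp(pairs)
round_trip = sortp(index_input(["a", "b", "c", "d", "e"]))
```

A Python set has no defined order, which mirrors the situation after a shuffle: only the indices let the original list be recovered.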
The Representation of Intermediate Data in Screwdriver
To extract parallelism, we need to split a long list into multiple shorter lists and feed them to different Reducers.
Nested pairs ⇒ split lists
Split {(3, d), (1, b), (2, c), (0, a), (4, e)} into:
{((0, 1), b), ((0, 2), c), ((0, 0), a)} and
{((1, 3), d), ((1, 4), e)}.
A sortnp function can sort these nested pairs.
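The split can be sketched as follows, assuming the partition id is the index divided by a fixed split size (3 reproduces the grouping shown above). `to_nested` and `group_by_partition` are hypothetical helper names; `sortnp` is the slide's name with an assumed body.

```python
def to_nested(pairs, split_size):
    """(id, val) -> ((parId, id), val), with parId = id // split_size."""
    return {((index // split_size, index), value) for index, value in pairs}

def sortnp(nested_pairs):
    """Sort nested pairs by (parId, id); tuples compare lexicographically."""
    return sorted(nested_pairs, key=lambda p: p[0])

def group_by_partition(nested_pairs):
    """Collect the values of each partition in their original order."""
    groups = {}
    for (par_id, _index), value in sortnp(nested_pairs):
        groups.setdefault(par_id, []).append(value)
    return groups

pairs = {(3, "d"), (1, "b"), (2, "c"), (0, "a"), (4, "e")}
nested = to_nested(pairs, split_size=3)
groups = group_by_partition(nested)
```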
Processing in two MapReduce passes (1)
In the first MapReduce pass:
Map phase
(id, val) → ((parId, id), val)
e.g., parId = id ÷ 100000
Shuffle and sort
All data with the same parId are dispatched to one Reducer, where they are sorted by the sortnp function.
Reduce phase
Accumulate the values using the ⊕ that we have derived.
Processing in two MapReduce passes (2)
In the second MapReduce pass:
Map phase
Map every parId to one common value, e.g., parId = 0
Shuffle and sort
Same as in the first pass. Because all data share the same parId, they are collected by one Reducer.
Reduce phase
Same as in the first pass.
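Both passes can be simulated in memory to see how order is preserved. This is a rough sketch under several assumptions: f is applied in the first map phase, each partial result is keyed by its old parId in the second pass, and the input is non-empty. `run_pass` and `two_pass` are hypothetical names. String concatenation serves as ⊕ because it is associative but not commutative, so any ordering mistake would be visible in the output.

```python
from functools import reduce

def run_pass(keyed_values, op):
    """Shuffle by parId, sort each group by id, then reduce it with op."""
    groups = {}
    for (par_id, idx), value in keyed_values:
        groups.setdefault(par_id, []).append((idx, value))
    results = []
    for par_id, items in groups.items():
        ordered = [value for _idx, value in sorted(items, key=lambda p: p[0])]
        results.append((par_id, reduce(op, ordered)))
    return results

def two_pass(indexed_values, f, op, split_size):
    # Pass 1 map: (id, val) -> ((parId, id), f(val)), parId = id // split_size.
    first = [((idx // split_size, idx), f(val)) for idx, val in indexed_values]
    partials = run_pass(first, op)
    # Pass 2 map: send every partial result to partition 0; the old parId
    # becomes the id so the partial results can be sorted back into order.
    second = [((0, par_id), partial) for par_id, partial in partials]
    return run_pass(second, op)[0][1]

records = list(enumerate("mapreduce"))
records.reverse()  # simulate input arriving out of order
answer = two_pass(records, str.upper, lambda a, b: a + b, split_size=4)
```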
Case Study: MPS
We can define fold⁷ and its right inverse unfold as follows, where ↑ denotes the binary maximum:
fold [a] = (a ↑ 0, a)
fold (x ++ [a]) = let (m, s) = fold x in (m ↑ (s + a), s + a)
unfold (m, s) = [m, s − m]
The Java implementation has been shown...
⁷ It might still be tricky to define a function that is both leftwards and rightwards and to find its right inverse, but some tools can help.
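The definitions above can be transcribed into Python to check them on a small input. This is an illustrative sketch, not the framework's Java code; `op` is a hypothetical name for the operator derived as fold(unfold a ++ unfold b), and the test list is arbitrary.

```python
def fold(xs):
    """MPS fold on a non-empty list: (maximum prefix sum, total sum)."""
    m, s = max(xs[0], 0), xs[0]          # fold [a] = (a max 0, a)
    for a in xs[1:]:
        m, s = max(m, s + a), s + a      # extend the result by one element
    return m, s

def unfold(pair):
    """Right inverse of fold: a two-element list with the same fold result."""
    m, s = pair
    return [m, s - m]

def op(left, right):
    """Derived operator: fold(unfold left ++ unfold right)."""
    return fold(unfold(left) + unfold(right))

xs = [3, -4, 9, 2, -6]
whole = fold(xs)
combined = op(fold(xs[:2]), fold(xs[2:]))
round_trip = fold(unfold(whole))
```

Working the derived operator out by hand, using m ≥ 0 and m ≥ s for every fold result, gives the closed form (m1, s1) ⊕ (m2, s2) = (m1 ↑ (s1 + m2), s1 + s2), which agrees with the values computed here.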
Actual Java Program for MPS
You can run your application on a Hadoop client:
$ hadoop jar your.jar input output resource.xml
Performance Evaluation
We implemented several programs for three problems on our framework and on Hadoop:
1. The maximum-prefix-sum problem
MPS-lh is implemented using our framework’s API.
MPS-mr is implemented using the Hadoop API.
2. Parallel sum of 64-bit integers
SUM-lh1 (using a combiner) and SUM-lh2 (not using a combiner) are implemented using our framework’s API.
SUM-mr is implemented using the Hadoop API (using a combiner).
3. VAR-lh computes the variance of 32-bit floating-point numbers.
Performance Evaluation
Environment: hardware
We configured clusters with 2, 4, 8, 16, and 32 nodes (virtual machines). Each computing/data node has one CPU (VM, Xeon E5530 @ 2.4 GHz, 1 core) and 3 GB of memory.
Test data
64-bit integer list of length 10^8 (593 MB)
32-bit floating-point list of length 10^8 (593 MB)
Evaluation Result
The experimental results are summarized as follows:
Doubling the number of worker nodes always yields a speedup of more than 1.8 times.
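To put the 1.8 figure in perspective, the arithmetic below compounds it over the cluster sizes used in the evaluation (2 to 32 nodes, four doublings). These are not measured numbers: the calculation only shows what the stated lower bound would imply if it held at every doubling, and the function name is made up for the illustration.

```python
def cumulative_speedup(per_doubling, doublings):
    """Speedup after repeatedly doubling the node count."""
    return per_doubling ** doublings

nodes = [2, 4, 8, 16, 32]
doublings = len(nodes) - 1                        # 2 -> 4 -> 8 -> 16 -> 32
lower_bound = cumulative_speedup(1.8, doublings)  # implied bound vs. 2 nodes
ideal = cumulative_speedup(2, doublings)          # perfect linear scaling
efficiency = lower_bound / ideal                  # fraction of ideal retained
```

Under that assumption, going from 2 to 32 nodes would give at least about 10.5 times speedup against an ideal of 16, roughly 66% of linear scaling.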
A summary
Our framework introduces a slight overhead
Multi-pass MapReduce processing⁸
Performance differs significantly with and without a Combiner.
⁸ When the binary operator for reduce is commutative, processing can be optimized to one pass.
Related Work
Parallel programming with list homomorphisms (M. Cole, 95)
The Third Homomorphism Theorem (J. Gibbons, 96)
Systematic extraction and implementation of divide-and-conquer parallelism (Gorlatch, PLILP96)
Automatic inversion generates divide-and-conquer parallel programs (Morita et al., PLDI07)
Google’s Sawzall language⁹
⁹ R. Pike et al., Interpreting the Data: Parallel Analysis with Sawzall
Conclusion
Highlights of our approach:
Hiding the technical details of MapReduce (Hadoop)
Parallelization and optimization are done automatically
Automatic list homomorphism generation can be adopted, so MapReduce programming can be highly automated, reliable, and optimized.
Testing and code reuse become simpler for users.
Future Work
Extension and optimization of the current framework
Approaches for automatically generating list homomorphisms
Homomorphisms and MapReduce on trees and graphs
The project is hosted on
http://screwdriver.googlecode.com
Yu Liu1
, Zhenjiang Hu2
, Kiminori Matsuzaki3
Towards Systematic Parallel Programming over MapReduce

More Related Content

What's hot

Learning to Extrapolate Knowledge: Transductive Few-shot Out-of-Graph Link Pr...
Learning to Extrapolate Knowledge: Transductive Few-shot Out-of-Graph Link Pr...Learning to Extrapolate Knowledge: Transductive Few-shot Out-of-Graph Link Pr...
Learning to Extrapolate Knowledge: Transductive Few-shot Out-of-Graph Link Pr...MLAI2
 
Using Siamese Graph Neural Networks for Similarity-Based Retrieval in Process...
Using Siamese Graph Neural Networks for Similarity-Based Retrieval in Process...Using Siamese Graph Neural Networks for Similarity-Based Retrieval in Process...
Using Siamese Graph Neural Networks for Similarity-Based Retrieval in Process...MaximilianHoffmann7
 
MapReduce Paradigm
MapReduce ParadigmMapReduce Paradigm
MapReduce ParadigmNilaNila16
 
Informed Machine Learning for Improved Similarity Assessment in Process-Orien...
Informed Machine Learning for Improved Similarity Assessment in Process-Orien...Informed Machine Learning for Improved Similarity Assessment in Process-Orien...
Informed Machine Learning for Improved Similarity Assessment in Process-Orien...MaximilianHoffmann7
 
Industrial Facility Design
Industrial Facility DesignIndustrial Facility Design
Industrial Facility DesignUmar Yazdani
 
Extended Fuzzy C-Means with Random Sampling Techniques for Clustering Large Data
Extended Fuzzy C-Means with Random Sampling Techniques for Clustering Large DataExtended Fuzzy C-Means with Random Sampling Techniques for Clustering Large Data
Extended Fuzzy C-Means with Random Sampling Techniques for Clustering Large DataAM Publications
 
Traffic models and estimation
Traffic models and estimation Traffic models and estimation
Traffic models and estimation Mina Yonan
 
Edge Representation Learning with Hypergraphs
Edge Representation Learning with HypergraphsEdge Representation Learning with Hypergraphs
Edge Representation Learning with HypergraphsMLAI2
 
A COMPARISON BETWEEN SWARM INTELLIGENCE ALGORITHMS FOR ROUTING PROBLEMS
A COMPARISON BETWEEN SWARM INTELLIGENCE ALGORITHMS FOR ROUTING PROBLEMSA COMPARISON BETWEEN SWARM INTELLIGENCE ALGORITHMS FOR ROUTING PROBLEMS
A COMPARISON BETWEEN SWARM INTELLIGENCE ALGORITHMS FOR ROUTING PROBLEMSecij
 
Adbms 40 heuristics in query optimization
Adbms 40 heuristics in query optimizationAdbms 40 heuristics in query optimization
Adbms 40 heuristics in query optimizationVaibhav Khanna
 
Circular Traffic Signs Recognition Using The Number of Peaks Algorithm
Circular Traffic Signs Recognition Using The Number of Peaks AlgorithmCircular Traffic Signs Recognition Using The Number of Peaks Algorithm
Circular Traffic Signs Recognition Using The Number of Peaks AlgorithmCSCJournals
 
Parallel Machine Learning- DSGD and SystemML
Parallel Machine Learning- DSGD and SystemMLParallel Machine Learning- DSGD and SystemML
Parallel Machine Learning- DSGD and SystemMLJanani C
 
Contour Line Tracing Algorithm for Digital Topographic Maps
Contour Line Tracing Algorithm for Digital Topographic MapsContour Line Tracing Algorithm for Digital Topographic Maps
Contour Line Tracing Algorithm for Digital Topographic MapsCSCJournals
 
Craft software for dummies
Craft software for dummiesCraft software for dummies
Craft software for dummiesRama Renspandy
 
論文紹介 Adaptive metropolis algorithm using variational bayesian
論文紹介 Adaptive metropolis algorithm using variational bayesian論文紹介 Adaptive metropolis algorithm using variational bayesian
論文紹介 Adaptive metropolis algorithm using variational bayesianShuuji Mihara
 
A hybrid sine cosine optimization algorithm for solving global optimization p...
A hybrid sine cosine optimization algorithm for solving global optimization p...A hybrid sine cosine optimization algorithm for solving global optimization p...
A hybrid sine cosine optimization algorithm for solving global optimization p...Aboul Ella Hassanien
 
Information Spread in the Context of Evacuation Optimization
Information Spread in the Context of Evacuation OptimizationInformation Spread in the Context of Evacuation Optimization
Information Spread in the Context of Evacuation OptimizationDr. Mirko Kämpf
 
A unique sorting algorithm with linear time & space complexity
A unique sorting algorithm with linear time & space complexityA unique sorting algorithm with linear time & space complexity
A unique sorting algorithm with linear time & space complexityeSAT Journals
 

What's hot (20)

Learning to Extrapolate Knowledge: Transductive Few-shot Out-of-Graph Link Pr...
Learning to Extrapolate Knowledge: Transductive Few-shot Out-of-Graph Link Pr...Learning to Extrapolate Knowledge: Transductive Few-shot Out-of-Graph Link Pr...
Learning to Extrapolate Knowledge: Transductive Few-shot Out-of-Graph Link Pr...
 
Using Siamese Graph Neural Networks for Similarity-Based Retrieval in Process...
Using Siamese Graph Neural Networks for Similarity-Based Retrieval in Process...Using Siamese Graph Neural Networks for Similarity-Based Retrieval in Process...
Using Siamese Graph Neural Networks for Similarity-Based Retrieval in Process...
 
MapReduce Paradigm
MapReduce ParadigmMapReduce Paradigm
MapReduce Paradigm
 
Informed Machine Learning for Improved Similarity Assessment in Process-Orien...
Informed Machine Learning for Improved Similarity Assessment in Process-Orien...Informed Machine Learning for Improved Similarity Assessment in Process-Orien...
Informed Machine Learning for Improved Similarity Assessment in Process-Orien...
 
Industrial Facility Design
Industrial Facility DesignIndustrial Facility Design
Industrial Facility Design
 
Extended Fuzzy C-Means with Random Sampling Techniques for Clustering Large Data
Extended Fuzzy C-Means with Random Sampling Techniques for Clustering Large DataExtended Fuzzy C-Means with Random Sampling Techniques for Clustering Large Data
Extended Fuzzy C-Means with Random Sampling Techniques for Clustering Large Data
 
Traffic models and estimation
Traffic models and estimation Traffic models and estimation
Traffic models and estimation
 
Edge Representation Learning with Hypergraphs
Edge Representation Learning with HypergraphsEdge Representation Learning with Hypergraphs
Edge Representation Learning with Hypergraphs
 
MS Project
MS ProjectMS Project
MS Project
 
A COMPARISON BETWEEN SWARM INTELLIGENCE ALGORITHMS FOR ROUTING PROBLEMS
A COMPARISON BETWEEN SWARM INTELLIGENCE ALGORITHMS FOR ROUTING PROBLEMSA COMPARISON BETWEEN SWARM INTELLIGENCE ALGORITHMS FOR ROUTING PROBLEMS
A COMPARISON BETWEEN SWARM INTELLIGENCE ALGORITHMS FOR ROUTING PROBLEMS
 
Adbms 40 heuristics in query optimization
Adbms 40 heuristics in query optimizationAdbms 40 heuristics in query optimization
Adbms 40 heuristics in query optimization
 
Circular Traffic Signs Recognition Using The Number of Peaks Algorithm
Circular Traffic Signs Recognition Using The Number of Peaks AlgorithmCircular Traffic Signs Recognition Using The Number of Peaks Algorithm
Circular Traffic Signs Recognition Using The Number of Peaks Algorithm
 
Parallel Machine Learning- DSGD and SystemML
Parallel Machine Learning- DSGD and SystemMLParallel Machine Learning- DSGD and SystemML
Parallel Machine Learning- DSGD and SystemML
 
Matrix Multiplication Report
Matrix Multiplication ReportMatrix Multiplication Report
Matrix Multiplication Report
 
Contour Line Tracing Algorithm for Digital Topographic Maps
Contour Line Tracing Algorithm for Digital Topographic MapsContour Line Tracing Algorithm for Digital Topographic Maps
Contour Line Tracing Algorithm for Digital Topographic Maps
 
Craft software for dummies
Craft software for dummiesCraft software for dummies
Craft software for dummies
 
論文紹介 Adaptive metropolis algorithm using variational bayesian
論文紹介 Adaptive metropolis algorithm using variational bayesian論文紹介 Adaptive metropolis algorithm using variational bayesian
論文紹介 Adaptive metropolis algorithm using variational bayesian
 
A hybrid sine cosine optimization algorithm for solving global optimization p...
A hybrid sine cosine optimization algorithm for solving global optimization p...A hybrid sine cosine optimization algorithm for solving global optimization p...
A hybrid sine cosine optimization algorithm for solving global optimization p...
 
Information Spread in the Context of Evacuation Optimization
Information Spread in the Context of Evacuation OptimizationInformation Spread in the Context of Evacuation Optimization
Information Spread in the Context of Evacuation Optimization
 
A unique sorting algorithm with linear time & space complexity
A unique sorting algorithm with linear time & space complexityA unique sorting algorithm with linear time & space complexity
A unique sorting algorithm with linear time & space complexity
 

Viewers also liked

History of the pacific sardine fishery and its
History of the pacific sardine fishery and itsHistory of the pacific sardine fishery and its
History of the pacific sardine fishery and itsRobin Shin
 
Improvement in food resources
Improvement in food resourcesImprovement in food resources
Improvement in food resourcesRekha Singh
 
Marine fisheries management in India with special reference to Tamil Nadu
Marine fisheries management in India with special reference to Tamil NaduMarine fisheries management in India with special reference to Tamil Nadu
Marine fisheries management in India with special reference to Tamil Naduhmkasim
 
Egg inoculation technique complete 1
Egg inoculation technique complete   1Egg inoculation technique complete   1
Egg inoculation technique complete 1DrNoman Alam
 
Fresh and marine fish and shell fish
Fresh and marine fish and shell fishFresh and marine fish and shell fish
Fresh and marine fish and shell fishSameer Chebbi
 
MARKETING OF FISH IN INDIA DEMAND
MARKETING OF FISH IN INDIA DEMANDMARKETING OF FISH IN INDIA DEMAND
MARKETING OF FISH IN INDIA DEMANDDEVIKA ANTHARJANAM
 
Vertebrates and invertebrates animals
Vertebrates and invertebrates animalsVertebrates and invertebrates animals
Vertebrates and invertebrates animalsMARIAMC_TEACHER
 

Viewers also liked (12)

Bombay duck2
Bombay duck2Bombay duck2
Bombay duck2
 
Fishing
FishingFishing
Fishing
 
Sardine
SardineSardine
Sardine
 
Lesser sardines
Lesser sardines Lesser sardines
Lesser sardines
 
History of the pacific sardine fishery and its
History of the pacific sardine fishery and itsHistory of the pacific sardine fishery and its
History of the pacific sardine fishery and its
 
Improvement in food resources
Improvement in food resourcesImprovement in food resources
Improvement in food resources
 
Marine fisheries management in India with special reference to Tamil Nadu
Marine fisheries management in India with special reference to Tamil NaduMarine fisheries management in India with special reference to Tamil Nadu
Marine fisheries management in India with special reference to Tamil Nadu
 
Egg inoculation technique complete 1
Egg inoculation technique complete   1Egg inoculation technique complete   1
Egg inoculation technique complete 1
 
Fresh and marine fish and shell fish
Fresh and marine fish and shell fishFresh and marine fish and shell fish
Fresh and marine fish and shell fish
 
1449 - Azolla Rice Duck Fish Farming Systems with SRI Methods in the Mekong R...
1449 - Azolla Rice Duck Fish Farming Systems with SRI Methods in the Mekong R...1449 - Azolla Rice Duck Fish Farming Systems with SRI Methods in the Mekong R...
1449 - Azolla Rice Duck Fish Farming Systems with SRI Methods in the Mekong R...
 
MARKETING OF FISH IN INDIA DEMAND
MARKETING OF FISH IN INDIA DEMANDMARKETING OF FISH IN INDIA DEMAND
MARKETING OF FISH IN INDIA DEMAND
 
Vertebrates and invertebrates animals
Vertebrates and invertebrates animalsVertebrates and invertebrates animals
Vertebrates and invertebrates animals
 

Similar to Towards Systematic Parallel Programming over MapReduce

A Homomorphism-based Framework for Systematic Parallel Programming with MapRe...
A Homomorphism-based Framework for Systematic Parallel Programming with MapRe...A Homomorphism-based Framework for Systematic Parallel Programming with MapRe...
A Homomorphism-based Framework for Systematic Parallel Programming with MapRe...Yu Liu
 
An accumulative computation framework on MapReduce ppl2013
An accumulative computation framework on MapReduce ppl2013An accumulative computation framework on MapReduce ppl2013
An accumulative computation framework on MapReduce ppl2013Yu Liu
 
On Implementation of Neuron Network(Back-propagation)
On Implementation of Neuron Network(Back-propagation)On Implementation of Neuron Network(Back-propagation)
On Implementation of Neuron Network(Back-propagation)Yu Liu
 
A study of the Behavior of Floating-Point Errors
A study of the Behavior of Floating-Point ErrorsA study of the Behavior of Floating-Point Errors
A study of the Behavior of Floating-Point Errorsijpla
 
A Homomorphism-based MapReduce Framework for Systematic Parallel Programming
A Homomorphism-based MapReduce Framework for Systematic Parallel ProgrammingA Homomorphism-based MapReduce Framework for Systematic Parallel Programming
A Homomorphism-based MapReduce Framework for Systematic Parallel ProgrammingYu Liu
 
A review of automatic differentiationand its efficient implementation
A review of automatic differentiationand its efficient implementationA review of automatic differentiationand its efficient implementation
A review of automatic differentiationand its efficient implementationssuserfa7e73
 
Analysis of Various Single Frame Super Resolution Techniques for better PSNR
Analysis of Various Single Frame Super Resolution Techniques for better PSNRAnalysis of Various Single Frame Super Resolution Techniques for better PSNR
Analysis of Various Single Frame Super Resolution Techniques for better PSNRIRJET Journal
 
An Optimized Parallel Algorithm for Longest Common Subsequence Using Openmp –...
An Optimized Parallel Algorithm for Longest Common Subsequence Using Openmp –...An Optimized Parallel Algorithm for Longest Common Subsequence Using Openmp –...
An Optimized Parallel Algorithm for Longest Common Subsequence Using Openmp –...IRJET Journal
 
TurnerBottoneStanekNIPS2013
Towards Systematic Parallel Programming over MapReduce

  • 1. Towards Systematic Parallel Programming over MapReduce. Yu Liu (1), Zhenjiang Hu (2), Kiminori Matsuzaki (3). (1) The Graduate University for Advanced Studies, Japan; (2) National Institute of Informatics, Japan; (3) School of Information, Kochi University of Technology, Japan. September 1, 2011. Outline: Background; Systematic MapReduce programming; Implementation and evaluation; Conclusion and future work.
  • 2. About MapReduce: Processing flow. A computation is mapped to three phases (map, shuffle and reduce) and is parallelized automatically.
  • 3. Programming with MapReduce: Generally, programmers need to implement the following classes (taking the open-source MapReduce framework Hadoop as an example):
  • 4. Map and Reduce:
      map function: map :: (k1, v1) → [(k2, v2)]
      reduce function: reduce :: (k2, [v2]) → v3
      getPartition function: getPartition :: (k, v) → Int
      compare function: compare :: (k1, k2) → {1, 0, −1}
  • 5. An Example: even-odd sum of squares. Given a set of integers, compute the sum of squares of the odd integers and the sum of squares of the even integers, respectively.
  • 6. The Parallelism: Map phase: fully parallel. Reduce phase: by default there is one reduce task per input key, but this can be resolved by redefining the partitioner or by using combiners in the map phase. This example can be fully parallelized because there is an associative and commutative binary operator + to accumulate the values.
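A minimal Python sketch of the even-odd example as a simulated map, combine, shuffle and reduce pipeline. It is not Hadoop code: `map_fn`, `reduce_fn` and `run_job` are illustrative names, and the combiner simply reuses the reducer, which is only valid here because + is associative and commutative.

```python
from collections import defaultdict

def map_fn(_key, value):
    """Emit one (parity, square) pair per input integer."""
    parity = "even" if value % 2 == 0 else "odd"
    return [(parity, value * value)]

def reduce_fn(_key, values):
    """Sum all squares that share a parity key."""
    return sum(values)

def run_job(splits):
    """Simulate one MapReduce job over a list of input splits."""
    combined = []
    for split in splits:
        # Map phase: each split is processed independently.
        local = defaultdict(list)
        for i, x in enumerate(split):
            for k, v in map_fn(i, x):
                local[k].append(v)
        # Combiner: pre-aggregate inside the split (assumed to reuse reduce_fn).
        combined.extend((k, reduce_fn(k, vs)) for k, vs in local.items())
    # Shuffle phase: group the partial sums by key.
    groups = defaultdict(list)
    for k, v in combined:
        groups[k].append(v)
    # Reduce phase: one call per key.
    return {k: reduce_fn(k, vs) for k, vs in groups.items()}

result = run_job([[1, 2, 3], [4, 5], [6]])
```

With the combiner in place, each reducer only receives one partial sum per split, so the single-reducer-per-key bottleneck matters much less.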
  • 7. Another Example: maximum prefix sum. Given a list of n integers [a0, a1, ..., an−1], find the maximum among the sums of all its prefixes:
      y0 = a0
      y1 = a0 + a1
      y2 = a0 + a1 + a2
      ...
      yn−1 = a0 + a1 + a2 + ... + an−1
      Assume that the list is too large to fit on a single node. How can the MPS be computed in a divide-and-conquer fashion?
  • 8. Another Example: maximum prefix sum (2). How can the MPS problem be solved with MapReduce? Divide the input list into independent splits... Sort and shuffle the intermediate data... Reduce the values... The problems are:
      The order of the input list cannot be changed.
      There is no binary operator ⊕ such that mps[a0, a1, ..., an] = mps[a0, ..., ai] ⊕ mps[ai+1, ..., aj] ⊕ mps[aj+1, ..., an]
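The claim that no such ⊕ exists for mps alone can be checked with a small counterexample. The Python sketch below (the lists are chosen only for illustration) builds two left parts with the same mps value that give different results once the same right part is appended, so no operator working on the mps values alone could produce both answers.

```python
from itertools import accumulate

def mps(xs):
    """Maximum prefix sum of a non-empty list (sequential definition)."""
    return max(accumulate(xs))

left_a = [1, -1]   # prefix sums: 1, 0
left_b = [1]       # prefix sums: 1
right = [5]

# Both left parts have the same mps value...
same_left = mps(left_a) == mps(left_b)

# ...but appending the same right part gives different answers.
whole_a = mps(left_a + right)   # prefix sums: 1, 0, 5
whole_b = mps(left_b + right)   # prefix sums: 1, 6
```

Here `same_left` is true while `whole_a` and `whole_b` differ, which is why more information than the mps value itself has to be carried between splits.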
  • 9. A Myth of MapReduce: Users do not need to think much about parallelism; they just write sequential programs ...
  • 10. The Gap between Sequential Algorithms and MapReduce: Mapping a sequential algorithm to MapReduce (parallelization) is non-trivial. MapReduce programmers need:
      A systematic way to program
      High-level wrappers to avoid heavy coding
      Guarantees of reliability, efficiency, and reusability
  • 11. Calculational Approach for MapReduce. Approach:
      Programmers express their computation as two sequential programs.
      The system generates an intermediate form that matches the divide-and-conquer fashion, and automatically implements the intermediate form on MapReduce.
  • 12. List Homomorphism: List homomorphisms [Bird, 87] have been well studied in the parallel programming community since the 1990s [Cole, 95].
      Definition of list homomorphism: a function h is a list homomorphism if there are a function f, an associative operator ⊕ and an identity element id of ⊕ such that, for any lists x and y,
        h [] = id
        h [a] = f a
        h (x ++ y) = h(x) ⊕ h(y)
      where ++ is list concatenation.
      [Bird, 87] Bird et al., An introduction to the theory of lists, 1987. [Cole, 95] Cole, Parallel programming with list homomorphisms.
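To make the definition concrete, here is a small Python sketch with hypothetical helper names (`hom`, `hom_by_chunks`). It uses the sum of squares as the homomorphism, with f a = a², ⊕ = + and id = 0, and shows that evaluating independent chunks and then combining the partial results in order gives the same answer as one sequential pass.

```python
from functools import reduce

def hom(f, op, identity, xs):
    """Evaluate the homomorphism ([f, op]) sequentially on the list xs."""
    return reduce(op, (f(a) for a in xs), identity)

def hom_by_chunks(f, op, identity, chunks):
    """Evaluate each chunk independently, then combine the partial results in order."""
    partials = [hom(f, op, identity, chunk) for chunk in chunks]
    return reduce(op, partials, identity)

def square(a):
    return a * a

def add(x, y):
    return x + y

xs = [1, 2, 3, 4, 5, 6]
whole = hom(square, add, 0, xs)
# Any split of xs into consecutive chunks (even empty ones) gives the same result.
split = hom_by_chunks(square, add, 0, [[1, 2], [3, 4, 5], [], [6]])
```

The empty chunk is handled by the identity element, which is why the definition asks for one.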
  • 13. System Overview
  • 14. Programming Interface. Definition: in our framework, the input sequential functions are defined as fold and unfold:
        fold :: [α] → β
        unfold :: β → [α]
      The restrictions on fold and unfold are:
        fold ◦ unfold ◦ fold = fold
        There must exist a list homomorphism for fold (i.e., fold must be both a leftwards and a rightwards function).
  • 15. Actual Java Program for MPS
  • 16. Implementation and evaluation (Homomorphism based framework; Performance evaluation and results). Homomorphism based Calculational Approach. Key points of our approach:
      Express a sequential algorithm in the form of a list homomorphism.
      Implement the list homomorphism with MapReduce.
      Users' input programs can be easily tested and reused.
      Parallelization is done automatically, so high reliability can be obtained.
      A combination or fusion of list homomorphisms is still a list homomorphism, so complex algorithms can be implemented progressively.
  • 17. Derivation of List Homomorphism: By the third homomorphism theorem [Gibbons 96] and its corollaries [Morita, PLDI07], we can obtain a list homomorphism from a pair of functions.
      List homomorphism ([f, ⊕]) defined by a pair of functions: if there is a list homomorphism for fold, and fold ◦ unfold ◦ fold = fold, then we have:
        f a = fold([a])
        a ⊕ b = fold(unfold a ++ unfold b)
      [Gibbons 96] Gibbons, J.: The third homomorphism theorem. [Morita, PLDI07] Morita, et al.: Automatic inversion generates divide-and-conquer parallel programs.
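The derivation can be sketched in Python for the maximum prefix sum. The concrete `fold` and `unfold` below are assumptions made for illustration (the deck's Java program is not reproduced in this transcript): `fold` returns the pair (maximum prefix sum, total sum), and `unfold` builds a two-element list that folds back to the same pair.

```python
from functools import reduce
from itertools import accumulate

def fold(xs):
    """Sequential program: (maximum prefix sum, total sum) of a non-empty list."""
    return (max(accumulate(xs)), sum(xs))

def unfold(result):
    """A two-element list whose fold is `result` again (needs m >= s, true for any fold output)."""
    m, s = result
    return [m, s - m]

def f(a):
    """Derived f: fold on a singleton list."""
    return fold([a])

def op(left, right):
    """Derived operator: fold(unfold left ++ unfold right)."""
    return fold(unfold(left) + unfold(right))

xs = [3, -1, 4, -6, 5, 2, -9, 7]

# The restriction fold . unfold . fold = fold holds on this input.
round_trip_ok = fold(unfold(fold(xs))) == fold(xs)

# Divide-and-conquer evaluation with the derived f and op.
sequential = fold(xs)
parallel_style = op(reduce(op, map(f, xs[:3])), reduce(op, map(f, xs[3:])))
```

Only the two sequential functions are written by hand; `f` and `op` follow mechanically from the two equations on the slide.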
  • 18. Implement a List Homomorphism with MapReduce: Implementing list homomorphisms with MapReduce is straightforward; we only need to consider two problems:
      1. Represent the input data as a list.
      2. It must be possible to restore the order of the intermediate data, since the binary operator ⊕ of a list homomorphism is associative but not necessarily commutative.
  • 19. The Representation of List in Screwdriver. We use indexed data and a sorting mechanism to represent a list. Set of pairs ⇒ list: a list [a, b, c, d, e] can be represented by {(3, d), (1, b), (2, c), (0, a), (4, e)} and a sortp function. We can use a preprocessing step to produce such pairs from the original input data.
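A minimal sketch of this representation, assuming the index is simply the element's position: `to_indexed_pairs` is a hypothetical name for the preprocessing step, and `sortp` here is an illustrative stand-in for the function named on the slide, not the framework's actual code.

```python
def to_indexed_pairs(xs):
    """Preprocessing: attach each element's position as its index."""
    return {(i, x) for i, x in enumerate(xs)}

def sortp(pairs):
    """Recover the list from an unordered set of (index, value) pairs."""
    return [value for _, value in sorted(pairs, key=lambda p: p[0])]

pairs = {(3, "d"), (1, "b"), (2, "c"), (0, "a"), (4, "e")}
restored = sortp(pairs)
```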
  • 20. The Representation of Intermediate Data in Screwdriver. To extract parallelism, we need to split a long list into multiple shorter lists and feed them to different Reducers. Nested pairs ⇒ split lists: split {(3, d), (1, b), (2, c), (0, a), (4, e)} into {((0, 1), b), ((0, 2), c), ((0, 0), a)} and {((1, 3), d), ((1, 4), e)}; a sortnp function can sort these nested pairs.
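The split on this slide can be reproduced with the sketch below. It assumes the partition id is the index divided by a fixed chunk size (3 matches the slide's example); `split_pairs` is a hypothetical name, and `sortnp` is an illustrative stand-in for the function the slide mentions.

```python
def split_pairs(pairs, chunk):
    """Group (index, value) pairs into partitions keyed by index // chunk."""
    parts = {}
    for idx, val in pairs:
        par_id = idx // chunk
        # Nested key: (partition id, original index)
        parts.setdefault(par_id, set()).add(((par_id, idx), val))
    return parts

def sortnp(nested):
    """Order nested pairs by (partition id, index) and return the values."""
    return [val for _, val in sorted(nested, key=lambda p: p[0])]

pairs = {(3, "d"), (1, "b"), (2, "c"), (0, "a"), (4, "e")}
parts = split_pairs(pairs, 3)
```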
  • 21. Processing in 2 passes of MapReduce (1). In the 1st MapReduce pass: Map phase: (id, val) → ((parId, id), val), e.g., parId = id ÷ 100000. Shuffle and sort: all data with the same parId are dispatched to one Reducer, where they are sorted by the sortnp function. Reduce phase: accumulate the values using the ⊕ we have obtained.
  • 22. Processing in 2 passes of MapReduce (2). In the 2nd MapReduce pass: Map phase: map every parId to a single identical value, e.g., parId = 0. Shuffle and sort: same as in the 1st pass; because all data share the same parId, they are collected by one Reducer. Reduce phase: same as in the 1st pass.
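The two passes can be simulated in memory as follows. This is a sketch, not Hadoop code: `run_pass` and `two_pass` are hypothetical names, applying `f` inside the first-pass reducer is an assumption of this sketch, and string concatenation is used as the example operator because it is associative but not commutative, so any ordering mistake would show up in the result.

```python
from functools import reduce
from itertools import groupby

def run_pass(records, f, oplus, par_of):
    """Simulate one MapReduce pass over (id, value) records."""
    # Map phase: (id, val) -> ((parId, id), val)
    keyed = [((par_of(i), i), v) for i, v in records]
    # Shuffle and sort: order by the nested key (parId, id)
    keyed.sort(key=lambda kv: kv[0])
    out = []
    # Reduce phase: one group per parId, values combined in index order
    for par_id, group in groupby(keyed, key=lambda kv: kv[0][0]):
        values = [f(v) for _, v in group]
        out.append((par_id, reduce(oplus, values)))
    return out

def two_pass(records, f, oplus, chunk):
    """First pass reduces each partition; second pass merges the partials."""
    partial = run_pass(records, f, oplus, lambda i: i // chunk)
    # Second pass: every record gets parId 0, so one reducer sees them all
    final = run_pass(partial, lambda v: v, oplus, lambda i: 0)
    return final[0][1]

records = [(3, "d"), (1, "b"), (2, "c"), (0, "a"), (4, "e")]
result = two_pass(records, str, lambda x, y: x + y, 2)
```

In the second pass the partition ids produced by the first pass serve as the record ids, which keeps the partial results in their original order.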
  • 23. Case Study: MPS. We can define fold (7) and its right inverse unfold as: fold [a] = (a ↑ 0, a); fold (x ++ [a]) = let (m, s) = fold x in (m ↑ (s + a), s + a); unfold (m, s) = [m, s − m]. The Java implementation has been shown... Footnote: (7) It might still be tricky to define a function that is both leftwards and rightwards and to find its right inverse, but some tools can help.
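As a cross-check of these definitions, here is a Python transcription of fold and unfold for the maximum prefix sum, with f and ⊕ derived as on the earlier derivation slide. It is a sketch rather than the Java implementation the slides refer to; `mps_parallel` and its `chunk` parameter are hypothetical and only mimic splitting the input across reducers.

```python
from functools import reduce

def fold(xs):
    """Sequential MPS on a non-empty list: returns (max prefix sum, sum)."""
    m, s = max(xs[0], 0), xs[0]      # fold [a] = (a max 0, a)
    for a in xs[1:]:
        s += a                       # running sum s + a
        m = max(m, s)                # m max (s + a)
    return (m, s)

def unfold(ms):
    """Right inverse of fold: unfold (m, s) = [m, s - m]."""
    m, s = ms
    return [m, s - m]

def f(a):
    return fold([a])

def oplus(x, y):
    return fold(unfold(x) + unfold(y))

def mps_parallel(xs, chunk):
    """Fold each chunk independently, then combine the partial results."""
    chunks = [xs[i:i + chunk] for i in range(0, len(xs), chunk)]
    partials = [reduce(oplus, map(f, c)) for c in chunks]
    return reduce(oplus, partials)

xs = [3, -1, 4, -1, -5, 9, 2, -6]
sequential = fold(xs)
parallel = mps_parallel(xs, 3)
```

Both calls should agree on any non-empty input, whatever chunk size is chosen, since ⊕ is associative.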
  • 24. Actual Java Program for MPS. You can run your application on a Hadoop client: ~$ hadoop jar your.jar input output resource.xml
  • 25. Performance Evaluation. We implemented several programs for three problems on our framework and on Hadoop: 1. The maximum-prefix-sum problem: MPS-lh is implemented using our framework's API; MPS-mr is implemented using the Hadoop API. 2. Parallel sum of 64-bit integers: SUM-lh1 (using a combiner) and SUM-lh2 (not using a combiner) are implemented using our framework's API; SUM-mr is implemented using the Hadoop API (using a combiner). 3. VAR-lh computes the variance of 32-bit floating-point numbers.
  • 26. Performance Evaluation. Environment (hardware): we configured clusters with 2, 4, 8, 16 and 32 nodes (virtual machines). Each computing/data node has one CPU (VM, Xeon E5530 @ 2.4 GHz, 1 core) and 3 GB of memory. Test data: a 64-bit integer list of length 10^8 (593 MB); a 32-bit floating-point list of length 10^8 (593 MB).
  • 27. Evaluation Result. The experimental results are summarized as follows: doubling the number of worker nodes always yields a speedup of more than 1.8 times.
  • 28. A summary. Our framework introduces a slight overhead: multi-pass MapReduce processing (8). Performance differs significantly with and without a Combiner. Footnote: (8) When the binary operator for reduce is commutative, the processing can be optimized to one pass.
  • 29. Related Work. Parallel programming with list homomorphisms (M. Cole 95). The Third Homomorphism Theorem (J. Gibbons 96). Systematic extraction and implementation of divide-and-conquer parallelism (Gorlatch, PLILP96). Automatic inversion generates divide-and-conquer parallel programs (Morita et al., PLDI07). Google's Sawzall language (9). Footnote: (9) R. Pike et al., Interpreting the Data: Parallel Analysis with Sawzall.
  • 30. Conclusion. The highlights of our approach: it hides the technical details of MapReduce (Hadoop). Parallelization and optimization are done automatically. Automatic approaches for generating list homomorphisms can be adopted, so MapReduce programming can be highly automated, reliable, and optimized. Testing and code reuse become simpler for users.
  • 31. Future Work. Extension and optimization of the current framework. Approaches for automatically generating list homomorphisms. Homomorphisms and MapReduce on trees and graphs.
  • 32. The project is hosted on http://screwdriver.googlecode.com