Towards Systematic Parallel Programming over MapReduce
Background
Systematic MapReduce programming
Implementation and evaluation
Conclusion and future work
Yu Liu¹, Zhenjiang Hu², Kiminori Matsuzaki³
¹ The Graduate University for Advanced Studies, Japan
² National Institute of Informatics, Japan
³ School of Information, Kochi University of Technology, Japan
September 1, 2011
About MapReduce
Processing flow
Computation is mapped to three phases: map, shuffle, and reduce; the framework then parallelizes it automatically.
Programming with MapReduce
Generally, programmers need to implement the following classes¹:
¹ This takes the open-source MapReduce framework Hadoop as an example.
Map and Reduce
map function: map :: (k1, v1) → [(k2, v2)]
reduce function: reduce :: (k2, [v2]) → v3
getPartition function: getPartition :: (k, v) → Int
compare function: compare :: (k1, k2) → {−1, 0, 1}
An Example: even-odd sum of squares
Example
Given a set of integers, compute the sum of squares of the odd integers and the sum of squares of the even integers, respectively.
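The computation above can be sketched as an in-memory simulation of the three phases in plain Java. The class and method names are our own illustration, not the Hadoop API:

```java
import java.util.*;

// In-memory simulation of map, shuffle, and reduce for the
// even-odd sum of squares (illustrative, not the Hadoop API).
public class EvenOddSquares {

    // map: emit each squared integer under the key "even" or "odd".
    static List<Map.Entry<String, Long>> map(List<Integer> input) {
        List<Map.Entry<String, Long>> out = new ArrayList<>();
        for (int a : input)
            out.add(Map.entry(a % 2 == 0 ? "even" : "odd", (long) a * a));
        return out;
    }

    // shuffle: group the emitted values by key.
    static Map<String, List<Long>> shuffle(List<Map.Entry<String, Long>> pairs) {
        Map<String, List<Long>> groups = new TreeMap<>();
        for (Map.Entry<String, Long> p : pairs)
            groups.computeIfAbsent(p.getKey(), k -> new ArrayList<>()).add(p.getValue());
        return groups;
    }

    // reduce: fold each group with the commutative operator +.
    static Map<String, Long> reduce(Map<String, List<Long>> groups) {
        Map<String, Long> result = new TreeMap<>();
        groups.forEach((k, vs) -> result.put(k, vs.stream().mapToLong(Long::longValue).sum()));
        return result;
    }

    public static void main(String[] args) {
        System.out.println(reduce(shuffle(map(List.of(1, 2, 3, 4, 5)))));
        // {even=20, odd=35}
    }
}
```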
The Parallelism
map phase: fully in parallel
reduce phase: by default there is one reduce task per input key, but we can resolve this by redefining the partitioner or using combiners in the map phase
This example can be fully parallelized because there is a commutative binary operator + to accumulate the values.
Another Example: maximum prefix sum
Example
Given a list of n integers [a0, a1, ..., an−1], find the maximum sum of a prefix among all the prefix sums:
y0 = a0
y1 = a0 + a1
y2 = a0 + a1 + a2
...
yn−1 = a0 + a1 + a2 + ... + an−1
Assume that the list is too large to fit on a single node.
How can we compute the MPS in a divide-and-conquer fashion?
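Sequentially, MPS is a single left-to-right scan; the difficulty lies only in parallelizing it. A minimal sequential sketch (our own illustration; following the fold used in the case study, the empty prefix with sum 0 is counted):

```java
public class Mps {
    // Maximum prefix sum in one sequential pass: track the running
    // sum and the maximum it has reached (0 for the empty prefix).
    static long mps(long[] xs) {
        long sum = 0, best = 0;
        for (long x : xs) {
            sum += x;
            best = Math.max(best, sum);
        }
        return best;
    }

    public static void main(String[] args) {
        // prefix sums of [1, -2, 3, -1, 2] are 1, -1, 2, 1, 3
        System.out.println(mps(new long[]{1, -2, 3, -1, 2})); // 3
    }
}
```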
Another Example: maximum prefix sum (2)
How can the MPS problem be solved with MapReduce?
Divide the input list into independent splits...
Sort and shuffle the intermediate data...
Reduce the values...
The problems are:
The order of the input list cannot be changed.
There is no binary operator ⊕ such that:
mps [a0, a1, ..., an]
= mps [a0, ..., ai] ⊕ mps [ai+1, ..., aj] ⊕ mps [aj+1, ..., an]
A Myth of MapReduce
Users do not need to think much about parallelism; they just write sequential programs ...
The Gap between Sequential Algorithms and MapReduce
Mapping a sequential algorithm to MapReduce (parallelization) is non-trivial. Programmers of MapReduce need:
a systematic way to do the programming
high-level wrappers to avoid heavy coding
guarantees of reliability, efficiency, and reusability
Calculational Approach for MapReduce
Approach
Programmers express their computation as two sequential programs;
the system generates an intermediate form that matches the divide-and-conquer fashion, and
automatically implements that intermediate form on MapReduce.
List Homomorphism
List homomorphisms [Bird, 87]² have been well studied since the 90s in the parallel-programming community [Cole, 95]³.
Definition of List Homomorphism
h is a list homomorphism if there is a function f and an associative operator ⊕ with an identity element id, such that for any lists x and y:
h [a] = f a
h (x ++ y) = h(x) ⊕ h(y)
where ++ is list concatenation.
² Bird et al., An introduction to the theory of lists, 1987
³ Cole, Parallel programming with list homomorphisms, etc.
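For instance, sum is the list homomorphism ([id, +]). The generic evaluator below is our own sketch of why the definition matters: since ⊕ is associative, a non-empty list may be split anywhere and the halves combined:

```java
import java.util.List;
import java.util.function.BinaryOperator;
import java.util.function.Function;

public class ListHom {
    // Evaluate the list homomorphism ([f, ⊕]) on a non-empty list by
    // divide-and-conquer: h [a] = f a, h (x ++ y) = h x ⊕ h y.
    static <A, B> B hom(List<A> xs, Function<A, B> f, BinaryOperator<B> oplus) {
        if (xs.size() == 1) return f.apply(xs.get(0));
        int mid = xs.size() / 2; // any split point gives the same result
        return oplus.apply(hom(xs.subList(0, mid), f, oplus),
                           hom(xs.subList(mid, xs.size()), f, oplus));
    }

    public static void main(String[] args) {
        // sum = ([id, +])
        int sum = hom(List.of(1, 2, 3, 4, 5), a -> a, Integer::sum);
        System.out.println(sum); // 15
    }
}
```

This recursive split mirrors what a parallel runtime does with the independent sublists.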
Programming Interface Definition
In our framework, the input sequential functions are defined as fold and unfold:
fold :: [α] → β
unfold :: β → [α]
The restrictions on fold and unfold are:
fold ◦ unfold ◦ fold = fold
There must exist a list homomorphism for fold⁴.
⁴ I.e., fold must be both a leftwards and a rightwards function.
Homomorphism based Calculational Approach
Key points of our approach:
Express a sequential algorithm in the form of a list homomorphism.
Implement the list homomorphism with MapReduce.
Users' input programs can be easily tested and reused.
Parallelization is done automatically, so high reliability can be obtained.
Compositions and fusions of list homomorphisms are still list homomorphisms, so complex algorithms can be implemented progressively.
Derivation of List Homomorphism
By the third homomorphism theorem [Gibbons 96]⁵ and its corollaries [Morita, PLDI07]⁶, we can obtain a list homomorphism from a pair of functions:
List homomorphism ([f, ⊕]) defined by a pair of functions
If there is a list homomorphism for fold and fold ◦ unfold ◦ fold = fold, then we have:
f a = fold([a])
a ⊕ b = fold(unfold a ++ unfold b)
⁵ Gibbons, J.: The third homomorphism theorem
⁶ Morita et al.: Automatic inversion generates divide-and-conquer parallel programs
Implement a List Homomorphism with MapReduce
Implementing a list homomorphism with MapReduce is straightforward; we only need to consider two problems:
1 How to represent the input data as a list.
2 Intermediate data should be reorderable, which relies on the associativity of the binary operator ⊕ of the list homomorphism.
The Representation of List in Screwdriver
We use indexed data and a sorting mechanism to represent a list.
Set of pairs ⇒ list
A list [a, b, c, d, e] can be represented by {(3, d), (1, b), (2, c), (0, a), (4, e)} and a sortp function.
A preprocessing step produces such pairs from the original input data.
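A minimal sketch of this representation (our own illustration; an ordinary sort on the index plays the role of sortp):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class IndexedList {
    // A list stored as an unordered set of (index, value) pairs;
    // sorting by index (the role of sortp) recovers the list.
    static List<String> toList(Map<Integer, String> pairs) {
        return new ArrayList<>(new TreeMap<>(pairs).values());
    }

    public static void main(String[] args) {
        Map<Integer, String> pairs = Map.of(3, "d", 1, "b", 2, "c", 0, "a", 4, "e");
        System.out.println(toList(pairs)); // [a, b, c, d, e]
    }
}
```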
The Representation of Intermediate Data in Screwdriver
To extract parallelism, we need to split a long list into multiple shorter lists and feed them to different Reducers.
Nested pairs ⇒ split lists
split {(3, d), (1, b), (2, c), (0, a), (4, e)} into:
{((0, 1), b), ((0, 2), c), ((0, 0), a)} and
{((1, 3), d), ((1, 4), e)};
a sortnp function can sort these nested pairs.
Processing in Two MapReduce Passes (1)
In the 1st MapReduce pass:
Map phase
(id, val) → ((parId, id), val)
e.g., parId = id ÷ 100000
Shuffle and sort
All data with the same parId are dispatched to one Reducer, where they are sorted by the sortnp function.
Reduce phase
Values are accumulated using the operator ⊕ that we have derived.
Processing in Two MapReduce Passes (2)
In the 2nd MapReduce pass:
Map phase
Map all parIds to one identity, e.g., parId = 0.
Shuffle and sort
Same as the 1st pass. Since all parIds are equal, all data are collected by one Reducer.
Reduce phase
Same as the 1st pass.
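The two passes can be simulated in memory. In the sketch below (our own illustration), parId = id ÷ groupSize stands in for the partitioner, TreeMap ordering stands in for sortnp, and string concatenation serves as an associative but non-commutative ⊕, so order preservation is actually exercised:

```java
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;
import java.util.function.BinaryOperator;

public class TwoPass {
    // Pass 1: group ((parId, id), val) by parId and fold each group
    // with ⊕ in index order. Pass 2: map every parId to 0, so one
    // reducer folds the partial results, again in order.
    static <A> A twoPass(List<A> xs, BinaryOperator<A> oplus, int groupSize) {
        SortedMap<Integer, A> partials = new TreeMap<>();
        for (int id = 0; id < xs.size(); id++) {
            int parId = id / groupSize;
            // merge applies ⊕(old, new), preserving index order within a group
            partials.merge(parId, xs.get(id), oplus);
        }
        // the single reducer of pass 2 combines the partials in parId order
        return partials.values().stream().reduce(oplus).orElseThrow();
    }

    public static void main(String[] args) {
        String r = twoPass(List.of("a", "b", "c", "d", "e"), String::concat, 2);
        System.out.println(r); // abcde
    }
}
```

Because ⊕ need not be commutative, both passes must keep the segments in index order; that is exactly what the sortnp-based shuffle guarantees.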
Case Study: MPS
We can define the fold⁷ and its right inverse unfold as:
fold [a] = (a ↑ 0, a)
fold (x ++ [a]) = let (m, s) = fold x in (m ↑ (s + a), s + a)
unfold (m, s) = [m, s − m]
The Java implementation has been shown...
⁷ It might still be tricky to define a function that is both leftwards and rightwards and to find its right inverse, but some tools can help.
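Following the derivation f a = fold([a]) and a ⊕ b = fold(unfold a ++ unfold b), the MPS homomorphism can be sketched in plain Java (our own illustration, not the framework's generated code; the pair (m, s) holds the maximum prefix sum and the total sum of a segment):

```java
import java.util.Arrays;

public class MpsHom {
    // (m, s): maximum prefix sum (with 0 for the empty prefix) and total sum.
    record MS(long m, long s) {}

    // f a = fold [a] = (a ↑ 0, a)
    static MS f(long a) { return new MS(Math.max(a, 0), a); }

    // fold (x ++ [a]) = let (m, s) = fold x in (m ↑ (s + a), s + a)
    static MS fold(long[] xs) {
        MS r = f(xs[0]);
        for (int i = 1; i < xs.length; i++)
            r = new MS(Math.max(r.m(), r.s() + xs[i]), r.s() + xs[i]);
        return r;
    }

    // unfold (m, s) = [m, s - m], a right inverse of fold
    static long[] unfold(MS p) { return new long[]{p.m(), p.s() - p.m()}; }

    // a ⊕ b = fold (unfold a ++ unfold b)
    static MS oplus(MS a, MS b) {
        long[] u = unfold(a), v = unfold(b);
        return fold(new long[]{u[0], u[1], v[0], v[1]});
    }

    public static void main(String[] args) {
        long[] xs = {1, -2, 3, -1, 2};
        // split anywhere: ⊕ recombines the two segment summaries
        MS left = fold(Arrays.copyOfRange(xs, 0, 2));
        MS right = fold(Arrays.copyOfRange(xs, 2, xs.length));
        System.out.println(oplus(left, right).m() == fold(xs).m()); // true
    }
}
```

Because ⊕ is associative, the segments can be folded independently (the 1st MapReduce pass) and their summaries combined by a single reducer (the 2nd pass).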
Actual Java Program for MPS
You can run your application on a Hadoop client:
:~$ hadoop jar your.jar input output resource.xml
Performance Evaluation
We implemented several programs for three problems on our framework and on Hadoop:
1 The maximum-prefix-sum problem:
MPS-lh is implemented using our framework's API.
MPS-mr is implemented using the Hadoop API.
2 Parallel sum of 64-bit integers:
SUM-lh1 (using a combiner) and SUM-lh2 (not using a combiner) are implemented with our framework's API.
SUM-mr is implemented with the Hadoop API (using a combiner).
3 VAR-lh computes the variance of 32-bit floating-point numbers.
Performance Evaluation
Environment: hardware
We configured clusters with 2, 4, 8, 16, and 32 nodes (virtual machines). Each computing/data node has one CPU (VM, Xeon E5530 @ 2.4 GHz, 1 core) and 3 GB of memory.
Test data
64-bit integer list of length 10⁸ (593 MB)
32-bit floating-point list of length 10⁸ (593 MB)
Evaluation Result
The experimental results are summarized as follows:
doubling the worker nodes always yields more than 1.8× speedup
A summary
Our framework introduces only slight overhead
Multi-pass MapReduce processing⁸
Using a Combiner or not makes a significant difference in performance
⁸ When there is a commutative binary operator for reduce, the processing can be optimized to one pass.
Related Work
Parallel programming with list homomorphisms (M. Cole, 95)
The third homomorphism theorem (J. Gibbons, 96)
Systematic extraction and implementation of divide-and-conquer parallelism (Gorlatch, PLILP 96)
Automatic inversion generates divide-and-conquer parallel programs (Morita et al., PLDI 07)
Google's Sawzall language⁹
⁹ R. Pike et al., Interpreting the Data: Parallel Analysis with Sawzall
Conclusion
The highlights of our approach:
Hiding the technical details of MapReduce (Hadoop)
Parallelization and optimization are done automatically
Automatic list-homomorphism generation approaches can be adopted, so MapReduce programming can be highly automated, reliable, and optimized
Testing and code reuse become simpler for users
Future Work
Extension and optimization of the current framework
Approaches to automatically generating list homomorphisms
Homomorphisms and MapReduce on trees and graphs
The project is hosted on
http://screwdriver.googlecode.com