Multilevel aggregation for Hadoop/MapReduce


Presented at the Pre-Strata/Hadoop World Meetup on October 23, 2012.

Published in: Technology

  1. Multi-level aggregation for Hadoop MapReduce. Tsuyoshi Ozawa, NTT. © 2012 NTT Software Innovation Center
  2. Overview
     • Background
       • Shuffle cost
     • Approach
       • Multi-level aggregation
     • Progress
       • Discussion on MAPREDUCE-4502
       • Design note is available on this JIRA
       • Prototyped to launch combiner per node
  3. MapReduce Architecture
     • MapReduce
       • Programming model for large-scale processing
       • 3 processing phases: Map, Shuffle, Reduce
     [Figure: Map Phase, Shuffle Phase, and Reduce Phase; four Map tasks feed two Reduce tasks]
  4. Shuffle Phase
     • What happens?
       • Reducers retrieve the outputs of Mappers
       • Mapper-side read -> Reducer-side write
     • Problem
       • Can be a bottleneck in jobs
       • Causes disk I/O
       • Causes network I/O
     • Current solution for aggregation processing
       • Combiner
         • Reduces I/O by mapper-side aggregation
         • Apps: WordCount, N-gram, Co-occurrence of freq.
     • WordCount example: data is aggregated and gets smaller
       (apple, 1,1,1,1) => (apple, 4)
       (banana, 1,1) => (banana, 2)
  5. Limitation of combiners
     • Scope is limited to a single MapTask
  6. Limitation of combiners (1)
     • Scope is limited to a single MapTask
       1. Many-core environment
          • Xeon E5 series: 16 threads/CPU => 16 outputs are generated
          • These files must be transferred over the network
     [Figure: aggregation per map. Each Map writes IFiles, a Combiner per Map merges them into one IFile, and the combined IFiles, still large, are sent to Reduce.]
  7. Limitation of combiners (2)
     • Scope is limited to a single MapTask
       1. Many-core environment
          • Xeon E5 series: 16 threads/CPU => 16 outputs are generated
       2. Processing middle-scale data (TB scale)
          • Processing larger data needs more network bandwidth and disk I/O
     [Figure: aggregation per map. Nodes are on 1GbE links and racks are joined by 10GbE; all raw IFiles must be sent across racks to the Reducer.]
  8. Multi-level aggregation
     • Aggregating the results of maps per node/rack
     [Figure: aggregation per node and per rack. Nodes are on 1GbE links and racks are joined by 10GbE; a smaller IFile is sent across racks to the Reducer.]
  9. Design Concept
     • Minimize overhead
       • Adding a new task type causes a lot of overhead
       • Instead, the Mapper is modified to aggregate at its end stage
     • Keep the current MapReduce design
       • Fault tolerance against a few machine failures
       • Each aggregation must run in Containers for YARN
     • From the Hadoopers' point of view
       • Easy to switch the feature ON/OFF (ideally by adding only one line)
         public static void main(String[] argv) {
           …
           conf.setCombinerClass(Reducer.class);
           conf.enableNodeLevelAggregation();
           conf.enableRackLevelAggregation();
           …
         }
  10. Progress
      • Prototype
        • Modified Mapper to call the combiner function at the last stage
      • Benchmark
        • Environment
          • 40 nodes
          • Core 2 Duo 2.4GHz x2
          • Memory 4GB
          • 1GbE
        • Configuration
          • Reducer: 1
        • Input
          • Texts generated by RandomTextWriter
        • Benchmark program
          • In-mapper combined Word Count
  11. Prototype Benchmark: Job Time
      [Figure: job time with the feature ON vs. OFF]
      • About 2 times faster
      • Shuffle cost is reduced by up to 50%
  12. TODOs
      • Node-level aggregation with fault tolerance (FT)
      • Rack-level aggregation with FT
        • The design note is available at MAPREDUCE-4502
        • Need to change the umbilical protocol to support FT
      • Support for high-level languages
        • Pig/Hive support when issuing a "GROUP BY" statement
        • In other cases, multi-level aggregation may be switched off
  13. Summary
      • Multi-level aggregation combines the results of maps per node/rack
        • Node/rack-level combiner
        • Needs an extended umbilical protocol for fault tolerance (FT)
      • Benchmark with prototype version
        • 1.7 times faster
        • Can reduce shuffle cost by up to 50%
      • TODOs
        • Fault tolerance
        • Pig/Hive support
      • Special thanks to Chris, Karthik, Siddarsh, Robert, and Bikas for the discussions
      • Any feedback is welcome!
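
The idea behind slides 4 and 8 can be illustrated with a small, self-contained Java sketch. It is not the MAPREDUCE-4502 prototype and uses no Hadoop API: the class `MultiLevelAggregationSketch` and its methods `combine`, `aggregate`, and `recordCount` are hypothetical names chosen for illustration, and in-memory maps stand in for IFiles. It only shows why merging per-map combiner outputs on one node shrinks what has to cross the network.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Illustrative sketch only: every name here is hypothetical, not a Hadoop API. */
public class MultiLevelAggregationSketch {

    /** Per-map combiner: sums an implicit count of 1 for each word one map task emitted. */
    public static Map<String, Integer> combine(List<String> words) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String word : words) {
            counts.merge(word, 1, Integer::sum);
        }
        return counts;
    }

    /** Node-level aggregation: merges the combiner outputs of all map tasks on one node. */
    public static Map<String, Integer> aggregate(List<Map<String, Integer>> partials) {
        Map<String, Integer> merged = new TreeMap<>();
        for (Map<String, Integer> partial : partials) {
            partial.forEach((word, count) -> merged.merge(word, count, Integer::sum));
        }
        return merged;
    }

    /** Number of (word, count) records that would have to be shuffled. */
    public static int recordCount(List<Map<String, Integer>> outputs) {
        int total = 0;
        for (Map<String, Integer> output : outputs) {
            total += output.size();
        }
        return total;
    }

    public static void main(String[] args) {
        // Two map tasks on the same node, each combining its own output.
        List<Map<String, Integer>> perMap = new ArrayList<>();
        perMap.add(combine(List.of("apple", "apple", "banana")));
        perMap.add(combine(List.of("apple", "apple", "banana")));

        // One extra aggregation step over everything produced on the node.
        Map<String, Integer> perNode = aggregate(perMap);

        System.out.println("records after per-map combine:   " + recordCount(perMap));
        System.out.println("records after node-level merge:  " + perNode.size());
        System.out.println("node-level result: " + perNode);
    }
}
```

With the two map tasks above, per-map combining leaves four records to shuffle, while the node-level merge leaves two: (apple, 4) and (banana, 2). Because `aggregate` only needs partial counts as input, the same step could be applied again to the outputs of several nodes to model rack-level aggregation.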
