Multilevel aggregation for Hadoop/MapReduce

Presented at the Pre-Strata/Hadoop World Meetup on Oct 23, 2012.

Transcript

  • 1. Multi-level aggregation for Hadoop MapReduce
    Tsuyoshi Ozawa, NTT
    © 2012 NTT Software Innovation Center
  • 2. Overview
    • Background: shuffle cost
    • Approach: multi-level aggregation
    • Progress
      • Discussion on MAPREDUCE-4502 (the design note is available on this JIRA)
      • Prototyped to launch a combiner per node
  • 3. MapReduce Architecture
    • MapReduce: a programming model for large-scale processing
    • Three processing phases: Map phase, Shuffle phase, Reduce phase
    (Diagram: map tasks feed reduce tasks through the shuffle.)
  • 4. Shuffle Phase
    • What happens?
      • Reducers retrieve the outputs of mappers
      • Mapper-side read -> reducer-side write
    • Problem: the shuffle can be a bottleneck in jobs
      • Causes disk I/O
      • Causes network I/O
    • Current solution for aggregation processing: the combiner
      • Reduces I/O by mapper-side aggregation
      • Applications: WordCount, N-gram, co-occurrence frequency counting
    • WordCount example: data is aggregated and gets smaller
      (apple, 1,1,1,1) => (apple, 4)
      (banana, 1,1) => (banana, 2)
      A standard driver wiring up such a combiner is sketched below.
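For concreteness, here is the stock way to attach a combiner to a WordCount job with the org.apache.hadoop.mapreduce API. This is a minimal sketch of the standard pattern, not code from the presentation; reusing the reducer as the combiner works because summing partial counts is associative and commutative.

    import java.io.IOException;
    import java.util.StringTokenizer;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class WordCount {
      public static class TokenizerMapper
          extends Mapper<Object, Text, Text, IntWritable> {
        private final static IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        @Override
        protected void map(Object key, Text value, Context context)
            throws IOException, InterruptedException {
          // Emit (word, 1) per token; the combiner collapses repeated
          // words inside each MapTask before they hit the shuffle.
          StringTokenizer itr = new StringTokenizer(value.toString());
          while (itr.hasMoreTokens()) {
            word.set(itr.nextToken());
            context.write(word, ONE);
          }
        }
      }

      public static class IntSumReducer
          extends Reducer<Text, IntWritable, Text, IntWritable> {
        private final IntWritable result = new IntWritable();

        @Override
        protected void reduce(Text key, Iterable<IntWritable> values,
            Context context) throws IOException, InterruptedException {
          int sum = 0;
          for (IntWritable v : values) {
            sum += v.get();
          }
          result.set(sum);
          context.write(key, result);
        }
      }

      public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        // Reusing the reducer as a combiner: (apple, 1,1,1,1) becomes
        // (apple, 4) inside each MapTask, shrinking the shuffle.
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
      }
    }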
  • 5. Limitation of combiners
    • Scope is limited to a single MapTask
  • 6. Limitation of combiners (1)
    • Scope is limited to a single MapTask
    1. Many-core environments
      • Xeon E5 series: 16 threads per CPU => 16 outputs are generated
      • All of these files must be transferred over the network
    (Diagram: four MapTasks on one node, each running its own combiner over its own IFiles; the per-map combined outputs are still large when they reach the reducer.)
  • 7. Limitation of combiners (2)
    • Scope is limited to a single MapTask
    1. Many-core environments
      • Xeon E5 series: 16 threads per CPU => 16 outputs are generated
    2. Processing middle-scale data (TB scale)
      • Processing larger data needs more network bandwidth and disk I/O
    (Diagram: with per-map aggregation only, every raw IFile crosses the 1GbE node links and the 10GbE inter-rack link on its way to the reducer.)
    A rough cost estimate follows below.
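To put rough numbers on this (the sizes here are illustrative assumptions, not figures from the talk): if a 16-thread node runs 16 map tasks and each task's combined output is 512 MB, the node still ships 16 x 512 MB = 8 GB across its 1GbE link, roughly 65 seconds at wire speed, and flows from many such nodes then contend for the shared 10GbE inter-rack uplink. Because the 16 files on one node largely share the same key set in a WordCount-style job, combining them once per node can shrink the transfer toward the size of a single file.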
  • 8. Multi-level aggregation
    • Aggregate the results of maps per node and per rack
    (Diagram: map outputs are combined per node and then per rack, so a smaller IFile crosses the 1GbE node links and the 10GbE inter-rack link on its way to the reducer.)
  • 9. Design Concept
    • Minimize overhead
      • Adding a new task type causes lots of overhead
      • Instead, the Mapper is modified to aggregate at its end stage
    • Keep the current MapReduce design
      • Fault tolerance against a few machine failures
      • Each aggregation must run in containers for YARN
    • From a Hadoop user's point of view
      • Easy to switch the feature on/off (ideally by adding only one line):

        public static void main(String[] argv) {
          ...
          conf.setCombinerClass(Reducer.class);
          conf.enableNodeLevelAggregation();
          conf.enableRackLevelAggregation();
          ...
        }

      A fuller driver sketch follows below.
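Filled out with standard old-API (org.apache.hadoop.mapred) driver boilerplate, the intended usage might look like the sketch below. Note that enableNodeLevelAggregation() and enableRackLevelAggregation() are the API proposed in MAPREDUCE-4502 and do not exist in stock Hadoop, and WordCountMapper/WordCountReducer stand in for the user's own classes.

    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapred.FileInputFormat;
    import org.apache.hadoop.mapred.FileOutputFormat;
    import org.apache.hadoop.mapred.JobClient;
    import org.apache.hadoop.mapred.JobConf;

    public class MultiLevelWordCount {
      public static void main(String[] argv) throws Exception {
        JobConf conf = new JobConf(MultiLevelWordCount.class);
        conf.setJobName("multi-level wordcount");
        conf.setMapperClass(WordCountMapper.class);     // hypothetical user classes
        conf.setCombinerClass(WordCountReducer.class);  // same combine logic at every level
        conf.setReducerClass(WordCountReducer.class);
        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(IntWritable.class);
        // Proposed switches (MAPREDUCE-4502, not in stock Hadoop): rerun the
        // combiner once per node and once per rack before data crosses the network.
        conf.enableNodeLevelAggregation();
        conf.enableRackLevelAggregation();
        FileInputFormat.setInputPaths(conf, new Path(argv[0]));
        FileOutputFormat.setOutputPath(conf, new Path(argv[1]));
        JobClient.runJob(conf);
      }
    }

One consequence of reusing the combiner at every level is the usual combiner contract, applied more strictly: the operation must be associative and commutative, and its input and output key/value types must be identical, since its output may be combined again per node and per rack.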
  • 10. Progress
    • Prototype: modified the Mapper to call the combiner function at the last stage
    • Benchmark environment
      • 40 nodes: Core 2 Duo 2.4GHz x2, 4GB memory, 1GbE
    • Configuration: 1 reducer
    • Input: text generated by RandomTextWriter
    • Benchmark program: in-mapper combined WordCount (sketched below)
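"In-mapper combined WordCount" conventionally refers to the pattern where the mapper accumulates counts in a local hash map and emits them only in cleanup(), instead of writing (word, 1) per token. A minimal sketch of that pattern, assuming the class and field names (they are not from the talk):

    import java.io.IOException;
    import java.util.HashMap;
    import java.util.Map;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    public class InMapperCombinedMapper
        extends Mapper<LongWritable, Text, Text, IntWritable> {
      // Per-task accumulator: one partial count per distinct word.
      private final Map<String, Integer> counts = new HashMap<>();

      @Override
      protected void map(LongWritable key, Text value, Context context) {
        for (String token : value.toString().trim().split("\\s+")) {
          counts.merge(token, 1, Integer::sum);
        }
      }

      @Override
      protected void cleanup(Context context)
          throws IOException, InterruptedException {
        // Flush the accumulated counts once, at the end of the task, so the
        // shuffle sees one record per distinct word instead of one per token.
        Text word = new Text();
        IntWritable count = new IntWritable();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
          word.set(e.getKey());
          count.set(e.getValue());
          context.write(word, count);
        }
      }
    }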
  • 11. Prototype Benchmark – Job Time
    (Chart: job completion time with the feature ON vs. OFF.)
    • About 2 times faster
    • Shuffle cost is reduced by up to 50%
  • 12. TODOs
    • Node-level aggregation with fault tolerance (FT)
    • Rack-level aggregation with FT
      • The design note is available at MAPREDUCE-4502
      • Need to change the umbilical protocol to support FT
    • Support for high-level languages
      • Pig/Hive support when issuing a "GROUP BY" statement
      • In other cases, multi-level aggregation may need to be switched off
  • 13. Summary
    • Multi-level aggregation combines the results of maps per node and per rack
      • Node-/rack-level combiners
      • Needs an extended umbilical protocol for FT
    • Benchmark with the prototype
      • 1.7 times faster
      • Cuts shuffle costs by up to 50%
    • TODOs
      • Fault tolerance
      • Pig/Hive support
    • Special thanks to Chris, Karthik, Siddarsh, Robert, and Bikas for the discussions
    • Any feedback is welcome!
