How to Debug and Tune Hadoop
Alex Rovner
Proclivity Systems
 
Tune Your Cluster
Choose optimal number of mappers / reducers per node
mapred.tasktracker.map.tasks.maximum
mapred.tasktracker.reduce.tasks.maximum
Oversubscribe the CPU by 20-30% (8 Cores can generally handle 10 slots)
Mappers to reducers ratio 4:3
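
The slot guidance above can be turned into a quick calculation. This is a minimal sketch, not an official sizing tool: the function name plan_slots and the default 25% oversubscription (the midpoint of the 20-30% range) are assumptions chosen for illustration.

```python
def plan_slots(cores, oversubscription=0.25, map_share=4, reduce_share=3):
    """Split a node's task slots between mappers and reducers.

    cores: physical cores on the node.
    oversubscription: extra slots as a fraction of cores (0.20-0.30 per the slide).
    map_share / reduce_share: the mappers-to-reducers ratio (4:3 per the slide).
    """
    total_slots = round(cores * (1 + oversubscription))
    map_slots = round(total_slots * map_share / (map_share + reduce_share))
    reduce_slots = total_slots - map_slots
    return map_slots, reduce_slots


map_slots, reduce_slots = plan_slots(8)
print(f"mapred.tasktracker.map.tasks.maximum={map_slots}")
print(f"mapred.tasktracker.reduce.tasks.maximum={reduce_slots}")
```

For an 8-core node this yields 10 slots in total, matching the rule of thumb above. Treat the result as a starting point and adjust after observing CPU utilization on real jobs.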
Tune Hadoop
Adjust memory allocations
mapred.child.java.opts=-Xmx512M
Use 80% of available memory
Do not oversubscribe memory to avoid swapping
Total Memory = Map Slots + Reduce Slots + TT (TaskTracker) + DN (DataNode) + Other Services + OS
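
The memory budget above can be checked with simple arithmetic. The sketch below is one possible reading of the slide: it assumes the 20% left unused covers the OS, and the 1024 MB figures for the TaskTracker and DataNode daemons are placeholder values, not recommendations. The function name max_child_heap_mb is hypothetical.

```python
def max_child_heap_mb(node_mb, map_slots, reduce_slots,
                      tt_mb=1024, dn_mb=1024, other_mb=0, usable_pct=80):
    """Largest per-task heap (-Xmx, in MB) that fits the node's memory budget.

    Assumption: only usable_pct percent of physical memory is handed to
    Hadoop daemons and tasks; the remainder is left for the OS.
    A JVM also uses some memory beyond its heap, so leave extra headroom.
    """
    budget_mb = node_mb * usable_pct // 100
    for_tasks_mb = budget_mb - tt_mb - dn_mb - other_mb
    if for_tasks_mb <= 0:
        raise ValueError("daemons alone exceed the memory budget")
    return for_tasks_mb // (map_slots + reduce_slots)


# Example: a 16 GB node with 6 map slots and 4 reduce slots.
print(max_child_heap_mb(16384, 6, 4))
```

If the computed value is below the heap you planned to give each task, reduce the slot count or the heap rather than letting the node swap.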
Tune Hadoop
Increase buffers for sorting and shuffling
io.sort.mb & fs.inmemory.size.mb
Set to 60-70% of Java heap size
Set it large enough to avoid disk spills
Compress intermediate data
mapred.compress.map.output
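
The 60-70% sort buffer guideline can be worked out from the task heap size. This is a small sketch under that rule of thumb only; the function name sort_buffer_range_mb is hypothetical, and whether a given value actually avoids spills depends on the size of each map task's output.

```python
def sort_buffer_range_mb(child_heap_mb, low_pct=60, high_pct=70):
    """Return the (low, high) io.sort.mb range for a given task heap in MB."""
    low = child_heap_mb * low_pct // 100
    high = child_heap_mb * high_pct // 100
    return low, high


# Example: the 512 MB task heap (-Xmx512M) used earlier in the deck.
low, high = sort_buffer_range_mb(512)
print(f"io.sort.mb between {low} and {high}")
print("mapred.compress.map.output=true")
```

With a 512 MB heap the range is roughly 307 to 358 MB. Check the spilled records counter of a job after changing the value to confirm the buffer is large enough.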

Tune hadoop