Leveraging your hadoop cluster better - running performant code at scale



Somebody once said that Hadoop is a way of running highly unperformant code at scale. In this talk I want to show how we can change that and make MapReduce jobs more performant. I will show how to analyze them at scale and optimize the job itself, instead of just tinkering with Hadoop options. The result is a much better utilized cluster and jobs that run in a fraction of the original time: running performant code at scale! Most of the time when speaking about Hadoop, people only consider scale; on closer inspection, however, it very often runs highly unperformant jobs. By looking at the performance characteristics of the jobs themselves and optimizing and tuning those, far better results can be achieved. Examples include small changes that cut jobs down from 15 hours to 2 hours without adding any more hardware. The concepts and techniques explained in the talk apply regardless of which tool is used to identify the performance characteristics. What is important is that by applying the performance analysis and optimization techniques that we have used on other applications for a long time, we can make Hadoop jobs much more effective and performant! Attendees will be able to understand those techniques and apply them to their MapReduce, Pig, Hive or other MapReduce-based jobs.

Published in: Technology, Business

  • Why do I do this talk? Well, this is it.
  • In other words, from a cluster perspective efficiency means using every resource available and not being idle.
  • I could simply add more map and reduce slots and try to pound the cluster. But that might not be good for all jobs, and furthermore at some point I will run into load average issues, meaning too much scheduling, which becomes counterproductive.
  • We want to figure out which jobs are running and which of them occupy most of my cluster but at the same time don't consume the resources. E.g. we can compare the elapsed time with the CPU time used by a job.
  • We can do the same on a per-user or per-pool basis. Using these two methods we quickly figure out which job or user occupies the cluster but is not running optimally. We will then look at those more closely.
  • What do hotspot analysis tools do? Well, if you are a developer you know what a profiler does: it tells you where you spend most of your time and CPU. The problem is that profilers cannot be run distributed and they have a horrible impact on performance; they also distort hotspots if the hotspot is really a fast method called billions of times. In other words, profilers are not useful for Hadoop. Then there are CPU samplers. They are better for Hadoop and have less impact, but again running them distributed is hard. Samplers also miss context, in the sense that they look at thread stack traces without knowing what is going on around them. And then there are modern APM solutions, which provide the best of both worlds and then some. These solutions can deliver the value of a profiler and a sampler without the overhead, can be distributed, and provide context.
  • You can use these to look at the high-level hotspots of a job. E.g. this was a job that ran for 6 hours total across 10 servers in EC2. Now this does not show me every little detail, and I don't care about that. But it shows me the big hotspots, and for those it gives me detail. E.g. that blue block accounts for 9 hours out of 65 hours of accounted time.
  • I can also go the other way around, by the way. Let's say I see that my cluster is spending a lot of time waiting. I can easily figure out which jobs are running, of course, but better still, I can simply run a hotspot analysis to check what my task JVMs are doing, and then have the APM solution tell me which job and user this is.
  • Add Number of Tasks per Job, Job Percentage Tracking.
  • The map phase and the reduce phase take the same time. I looked at the slots: the reducer is not using the full cluster, but it also can't, because reducing cannot scale as much as mapping. We also see that the reduce phase drops off at the end, for the last hour or so. So while mapping consumes a lot more time, reducing is a bottleneck, and every optimization there will count twice! Let's keep that in mind.
  • From 58h of Mapping Time to 48 hours
  • One was the already mentioned regex. Another was that we initialized a SimpleDateFormat for every observation, i.e. for every map call. Now that was a big issue, because not only was it creating the object each time, it was also getting the locale, reading the resource bundle, calculating the current date and much, much more. Why did the developer do it? Because SimpleDateFormat is not thread safe, so you cannot simply make it static. Anyway, this single thing amounted to about half of our CPU usage! A third thing was that we are parsing data, among other things numbers. An empty string is not a number and thus leads to a NumberFormatException, which we handled. However, the simple fact that millions of these exceptions were thrown and caught amounted to 10% of our CPU time. We fixed these 3 simple issues, and our reduce phase was 6 times faster. To put it in perspective, it went from 3 hours to 30 minutes, on top of the map phase!
  • The files we were working on each comprised 5 minutes of data, i.e. ~500MB uncompressed and 50MB compressed. Our average map time was only about 3-5 minutes. While that is not horrible, it still means that we have considerable startup overhead. Map time came down from 2:35 to 2:30, which isn't much, but the actual job time did not change at all and remained at a little over three hours.
  • First of all, we see that both before and after we are fully CPU bound. It's not easy to see here, but utilization actually improved: we were at 95-97 for the mapping phase before and are now at 98-99. Really awesome.
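The three hotspots described in the notes above (the regex, the per-record SimpleDateFormat, and the exceptions thrown for empty strings) can be illustrated with a minimal Java sketch. This is not the code from the talk: the class name `ObservationParser`, the `|` separator, the two-field record layout and the date pattern are all assumptions made for illustration, and the talk does not say what exactly was wrong with the regex, so precompiling it is only one plausible reading.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Locale;
import java.util.TimeZone;
import java.util.regex.Pattern;

// Hypothetical record parser; names, separator and date pattern are assumptions.
class ObservationParser {
    // Compiled once instead of once per record (assumed cause of the regex hotspot).
    private static final Pattern FIELD_SEPARATOR = Pattern.compile("\\|");

    // SimpleDateFormat is not thread safe, so it is not shared through a static field.
    // Instead, each parser instance creates it once and reuses it for every record.
    private final SimpleDateFormat timestampFormat;

    ObservationParser() {
        timestampFormat = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss", Locale.ROOT);
        timestampFormat.setTimeZone(TimeZone.getTimeZone("UTC"));
    }

    // Returns the default for empty input without throwing, so the common
    // "empty field" case no longer pays for a thrown and caught exception.
    static long parseCountOrDefault(String text, long defaultValue) {
        if (text == null || text.isEmpty()) {
            return defaultValue;
        }
        try {
            return Long.parseLong(text);
        } catch (NumberFormatException e) {
            return defaultValue; // still handled, but now only for genuinely bad data
        }
    }

    // Parses "timestamp|count" into {epochMillis, count}.
    long[] parse(String line) {
        String[] fields = FIELD_SEPARATOR.split(line, -1);
        long count = fields.length > 1 ? parseCountOrDefault(fields[1], 0L) : 0L;
        try {
            long epochMillis = timestampFormat.parse(fields[0]).getTime();
            return new long[] { epochMillis, count };
        } catch (ParseException e) {
            throw new IllegalArgumentException("Bad timestamp: " + fields[0], e);
        }
    }
}
```

In a MapReduce job, one such parser would be created once per task (for example in the mapper's setup method) rather than inside every map call, which keeps the formatter out of shared static state while avoiding the per-record construction cost.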

    1. Leveraging your Hadoop cluster better - running efficient code at scale • Michael Kopp, Technology Strategist
    2. Why do I do this talk?
    3. Effectiveness vs. Efficiency • Effective: Adequate to accomplish a purpose; producing the intended or expected result (1) • Efficient: Performing or functioning in the best possible manner with the least waste of time and effort (1) …and resources • (1) http://www.dailyblogtips.com/effective-vs-efficient-difference/
    4. An Efficient Hadoop Cluster • Is Effective → Gets the job done (in time) • Highly Utilized when Active (unused resources are wasted resources)
    5. What is an efficient Hadoop Job? • …efficiency is a measurable concept, quantitatively determined by the ratio of output to input… • same output in less time • less resource usage with same output and same time • more output with same resources in the same time • Efficient jobs are effective without adding more hardware!
    6. Efficiency – Using everything we have
    7. Utilization and Distribution • CPU spikes but no real overall usage • Not fully utilized
    8. Reasons why your Cluster is not utilized • Map and Reduce Slots • Data Distribution • Bottlenecks – Spill – Shuffle – Reduce – Code
    9. Which Job(s) are dominating the cluster?
    10. Which User? Which Pool?
    11. Pushing the Boundaries – High Utilization • Figure out Spill and Shuffle Bottlenecks • Remove Idle Times, Wait Times, Sync Times • Hotspot Analysis Tools can pinpoint those Items quickly
    12. Identify the Jobs
    13. Job Bottlenecks – Mapping Phase • Mapper is waiting for Spill Thread • io.sort.spill.percent • io.sort.mb • Wait Time?
    14. Job Bottleneck – Shuffle • Reducer is waiting for memory • mapred.job.shuffle.input.buffer.percent • mapred.job.reduce.total.mem.bytes • mapred.reduce.parallel.copies • Wait Time?
    15. Cluster after simple “Fixes”
    16. Jobs are now resource bound
    17. Efficiency – Use what we have better
    18. Performance Optimization • 1. Identify Bounding Resource • 2. Optimize and reduce its usage • 3. Identify new Bounding Resource • Hot Spot Analysis Tools are again the best way to go
    19. Identify Hotspots – which Phase
    20. Cluster Usage
    21. Mapping Phase Hotspot in Outage Analyzer • 70% our own code!
    22. CPU Hotspot in Mapping Phase • 10% of Mapping CPU • 20% of Mapping CPU
    23. Hotspot Analysis of Reduce Phase • Wow!
    24. Three simple Hotspots
    25. Before Fix: 6h 30 minutes…
    26. …After Fix: 3.5 hours • Utilization went up!
    27. Map Reduce Run Comparison • 10% of Mapping CPU • Reducers Running • 3 Reducers running
    28. Conclusion • Understanding your bottleneck! • Understand bounding resource • Small fixes can have huge yields …but requires tools
    29. What else did we find? • Short Mappers due to small files – High merge time due to large number of spills – Too much data shuffle → add Combiner but… • Tried Task reuse – Nearly no effect? – 5% less Map Time, but…?
    30. Why did the reuse not help? • Map Phase over • 5 more reducers • shuffle
    31. What’s next? • Bigger Files • Add Combiners to reduce shuffle
    32. What about Hive or PIG? • Identify which stage is slow • Identify configuration Issues • Identify HBase or UDF issues
    33. HBase PIG Job lasting for 15 hours…
    34. HBase major Hotspot… • Wow! • Roundtrip for every single Row
    35. Cluster Utilization after fix
    36. Performance after Fix: 75 minutes!
    37. Summary • Drive up utilization • Remove Blocks and Sync points • Optimize Big Hotspots
    38. Michael Kopp, Technology Strategist • michael.kopp@compuware.com • @mikopp • apmblog.compuware.com • javabook.compuware.com
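
The idea behind the "Which Job(s) are dominating the cluster?" and "Which User? Which Pool?" slides, comparing the time a job occupies the cluster with the CPU time it actually uses, can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the class `JobUsage`, its fields and the notion of "slot hours" as the measure of occupied time are hypothetical, and the talk does not say how its tooling computes these figures.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical per-job (or per-user, per-pool) usage record.
class JobUsage {
    final String name;
    final double slotHours; // wall-clock hours during which task slots were occupied
    final double cpuHours;  // CPU hours actually consumed by the tasks

    JobUsage(String name, double slotHours, double cpuHours) {
        this.name = name;
        this.slotHours = slotHours;
        this.cpuHours = cpuHours;
    }

    // Fraction of the occupied time that was spent on the CPU.
    double cpuEfficiency() {
        return slotHours == 0 ? 0 : cpuHours / slotHours;
    }

    // Occupied time that was not spent on the CPU (waiting, idle, blocked on I/O).
    double idleSlotHours() {
        return Math.max(0, slotHours - cpuHours);
    }

    // Jobs that hold the cluster longest without using the CPU come first.
    static List<JobUsage> rankByIdleTime(List<JobUsage> jobs) {
        List<JobUsage> sorted = new ArrayList<>(jobs);
        sorted.sort(Comparator.comparingDouble(JobUsage::idleSlotHours).reversed());
        return sorted;
    }
}
```

Ranking by absolute idle time rather than by the ratio alone keeps small, inefficient jobs from hiding the large ones that dominate the cluster; the same record could be aggregated per user or per pool.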