
Hadoop Performance at LinkedIn



  1. Grid Operations: Hadoop Performance at LinkedIn. Allen Wittenauer, Grid Computing Architect. ©2012 LinkedIn Corporation. All Rights Reserved.
  3. “I have never seen a Hadoop cluster that was legitimately CPU bound.” -- Milind Bhandarkar
  4. X5650 - 6 Core @ 2.67 GHz
  5. X5650 - 6 Core @ 2.67 GHz
  6. “I have only seen one Hadoop cluster that was legitimately CPU bound.” -- Milind Bhandarkar
  7. Why do we have such high CPU usage?
  8. We do a lot of Graph Theory.
  9. Ticket to Ride (Ticket to Ride is a registered trademark of Days of Wonder.)
  10. Social Graph
  11. 2nd Degree Connection
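To make the "2nd Degree Connection" idea concrete, here is a minimal sketch of how second-degree connections can be derived from an adjacency map. The member names, the `second_degree` helper, and the tiny graph are all hypothetical illustrations; the slides do not describe how LinkedIn actually computes this.

```python
from collections import defaultdict


def second_degree(graph, member):
    """Return members reachable in exactly two hops from `member`.

    `graph` maps each member to the set of their direct connections.
    Direct connections and the member itself are excluded.
    """
    first = graph.get(member, set())
    second = set()
    for friend in first:
        # Everyone my direct connections know is a candidate.
        second |= graph.get(friend, set())
    return second - first - {member}


# Hypothetical undirected connection list, for illustration only.
edges = [
    ("ann", "bob"),
    ("bob", "cat"),
    ("cat", "dan"),
    ("ann", "eve"),
    ("eve", "cat"),
]

graph = defaultdict(set)
for a, b in edges:
    graph[a].add(b)
    graph[b].add(a)

print(sorted(second_degree(graph, "ann")))
print(sorted(second_degree(graph, "bob")))
```

Each lookup touches every neighbor's neighbor set, which hints at why this kind of work keeps CPUs busy once the graph is large.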
  12. We under-commit our memory.
  13. Our Hadoop Software Needs... The Plan...
      • Tasks
        – 2 GB of RAM = 1 GB of JVM heap, 0.5-1 GB for non-heap
        – (Typically) 1 super-active thread
      • TaskTracker
        – 1.5 GB of RAM = 1 GB of JVM heap, 0.5 GB for non-heap
        – 1-4 super-active threads
      • DataNode
        – 1.5 GB of RAM = 1 GB of JVM heap, 0.5 GB for non-heap
        – 1-4 super-active threads
      • RAM: 3 GB + (task count * 2 GB) + OS needs
      • Threads: 8 + (task count) + OS needs
  14. Our Hadoop Software Needs... The Reality
      • Task Counts
        – Westmere (5650): 6 Cores+HT = 12 Tasks
        – Sandy Bridge (2640): 6 Cores+HT = 14 Tasks
      • Most of our tasks leave at most 0.5 GB free
        – Combined across tasks -> a very large buffer & cache
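The RAM and thread formulas from the plan, applied to the two task counts listed above, can be sketched as follows. The function name `node_budget` is made up for this example, and the "OS needs" terms default to zero only because the slides give no figure for them; substitute real values for an actual sizing exercise.

```python
def node_budget(task_count, os_ram_gb=0.0, os_threads=0):
    """Apply the slide formulas for one worker node.

    RAM:     3 GB + (task count * 2 GB) + OS needs
    Threads: 8 + (task count) + OS needs
    """
    daemon_ram_gb = 1.5 + 1.5   # TaskTracker + DataNode
    daemon_threads = 4 + 4      # up to 4 super-active threads each
    ram_gb = daemon_ram_gb + task_count * 2.0 + os_ram_gb
    threads = daemon_threads + task_count + os_threads
    return ram_gb, threads


# Task counts from the slides; OS needs left at the placeholder default.
for cpu, tasks in [("Westmere (5650)", 12), ("Sandy Bridge (2640)", 14)]:
    ram_gb, threads = node_budget(tasks)
    print(f"{cpu}: {tasks} tasks -> {ram_gb:.0f} GB RAM + OS, "
          f"{threads} threads + OS")
```

Before OS needs, this works out to 27 GB and 20 threads for 12 tasks, and 31 GB and 22 threads for 14 tasks.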
  15. We don’t have as many disks per node.
  16. Typical Hadoop Node Out in the Wild
      • Most users don’t know their actual needs
        – Vendor advice... play it safe!
      • Significantly more memory
        – “For the future!”
        – Badly written code
      • Significantly more disk
        – “Hadoop is IO intensive!”
        – “Greater task locality!”
      • Greater performance... but is it worth the cost?
  17. What Happens With Fewer Disks?
      • Physical footprint requirements are smaller
      • Linux buffers & caches are more efficient
        – More per disk
        – Fewer to manage
      • Spindle count DOES matter... but the price/perf isn’t there for our workflows.
        – From a few years ago & based on store.sun.com prices (so not “real”)...

      Nodes/Cores  RAM/Bus  Disks  Time in Minutes  HW Cost*
      3/24         16/half  8      254.98           $37827
      3/24         24/full  8      244.50           $38817
      3/24         16/half  4      257.38           $21456
      3/24         24/full  4      246.82           $22986
      6/48         16/half  4      126.98           $42912
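The price/performance argument in the table above can be checked with a little arithmetic. This sketch takes the 3-node, 16/half, 4-disk row as the baseline and reports speedup and cost ratio for every other row. The choice of baseline and the names `ROWS` and `compare` are assumptions for illustration; the numbers are copied from the slide.

```python
# (nodes/cores, RAM/bus, disks, minutes, hardware cost in dollars)
ROWS = [
    ("3/24", "16/half", 8, 254.98, 37827),
    ("3/24", "24/full", 8, 244.50, 38817),
    ("3/24", "16/half", 4, 257.38, 21456),
    ("3/24", "24/full", 4, 246.82, 22986),
    ("6/48", "16/half", 4, 126.98, 42912),
]


def compare(baseline, other):
    """Return (speedup, cost_ratio) of `other` relative to `baseline`."""
    speedup = baseline[3] / other[3]      # >1 means `other` is faster
    cost_ratio = other[4] / baseline[4]   # >1 means `other` costs more
    return speedup, cost_ratio


baseline = ROWS[2]  # 3 nodes, 16/half, 4 disks
for row in ROWS:
    speedup, cost_ratio = compare(baseline, row)
    print(f"{row[0]:>5} {row[1]:>8} {row[2]} disks: "
          f"{speedup:.3f}x speed for {cost_ratio:.3f}x cost")
```

Against that baseline, doubling the disks buys roughly 1% more speed for roughly 76% more money, while doubling the nodes buys roughly 2x the speed for exactly 2x the money.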
  18. LinkedIn Node Configuration
      • No RAID controller
        – More cost for negative perf when doing JBOD
      • 6 Drives
        – Still fits in 1U w/SATA drives
        – ~same perf as 8 drives
      • Less metal = lower cost
  19. Rack Level View
      • If we assume we can use 40U in a rack, then:
        – More CPUs
        – Just as many HDs
        – More Network
        – Potentially more RAM
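The rack-level claims can be illustrated with a rough count. The slides only state the 1U, 6-drive node and the 40U budget; the comparison node here (2U, 12 drives), the two sockets per node, and the one network port per node are all hypothetical values chosen for illustration, not figures from the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeType:
    name: str
    rack_units: int   # height of one node in U
    drives: int       # data drives per node
    sockets: int      # CPU sockets per node (assumed)
    nics: int         # network ports per node (assumed)


def rack_totals(node, usable_units=40):
    """Total up what one rack holds when filled with a single node type."""
    count = usable_units // node.rack_units
    return {
        "nodes": count,
        "drives": count * node.drives,
        "sockets": count * node.sockets,
        "nics": count * node.nics,
    }


# 1U / 6 drives is from the slides; everything else is an assumption.
slim = NodeType("1U, 6 drives", rack_units=1, drives=6, sockets=2, nics=1)
dense = NodeType("2U, 12 drives (hypothetical)", rack_units=2, drives=12,
                 sockets=2, nics=1)

for node in (slim, dense):
    print(node.name, rack_totals(node))
```

Under these assumptions both racks hold the same number of drives, while the 1U rack has twice the sockets and twice the network ports.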
  20. We care about file system tuning.
  21. LinkedIn Hadoop Disk/File Systems
      • noatime enabled
      • writeback enabled
      • Each disk (except root) partitions:
        – Swap
        – MapReduce spill space
        – HDFS
      • Delayed commits
        – Why write once when you can do ganged writes more efficiently?
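One way to audit these settings on a node is to scan the mount table for the expected options. The sketch below assumes an ext3/ext4 file system, where "writeback" would correspond to `data=writeback` and "delayed commits" to a `commit=<seconds>` option; the slides name neither the file system nor the exact options, so treat that mapping as a guess. The `/grid/` mount prefix, the device names, and the sample text are hypothetical.

```python
REQUIRED = ("noatime", "data=writeback")  # assumed ext3/ext4 option names


def parse_mounts(text):
    """Parse /proc/mounts-style text into {mount_point: set_of_options}."""
    mounts = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        _device, mount_point, _fs_type, options = fields[:4]
        mounts[mount_point] = set(options.split(","))
    return mounts


def audit(mounts, prefix):
    """Return {mount_point: missing_options} for mounts under `prefix`."""
    report = {}
    for mount_point, options in mounts.items():
        if not mount_point.startswith(prefix):
            continue
        missing = [opt for opt in REQUIRED if opt not in options]
        if not any(opt.startswith("commit=") for opt in options):
            missing.append("commit=<seconds>")
        report[mount_point] = missing
    return report


# Hypothetical sample; on a real node, read the text from /proc/mounts.
SAMPLE = """\
/dev/sda1 / ext4 rw,relatime 0 0
/dev/sdb3 /grid/1 ext4 rw,noatime,data=writeback,commit=30 0 0
/dev/sdc3 /grid/2 ext4 rw,relatime,data=ordered 0 0
"""

for mount_point, missing in audit(parse_mounts(SAMPLE), "/grid/").items():
    status = "ok" if not missing else "missing " + ", ".join(missing)
    print(f"{mount_point}: {status}")
```

The root file system is skipped by the prefix filter, matching the "except root" note on the slide.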
  22. We care about job tuning.
  23. LinkedIn Job Tuning Guidelines
      • All jobs get reviewed prior to going to production.
      • Task times should be between 5 and 15 minutes.
      • Jobs should have fewer than 10,000 tasks.
      • Jobs should be smart about the number and size of the files they generate.
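The two numeric guidelines above lend themselves to an automated pre-review check. This is a minimal sketch under the assumption that per-task run times in minutes are already available as a list; the `review_job` function and its message wording are invented for illustration and are not LinkedIn's review tooling.

```python
def review_job(task_minutes, min_minutes=5.0, max_minutes=15.0,
               max_tasks=10_000):
    """Return a list of guideline violations for one job.

    `task_minutes` is a list with one run time (in minutes) per task.
    An empty result means the job meets both numeric guidelines.
    """
    findings = []

    # Guideline: jobs should have fewer than 10,000 tasks.
    if len(task_minutes) >= max_tasks:
        findings.append(
            f"{len(task_minutes)} tasks; should be fewer than {max_tasks}")

    # Guideline: task times should be between 5 and 15 minutes.
    too_short = sum(1 for m in task_minutes if m < min_minutes)
    too_long = sum(1 for m in task_minutes if m > max_minutes)
    if too_short:
        findings.append(f"{too_short} tasks ran under {min_minutes} minutes")
    if too_long:
        findings.append(f"{too_long} tasks ran over {max_minutes} minutes")

    return findings


# Hypothetical run times for a small job.
for finding in review_job([2.0, 6.5, 12.0, 31.0]):
    print(finding)
```

Many short tasks usually suggest the input is split too finely, and very long tasks suggest the opposite, so the counts give a reviewer a starting point rather than a verdict.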
  24. ... and the result?
  25. Why is LinkedIn Running so Hot?
      • We do a lot of non-MapReduce work.
      • RAM buffers and caches allow us to offset a lot of disk IO.
      • We audit our jobs.
      • As a result, our CPUs are actually busy.
