Hadoop-2 @ eBay

Hadoop-2 @ eBay Presentation Transcript

  • 1. Hadoop-2 @ebay Mayank Bansal ebay
  • 2. Hadoop – 2 @ ebay Mayank Bansal
  • 3. Agenda • Who we are? • Background of Hadoop and Hadoop at ebay • What are the challenges • What we achieved using Hadoop-2
  • 4. Who I am • Principal Engineer @ ebay • Apache Hadoop Committer • Apache Oozie PMC and Committer • Current • Leading Hadoop Core Development for YARN and MapReduce @ ebay • Past • Working on Scheduler / Resource Managers • Working on Distributed Systems • Data Pipeline frameworks Mayank Bansal
  • 5. Who we are • ebay Hadoop Team • We are around 40 people developing and supporting Hadoop • Thousands of Hadoop Users @ ebay
  • 6. Agenda • Who we are? • Background of Hadoop and Hadoop at ebay • What are the challenges • What we achieved using Hadoop-2
  • 7. Hadoop Evolution @ ebay • 2007: 1-10 nodes • 2009: 50+ nodes • 2010: 100+ nodes, 1000s+ cores, 1 PB • 2011: 1000+ nodes, 10,000+ cores, 10+ PB • 2012: 3000+ nodes, 30,000+ cores, 50+ PB • 2013/2014: 10,000 nodes, 150,000+ cores, 150 PB
  • 8. Hadoop - 1 Architecture
  • 9. Hadoop-1 Limitations • Scalability • Maximum cluster size 4-5K nodes • Maximum concurrent tasks ~40K • Job Tracker scalability • Availability • Failure kills all the jobs • Hard partition between Map and Reduce • Lower cluster utilization • Lacks support for alternate paradigms
  • 10. Hadoop-2 • Hadoop-1: Single Use System (Batch Apps) • Hadoop-2: Multi Purpose Platform (Batch, Interactive, Streaming)
  • 11. YARN
  • 12. Agenda • Who we are? • Background of Hadoop and Hadoop at ebay • What are the challenges • What we achieved using Hadoop-2
  • 13. Application Master • Runs on Normal Node Manager machines • Out Of Memory Errors • Slow Machines • Flaky Network
  • 14. Application Master Node Goes Down • Map Reduce • Can rebuild state from Job History files • Generic Applications • Application Timeline/History Server • YARN-321 • YARN-1530
  • 15. Application Master • Slow Machines • Automation/Monitoring • Flaky Network • Split Brain problem • Fixed for Map Reduce • All the AppMasters have to fix this
  • 16. Application Master Out Of Memory • Physical Memory Errors • yarn.app.mapreduce.am.resource.mb • yarn.app.mapreduce.am.command-opts • Virtual Memory Errors • Default Ratio 2.1, needs to be tweaked • yarn.nodemanager.vmem-check-enabled • yarn.nodemanager.vmem-pmem-ratio
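The virtual-memory ratio on slide 16 can be made concrete with a little arithmetic. The sketch below assumes the NodeManager's virtual-memory ceiling for a container is its physical allocation multiplied by yarn.nodemanager.vmem-pmem-ratio. The helper name `vmem_limit_mb` and the 1536 MB container size are illustrative and do not come from the talk.

```python
def vmem_limit_mb(pmem_mb: int, vmem_pmem_ratio: float = 2.1) -> int:
    """Approximate virtual-memory ceiling (MB) for a container.

    Assumption: limit = physical allocation * yarn.nodemanager.vmem-pmem-ratio.
    """
    return int(pmem_mb * vmem_pmem_ratio)


# Hypothetical Application Master size (yarn.app.mapreduce.am.resource.mb).
am_resource_mb = 1536

# Compare the default ratio from the slide with two hypothetical larger values.
for ratio in (2.1, 3.0, 4.0):
    limit = vmem_limit_mb(am_resource_mb, ratio)
    print(f"ratio={ratio}: vmem limit ~ {limit} MB")
```

If an Application Master is killed for exceeding virtual memory, raising the ratio (or disabling the check via yarn.nodemanager.vmem-check-enabled) widens this ceiling without changing the physical allocation.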
  • 17. Binary Compatibility • Works well • mapred APIs are binary compatible • mapreduce APIs are source compatible • BUT … • Only works for 70% of applications • Why? • Reflection • Uber jars in the class path • MAPREDUCE-5108
  • 18. Binary Compatibility • LZO Compression • LZO is not compiled with Hadoop-2 • Avro • http://repo1.maven.org/maven2 • Version => 1.7.4-hadoop2
  • 19. Log Aggregation • Loads a lot of data into HDFS • 5-7 TB of data per day • Default retention is 30 days; we reduced it to 4 days • yarn.log-aggregation.retain-seconds • A lot of load on the Namenode
  • 20. User Engagement • Engage all users for verifying jobs • Test with production-like data • Verify all jobs, not just the sample jobs
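To see why the retention window matters, here is a rough calculation based on the figures on slide 19 (5-7 TB per day, 30 days versus 4 days). It is only a sketch: the helper names are hypothetical, and the storage estimate ignores HDFS replication and compression.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def retain_seconds(days: int) -> int:
    """Value for yarn.log-aggregation.retain-seconds for a retention in days."""
    return days * SECONDS_PER_DAY


def steady_state_tb(days: int, tb_per_day: float) -> float:
    """Rough volume of retained logs (assumption: no replication, no compression)."""
    return days * tb_per_day


for days in (30, 4):
    low = steady_state_tb(days, 5)
    high = steady_state_tb(days, 7)
    print(f"{days} days: retain-seconds={retain_seconds(days)}, "
          f"roughly {low:.0f}-{high:.0f} TB of aggregated logs")
```

Under these assumptions, cutting retention from 30 days to 4 days shrinks the retained log volume by the same factor of 7.5.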
  • 21. Agenda • Who we are? • Background of Hadoop and Hadoop at ebay • What are the challenges • What we achieved using Hadoop-2
  • 22. Benchmarks (YARN-938)
        Benchmark     Hadoop-1       Hadoop-2      Improvement
        Sort          500 seconds    365 seconds   ~20%
        Tera Sort     182 seconds    180 seconds   About the same
        Shuffle       993 seconds    530 seconds   ~2X
        Scalability   1020 seconds   275 seconds   ~4X
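The benchmark run times can be turned into speedup factors and percentage reductions with a few lines of arithmetic. This sketch uses only the run times from the slide. The function names are hypothetical, and the slide's improvement labels are rounded, so the recomputed values need not match them exactly.

```python
# Run times in seconds (Hadoop-1, Hadoop-2), copied from the benchmark slide.
benchmarks = {
    "Sort": (500, 365),
    "Tera Sort": (182, 180),
    "Shuffle": (993, 530),
    "Scalability": (1020, 275),
}


def speedup(hadoop1_s: float, hadoop2_s: float) -> float:
    """How many times faster the Hadoop-2 run is than the Hadoop-1 run."""
    return hadoop1_s / hadoop2_s


def time_reduction_pct(hadoop1_s: float, hadoop2_s: float) -> float:
    """Percentage of the Hadoop-1 run time saved on Hadoop-2."""
    return 100.0 * (hadoop1_s - hadoop2_s) / hadoop1_s


for name, (h1, h2) in benchmarks.items():
    print(f"{name}: {speedup(h1, h2):.2f}x faster, "
          f"{time_reduction_pct(h1, h2):.0f}% less run time")
```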
  • 23. Hadoop-2 Numbers • Charts (Hadoop-2 vs. Hadoop-1): Tasks Starting per Hour, Tasks Finishing per Hour • Annotations: ~59% more tasks, ~52% more tasks
  • 24. Hadoop-2 Numbers • Charts (Hadoop-2 vs. Hadoop-1): Apps Submitted per Hour, Apps Finishing per Hour • Annotations: ~51% more tasks, ~50% more tasks
  • 25. Hadoop-2 Numbers • Chart: Hadoop-2 Cluster Utilization over a day (0:00 to 23:55)
  • 26. Overall improvements • Overall job throughput • Increased ~2X • Overall run time of jobs • Improved ~1.5X to 2X
  • 27. Apps Beyond MapReduce • Tez • Storm • Shark and Spark • …
  • 28. Availability • Namenode HA • RM Restart • RM HA • Rolling upgrades (Coming soon)
  • 29. Conclusion • There are some pain points. • Need to plan User Testing • Worth The Effort
  • 30. Questions • Mayank Bansal • mabansal@ebay.com • mayank@apache.org