Hadoop-2 @ eBay

Transcript of "Hadoop-2 @ eBay"

  1. Hadoop-2 @ eBay
     Mayank Bansal, eBay
  2. Hadoop-2 @ eBay
     Mayank Bansal
  3. Agenda
     • Who we are?
     • Background of Hadoop and Hadoop at eBay
     • What are the challenges
     • What we achieved using Hadoop-2
  4. Who I am (Mayank Bansal)
     • Principal Engineer @ eBay
     • Apache Hadoop Committer
     • Apache Oozie PMC and Committer
     • Current
        • Leading Hadoop Core Development for YARN and MapReduce @ eBay
     • Past
        • Working on Scheduler / Resource Managers
        • Working on Distributed Systems
        • Data Pipeline frameworks
  5. Who we are
     • eBay Hadoop Team
     • We are around 40 people developing and supporting Hadoop
     • Thousands of Hadoop users @ eBay
  6. Agenda
     • Who we are?
     • Background of Hadoop and Hadoop at eBay
     • What are the challenges
     • What we achieved using Hadoop-2
  7. Hadoop Evolution @ eBay
      Year        Nodes         Cores       Storage
      2007        1-10
      2009        50+
      2010        100+          1000s+      1 PB
      2011        1000+         10,000+     10+ PB
      2012        3000+         30,000+     50+ PB
      2013/2014   10,000        150,000+    150 PB
  8. Hadoop-1 Architecture
  9. Hadoop-1 Limitations
     • Scalability
        • Maximum cluster size: 4-5K nodes
        • Maximum concurrent tasks: ~40K
        • JobTracker scalability
     • Availability
        • Failure kills all the jobs
     • Hard partition between Map and Reduce slots
        • Lower cluster utilization
     • Lacks support for alternate paradigms
  10. Hadoop-2
     • Single Use System: Batch Apps
     • Multi Purpose Platform: Batch, Interactive, Streaming
  11. YARN
  12. Agenda
     • Who we are?
     • Background of Hadoop and Hadoop at eBay
     • What are the challenges
     • What we achieved using Hadoop-2
  13. Application Master
     • Runs on normal NodeManager machines
     • Out of memory errors
     • Slow machines
     • Flaky network
  14. Application Master: Node Goes Down
     • MapReduce
        • Can rebuild state from Job History files
     • Generic applications
        • Application Timeline/History Server
        • YARN-321
        • YARN-1530
  15. Application Master
     • Slow machines
        • Automation/Monitoring
     • Flaky network
        • Split-brain problem
        • Fixed for MapReduce
        • All the AppMasters have to fix this
  16. Application Master: Out Of Memory
     • Physical memory errors
        • yarn.app.mapreduce.am.resource.mb
        • yarn.app.mapreduce.am.command-opts
     • Virtual memory errors
        • Default ratio is 2.1; needs to be tweaked
        • yarn.nodemanager.vmem-check-enabled
        • yarn.nodemanager.vmem-pmem-ratio
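The memory settings on slide 16 interact: the container size caps physical memory, the ratio multiplies it into a virtual memory ceiling, and the JVM heap passed through the command options has to fit inside the container. The sketch below works through that arithmetic. The container size (2048 MB), the `-Xmx1536m` option, and the 80% heap headroom are illustrative assumptions, not values from the talk; only the 2.1 ratio comes from the slide.

```python
import re


def vmem_limit_mb(container_mb: float, vmem_pmem_ratio: float = 2.1) -> float:
    """Virtual memory ceiling for a container: physical size times the ratio."""
    return container_mb * vmem_pmem_ratio


def heap_mb_from_opts(command_opts: str) -> int:
    """Extract the -Xmx heap size, in MB, from a JVM options string."""
    match = re.search(r"-Xmx(\d+)([mMgG])", command_opts)
    if match is None:
        raise ValueError("no -Xmx setting found in command opts")
    size, unit = int(match.group(1)), match.group(2).lower()
    return size * 1024 if unit == "g" else size


def heap_fits(container_mb: float, command_opts: str, headroom: float = 0.8) -> bool:
    """Check that the heap leaves room for non-heap JVM memory.

    The 0.8 headroom factor is an assumed rule of thumb, not a YARN setting.
    """
    return heap_mb_from_opts(command_opts) <= container_mb * headroom


# Hypothetical values standing in for yarn.app.mapreduce.am.resource.mb
# and yarn.app.mapreduce.am.command-opts.
am_container_mb = 2048
am_command_opts = "-Xmx1536m"

print("vmem ceiling (MB):", vmem_limit_mb(am_container_mb))
print("heap fits in container:", heap_fits(am_container_mb, am_command_opts))
```

A heap that is too close to the container size tends to surface as a physical memory error, while a low ratio tends to surface as a virtual memory error, which is why the slide lists the two groups of properties separately.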
  17. Binary Compatibility
     • Works well
        • mapred APIs are binary compatible
        • mapreduce APIs are source compatible
     • BUT …
        • Only works for ~70% of applications
        • Why?
           • Reflection
           • Uber jars in the classpath
           • MAPREDUCE-5108
  18. Binary Compatibility
     • LZO compression
        • LZO is not compiled with Hadoop-2
     • Avro
        • http://repo1.maven.org/maven2
        • Version => 1.7.4-hadoop2
  19. Log Aggregation
     • Loads a lot of data into HDFS
     • 5-7 TB of data per day
     • Default retention is 30 days; we reduced it to 4 days
        • yarn.log-aggregation.retain-seconds
     • A lot of load on the NameNode
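The retention change on slide 19 is easier to reason about with the numbers written out. This sketch converts the retention period into the seconds value that `yarn.log-aggregation.retain-seconds` expects and estimates how much aggregated log data sits in HDFS at steady state. It assumes the 5-7 TB daily volume stays constant and ignores HDFS replication; the function names are illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def retain_seconds(days: int) -> int:
    """Retention period in seconds for a given number of days."""
    return days * SECONDS_PER_DAY


def steady_state_tb(daily_low_tb: float, daily_high_tb: float, days: int) -> tuple:
    """Range of retained log volume once the retention window is full.

    Assumes constant daily volume and counts logical bytes only
    (no replication factor applied).
    """
    return daily_low_tb * days, daily_high_tb * days


for days in (30, 4):
    low, high = steady_state_tb(5, 7, days)
    print(f"{days} days -> retain-seconds={retain_seconds(days)}, "
          f"about {low}-{high} TB retained")
```

Under these assumptions, moving from 30 days to 4 days cuts the retained volume from roughly 150-210 TB to roughly 20-28 TB, with a matching drop in the number of log files the NameNode has to track.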
  20. User Engagement
     • Engage all users in verifying jobs
     • Test with production-like data
     • Verify all jobs, not just the sample jobs
  21. Agenda
     • Who we are?
     • Background of Hadoop and Hadoop at eBay
     • What are the challenges
     • What we achieved using Hadoop-2
  22. Benchmarks
      Benchmark     Hadoop-1       Hadoop-2      Improvement
      Sort          500 seconds    365 seconds   ~20%
      Tera Sort     182 seconds    180 seconds   About the same
      Shuffle       993 seconds    530 seconds   ~2X
      Scalability   1020 seconds   275 seconds   ~4X
     • YARN-938
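The Improvement column on slide 22 mixes percentages and multipliers, so it can help to recompute both from the raw timings. The sketch below does that with the timings from the table; the two formulas (speedup as old time over new time, reduction as the share of time saved) are one reasonable reading, since the talk does not say how its figures were derived. Computed this way, Sort comes out at about 27% less time, somewhat above the ~20% the slide reports, while Shuffle and Scalability land close to the stated ~2X and ~4X.

```python
# (Hadoop-1 seconds, Hadoop-2 seconds), copied from the benchmark table.
benchmarks = {
    "Sort": (500, 365),
    "Tera Sort": (182, 180),
    "Shuffle": (993, 530),
    "Scalability": (1020, 275),
}


def speedup(hadoop1_s: float, hadoop2_s: float) -> float:
    """How many times faster the Hadoop-2 run is (old time / new time)."""
    return hadoop1_s / hadoop2_s


def time_reduction_pct(hadoop1_s: float, hadoop2_s: float) -> float:
    """Percentage of run time saved relative to the Hadoop-1 run."""
    return 100 * (hadoop1_s - hadoop2_s) / hadoop1_s


for name, (h1, h2) in benchmarks.items():
    print(f"{name:<12} speedup {speedup(h1, h2):.2f}x, "
          f"time reduced by {time_reduction_pct(h1, h2):.1f}%")
```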
  23. Hadoop-2 Numbers
     [Chart: Tasks Starting per Hour, Hadoop-2 vs Hadoop-1; y-axis 0 to 700,000]
     [Chart: Tasks Finishing per Hour, Hadoop-2 vs Hadoop-1; y-axis 0 to 700,000]
     • Chart annotations: ~59% more tasks, ~52% more tasks
  24. Hadoop-2 Numbers
     [Chart: Apps Submitted per Hour, Hadoop-2 vs Hadoop-1; y-axis 0 to 600]
     [Chart: Apps Finishing per Hour, Hadoop-2 vs Hadoop-1; y-axis 0 to 600]
     • Chart annotations: ~51% more tasks, ~50% more tasks
  25. Hadoop-2 Numbers
     [Chart: Hadoop-2 Cluster Utilization; y-axis: utilization, 0 to 1.2; x-axis: time of day, 0:00 to 23:55]
  26. Overall improvements
     • Overall job throughput
        • Increased ~2X
     • Overall run time of jobs
        • Improved ~1.5X to 2X
  27. Apps Beyond MapReduce
     • Tez
     • Storm
     • Shark and Spark
     • …
  28. Availability
     • NameNode HA
     • RM Restart
     • RM HA
     • Rolling upgrades (coming soon)
  29. Conclusion
     • There are some pain points
     • Need to plan user testing
     • Worth the effort
  30. Questions
     Mayank Bansal
     mabansal@ebay.com
     mayank@apache.org