September SF Hadoop User Group 2013

Post-beta YARN talk given at the September 2013 SF Hadoop User Group Meetup. Meetup link: http://www.meetup.com/hadoopsf/events/136499862/

Published in: Technology
0 Comments
3 Likes
Statistics
Notes
  • Be the first to comment

No Downloads
Views
Total views
1,893
On SlideShare
0
From Embeds
0
Number of Embeds
9
Actions
Shares
0
Downloads
124
Comments
0
Likes
3
Embeds 0
No embeds

No notes for slide

September SF Hadoop User Group 2013

  1. Hadoop YARN
     SF Hadoop Users Meetup
     Vinod Kumar Vavilapalli
     vinodkv [at] { apache dot org | hortonworks dot com }
     @tshooter
  2. Myself
     • 6.25 Hadoop-years old
     • Previously at Yahoo!, now @Hortonworks
     • Last thing at college: a two-node Tomcat cluster. Three months later, first thing on the job, brought down an 800-node cluster ;)
     • Hadoop YARN lead. Apache Hadoop PMC, Apache Member
     • MapReduce, HadoopOnDemand, CapacityScheduler, Hadoop security
     • Ambari / Stinger / random troubleshooting
  3. YARN: A new abstraction layer
     HADOOP 1.0, a single-use system for batch apps:
     • HDFS (redundant, reliable storage)
     • MapReduce (cluster resource management & data processing)
     HADOOP 2.0, a multi-purpose platform for batch, interactive, online, streaming, … workloads:
     • HDFS2 (redundant, reliable storage)
     • YARN (cluster resource management)
     • MapReduce and others (data processing)
  4. Concepts
     (Layer diagram)
     • Platform: HDFS, YARN
     • Applications & Frameworks: MRv2, Tez
     • Jobs: Job #1, Job #2
  5. Concepts
     • Platform
     • Framework
     • Application
     – An application is a job submitted to the framework
     – Example: a MapReduce job
     • Container
     – Basic unit of allocation
     – Fine-grained resource allocation across multiple resource types (memory, CPU, disk, network, GPU, etc.)
     – container_0 = 2 GB, 1 CPU
     – container_1 = 1 GB, 6 CPUs
     (A small API sketch of these container capabilities follows below.)
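For readers following along in code, here is a minimal sketch of how those two container capabilities would be expressed with the YARN records API, assuming the Hadoop 2.x client libraries are on the classpath; the class name is invented and the numbers simply mirror the slide.

```java
import org.apache.hadoop.yarn.api.records.Resource;

public class ContainerSpecs {
  public static void main(String[] args) {
    // container_0 = 2 GB, 1 CPU and container_1 = 1 GB, 6 CPUs, as on the slide above.
    // Memory is given in MB, CPU as virtual cores.
    Resource container0 = Resource.newInstance(2048, 1);
    Resource container1 = Resource.newInstance(1024, 6);
    System.out.println("container_0 = " + container0 + ", container_1 = " + container1);
  }
}
```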
  6. Architecture
  7. Hadoop MapReduce Classic
     • JobTracker
     – Manages cluster resources and job scheduling
     • TaskTracker
     – Per-node agent
     – Manages tasks
  8. Current Limitations
     • Scalability
     – Maximum cluster size: 4,000 nodes
     – Maximum concurrent tasks: 40,000
     – Coarse synchronization in the JobTracker
     • Single point of failure
     – A failure kills all queued and running jobs
     – Jobs need to be re-submitted by users
     • Restart is very tricky due to complex state
  9. Current Limitations contd.
     • Hard partition of resources into map and reduce slots
     – Low resource utilization
     • Lacks support for alternate paradigms
     – Iterative applications implemented using MapReduce are 10x slower
     – Hacks for the likes of MPI / graph processing
     • Lack of wire-compatible protocols
     – Client and cluster must be of the same version
     – Applications and workflows cannot migrate to different clusters
  10. Requirements
     • Reliability
     • Availability
     • Utilization
     • Wire compatibility
     • Agility & evolution
     – Ability for customers to control upgrades to the grid software stack
     • Scalability: clusters of 6,000-10,000 machines
     – Each machine with 16 cores, 48 GB/96 GB RAM, 24 TB/36 TB of disk
     – 100,000+ concurrent tasks
     – 10,000 concurrent jobs
  11. Architecture: Philosophy
     • General-purpose, distributed application framework
     – Cannot scale monolithic masters. Or monsters?
     – Distribute responsibilities
     • ResourceManager: central scheduler
     – Only resource arbitration
     – No failure handling
     – Provides necessary information to AMs
     • Push every possible responsibility to the ApplicationMaster(s)
     – Don't trust ApplicationMaster(s)
     – User-land library!
  12. Architecture
     • ResourceManager
     – Global resource scheduler
     – Hierarchical queues
     • NodeManager
     – Per-machine agent
     – Manages the life-cycle of containers
     – Container resource monitoring
     • ApplicationMaster
     – Per-application
     – Manages application scheduling and task execution
     – E.g. the MapReduce ApplicationMaster
  13. YARN Architecture
  14. Apache Hadoop MapReduce on YARN
     (Diagram: a ResourceManager with its Scheduler and a set of NodeManagers running two MapReduce ApplicationMasters. MR AM1 runs map1.1, map1.2 and reduce1.1; MR AM2 runs map2.1, map2.2, reduce2.1 and reduce2.2. Each task runs in a container on some NodeManager.)
  15. Global Scheduler (ResourceManager)
     • Resource arbitration
     • Multiple resource dimensions
     – <priority, data-locality, memory, cpu, …>
     • In-built support for data locality
     – Node, rack, etc.
     – Unique to YARN
  16. Scheduler Concepts
     • Input from AM(s) is a dynamic list of ResourceRequests
     – <resource-name, resource-capability>
     – Resource name: hostname / rackname / any
     – Resource capability: (memory, cpu, …)
     – Essentially an inverted <name, capability> request map from AM to RM
     – No notion of tasks!
     • Output: a Container
     – A resource grant on a specific machine
     – Verifiable allocation, via Container Tokens
     (A sketch of such a request through the client library follows below.)
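As a rough illustration (not part of the original deck), this is how an AM can express one <resource-name, resource-capability> pair through the AMRMClient library from Hadoop 2.x; the host and rack names are made-up placeholders.

```java
import org.apache.hadoop.yarn.api.records.Priority;
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.client.api.AMRMClient;

public class ResourceRequestSketch {
  public static void main(String[] args) {
    // resource-capability: 1 GB of memory, 1 virtual core.
    Resource capability = Resource.newInstance(1024, 1);

    // resource-name: preferred hosts and racks (hypothetical names); the scheduler
    // may fall back to rack-level or "any" placement if these locations are busy.
    String[] nodes = {"node1.example.com"};
    String[] racks = {"/default-rack"};

    AMRMClient.ContainerRequest request =
        new AMRMClient.ContainerRequest(capability, nodes, racks, Priority.newInstance(1));

    // An AM hands this to the RM via AMRMClient#addContainerRequest and later picks up
    // the granted Containers (with their Container Tokens) from allocate() responses.
    System.out.println(request);
  }
}
```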
  17. Fault tolerance
     • Task/container failures
     – Application Masters should take care of these; it's their business
     • Node failures
     – The ResourceManager marks the node as failed and informs all the apps / Application Masters. AMs can choose to ignore the failure or rerun the work, depending on what they want.
     • Application Master failures
     – The ResourceManager restarts AMs that have failed
     – One application can have multiple ApplicationAttempts
     – Every ApplicationAttempt should store state, so that the next ApplicationAttempt can recover from failure
     • ResourceManager failures
     – The ResourceManager saves state and can do host/IP failover today
     – It recovers state, but kills all current work as of now
     – Work-preserving restart and HA (future work)
  18. Writing your own apps
  19. Application Master
     • Dynamically allocated per application on startup
     • Responsible for individual application scheduling and life-cycle management
     • Requests and obtains containers for its tasks
     – Does a second-level schedule, i.e. assigns containers to component tasks
     – Starts/stops containers on NodeManagers
     • Handles all task/container errors
     • Obtains resource hints / meta-information from the RM for better scheduling
     – Peek-ahead into resource availability
     – Faulty resources (node, rack, etc.)
  20. Writing Custom Applications
     • Grand total of 3 protocols
     • ApplicationClientProtocol
     – Used by the application-launching program (the client)
     – submitApplication
     • ApplicationMasterProtocol
     – Protocol between the AM & the RM for resource allocation
     – registerApplicationMaster / allocate / finishApplicationMaster
     • ContainerManagementProtocol
     – Protocol between the AM & the NMs for container start/stop
     – startContainers / stopContainers
     (A client-side submission sketch follows below.)
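Purely as an illustration of the client-side (ApplicationClientProtocol) path, here is a minimal sketch using the YarnClient convenience library mentioned two slides later; the application name and the AM command are hypothetical placeholders, and a real client would also set up local resources and environment for the AM.

```java
import java.util.Collections;
import org.apache.hadoop.yarn.api.records.ApplicationSubmissionContext;
import org.apache.hadoop.yarn.api.records.ContainerLaunchContext;
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.client.api.YarnClient;
import org.apache.hadoop.yarn.client.api.YarnClientApplication;
import org.apache.hadoop.yarn.conf.YarnConfiguration;
import org.apache.hadoop.yarn.util.Records;

public class SubmitAppSketch {
  public static void main(String[] args) throws Exception {
    // YarnClient wraps ApplicationClientProtocol for the launching program.
    YarnClient yarnClient = YarnClient.createYarnClient();
    yarnClient.init(new YarnConfiguration());
    yarnClient.start();

    YarnClientApplication app = yarnClient.createApplication();
    ApplicationSubmissionContext ctx = app.getApplicationSubmissionContext();
    ctx.setApplicationName("my-yarn-app");  // hypothetical name

    // Describe the ApplicationMaster container: the command to run and its size.
    ContainerLaunchContext amContainer = Records.newRecord(ContainerLaunchContext.class);
    amContainer.setCommands(Collections.singletonList(
        "$JAVA_HOME/bin/java com.example.MyApplicationMaster"));  // placeholder AM class
    ctx.setAMContainerSpec(amContainer);
    ctx.setResource(Resource.newInstance(1024, 1));
    ctx.setQueue("default");

    // The submitApplication call from the slide.
    yarnClient.submitApplication(ctx);
    yarnClient.stop();
  }
}
```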
  21. Other things to take care of
     • Container/tasks
     • Client
     • UI
     • Recovery
     • Container -> AM communication
     • Application history
  22. Libraries for app/framework writers
     • YarnClient, AMRMClient, NMClient
     • More projects:
     – Higher-level APIs: Weave, REEF
     (An AM-side sketch using AMRMClient and NMClient follows below.)
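To give a feel for the AM-side libraries, here is a stripped-down sketch of an ApplicationMaster loop built on the synchronous AMRMClient and NMClient from Hadoop 2.x; a real AM would add error handling, multiple requests, and completion tracking, and the launched command is just a placeholder.

```java
import java.util.Collections;
import org.apache.hadoop.yarn.api.protocolrecords.AllocateResponse;
import org.apache.hadoop.yarn.api.records.Container;
import org.apache.hadoop.yarn.api.records.ContainerLaunchContext;
import org.apache.hadoop.yarn.api.records.FinalApplicationStatus;
import org.apache.hadoop.yarn.api.records.Priority;
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.client.api.AMRMClient;
import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest;
import org.apache.hadoop.yarn.client.api.NMClient;
import org.apache.hadoop.yarn.conf.YarnConfiguration;
import org.apache.hadoop.yarn.util.Records;

public class SimpleApplicationMaster {
  public static void main(String[] args) throws Exception {
    YarnConfiguration conf = new YarnConfiguration();

    // AM <-> RM: register, ask for containers, pick up grants.
    AMRMClient<ContainerRequest> rmClient = AMRMClient.createAMRMClient();
    rmClient.init(conf);
    rmClient.start();
    rmClient.registerApplicationMaster("", 0, "");  // host/port/tracking URL left empty here

    // AM <-> NM: launch work inside granted containers.
    NMClient nmClient = NMClient.createNMClient();
    nmClient.init(conf);
    nmClient.start();

    // Ask for one 1 GB / 1 vCore container anywhere in the cluster.
    rmClient.addContainerRequest(
        new ContainerRequest(Resource.newInstance(1024, 1), null, null, Priority.newInstance(0)));

    boolean launched = false;
    while (!launched) {
      AllocateResponse response = rmClient.allocate(0.0f);  // heartbeat to the RM
      for (Container container : response.getAllocatedContainers()) {
        ContainerLaunchContext launchCtx = Records.newRecord(ContainerLaunchContext.class);
        launchCtx.setCommands(Collections.singletonList("sleep 10"));  // placeholder task
        nmClient.startContainer(container, launchCtx);
        launched = true;
      }
      Thread.sleep(1000);
    }

    rmClient.unregisterApplicationMaster(FinalApplicationStatus.SUCCEEDED, "done", "");
  }
}
```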
  23. Other goodies
     • Rolling upgrades
     • Multiple versions of MR at the same time
     • Same scheduling algorithms: capacity, fairness
     • Secure from the start
     • Locality for generic apps
     • Log aggregation
     • Everything on the same cluster
  24. Existing applications
  25. Compatibility with Apache Hadoop 1.x
     • org.apache.hadoop.mapred
     – Add 1 property to your existing mapred-site.xml: mapreduce.framework.name = yarn (snippet below)
     – Continue submitting using bin/hadoop
     – Nothing else; just run your MapReduce jobs!
     • org.apache.hadoop.mapreduce
     – Apps generally run without changes, or need only recompilation / minor updates
     – If your existing apps fail, recompile against the new MRv2 jars
     • Pig
     – Scripts built on Pig 0.10.1+ run without changes
     • Hive
     – Queries built on Hive 0.10.0+ run without changes
     • Streaming, Pipes, Oozie, Sqoop, …
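For reference, the single property called out above takes the usual Hadoop configuration form in mapred-site.xml; this snippet is a sketch based on that slide, not something shown in the deck itself.

```xml
<!-- mapred-site.xml: route MapReduce job submission to YARN (MRv2) -->
<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>
```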
  26. Any Performance Gains?
     • Significant gains across the board!
     • MapReduce
     – Lots of runtime improvements
     – Map side, reduce side
     – Better shuffle
     • Much better throughput
     • Y! can run a lot more jobs on fewer nodes in less time
     More details: http://hortonworks.com/delivering-on-hadoop-next-benchmarking-performance/
  27. Testing?
     • Testing, *lots* of it
     • Benchmarks: blog post coming soon
     • Integration testing / full stack
     – HBase
     – Pig
     – Hive
     – Oozie
     – …
     • Functional tests
  28. Deployment
     • Beta last month
     – "Beta" is a bit of a misnomer: tens of PB of storage already run on 0.23, a previous state of YARN before 2.0
     – A significantly wide variety of applications and load
     • GA
     – Very soon, less than a month away
     – Bug fixes: blockers only at this point
  29. How do I get it?
  30. YARN beta releases
     • Apache Hadoop Core 2.1.0-beta
     – Official beta release from Apache
     – YARN APIs are stable
     – Backwards compatible with MapReduce 1 jobs
     – Blocker bugs have been resolved
     • Features in HDP 2.0 Beta
     – Apache Ambari deploys YARN and MapReduce 2
     – Capacity Scheduler for YARN
     – Full stack tested
  31. Future
  32. Looking ahead
     • YARN improvements
     • Alternate programming models: Apache Tez, Storm
     • Long(er)-running services (e.g. HBase): Hoya
     • ResourceManager HA
     • Work-preserving restart of the ResourceManager
     • Reconnect running containers to AMs
     • Gang scheduling
     • Multi-dimensional resources: CPU is in; disk (capacity, IOPS) and network next?
  33. Ecosystem
     • Spark (UCB) on YARN
     • Real-time data processing
     – Storm (Twitter) on YARN
     • Graph processing
     – Apache Giraph on YARN
     • OpenMPI on YARN?
     • PaaS on YARN?
     • Yarnify: * on YARN
  34. Questions & Answers
     TRY: download at hortonworks.com
     LEARN: Hortonworks University
     FOLLOW: twitter @hortonworks, Facebook facebook.com/hortonworks
     MORE EVENTS: hortonworks.com/events
     Further questions & comments: events@hortonworks.com
