Page1 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
Apache Hadoop YARN - 2015
June 9, 2015
Past, Present & Future
Page2 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
We are
Vinod Kumar Vavilapalli
• Long time Hadooper since 2007
• Apache Hadoop Committer / PMC
• Apache Member
• Yahoo! -> Hortonworks
• MapReduce -> YARN from day one
Jian He
• Hadoop contributor since 2013
• Apache Hadoop Committer / PMC
• Hortonworks
• All things YARN
Page3 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
Overview
The Why and the What
Page4 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
Data architectures
• Traditional architectures
– Specialized systems - silos
– Per silo security, management, governance etc.
– Limited Scalability
– Limited cost efficiencies
• For the present and the future
– Hadoop repository
– Commodity storage
– Centralized but distributed system
– Scalable
– Uniform org policy enforcement
– Innovation across silos!
[Slide graphic: data in HDFS shared across cluster resources]
Page5 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
Resource Management
• Extracting value out of centralized data architecture
• A messy problem
– Multiple apps, frameworks, their life-cycles and evolution
• Tenancy
– “I am running this system for one user”
– It almost never stops there
– Groups, Teams, Users
• Sharing / isolation needed
• Ad hoc structures become unusable very fast
Page6 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
Varied goals & expectations
• On isolation, capacity allocations, scheduling
[Slide graphic: competing demands on the cluster operator - “Faster!”, “More!”, “Best for my cluster”, “Everything! Right now!”, “SLA!” - alongside goals such as throughput, utilization, elasticity, service uptime, security and ROI]
Page7 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
Enter Hadoop YARN
HDFS (Scalable, Reliable Storage)
YARN (Cluster Resource Management)
Applications (Running Natively in Hadoop)
• Store all your data in one place … (HDFS)
• Interact with that data in multiple ways … (YARN Platform + Apps): Data centric
• Scale as you go, shared, multi-tenant, secure … (The Hadoop Stack)
[Slide graphic: queues, admins/users and pipelines sharing cluster resources]
Page8 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
Hadoop YARN
• Distributed System
• Host of frameworks, meta-frameworks, applications
• Varied workloads
– Batch
– Interactive
– Stream processing
– NoSQL databases
– ….
• Large scale
– Linear scalability
– Tens of thousands of nodes
– More coming
Page9 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
Past
A quick history
Page10 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
A brief Timeline
• Sub-project of Apache Hadoop
• Releases tied to Hadoop releases
• Alphas and betas
– Already in production for MapReduce at several large sites by that time
1st line of code: June-July 2010 | Open sourced: August 2011 | First 2.0 alpha: May 2012 | First 2.0 beta: August 2013
Page11 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
GA Releases
2.2 (15 October 2013): 1st GA • MR binary compatibility • YARN API cleanup • Testing!
2.3 (24 February 2014): 1st post-GA • Bug fixes • Alpha features
2.4 (07 April 2014): RM fail-over • CapacityScheduler (CS) preemption • Timeline Service V1 • Writable REST APIs
2.5 (11 August 2014): Timeline Service V1 security
Page12 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
Present
Page13 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
Last few Hadoop releases
• Hadoop 2.6
– 18 November 2014
– Rolling Upgrades
– Services
– Node labels
• Hadoop 2.7
– 21 April, 2015
– 2.6 feature enhancements and bug fixes.
– Moving to JDK 7+
• Focus on some features next!
Page14 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
Rolling Upgrades
Page15 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
YARN Rolling Upgrades
• Why? No more losing work during upgrades!
• Workflow
• Servers first: Masters followed by per-node agents
• Upgrade of Applications/Frameworks is decoupled!
• Work preserving RM restart: the RM recovers state from NMs and apps
• Work preserving NM restart: the NM recovers state from local disk (configuration sketch below)
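The work-preserving behaviour hinges on a handful of yarn-site.xml settings. Below is a minimal sketch (Hadoop 2.6+) of the relevant property names, expressed through YarnConfiguration so they are easy to check; the recovery directory path and the ZooKeeper-backed state store are illustrative choices, and in a real cluster these entries live in yarn-site.xml on the masters and every node before the upgrade begins.

    import org.apache.hadoop.yarn.conf.YarnConfiguration;

    public class WorkPreservingRestartSettings {
      public static void main(String[] args) {
        YarnConfiguration conf = new YarnConfiguration();
        // ResourceManager: persist application/attempt state and recover it on restart
        conf.setBoolean("yarn.resourcemanager.recovery.enabled", true);
        conf.setBoolean("yarn.resourcemanager.work-preserving-recovery.enabled", true);
        conf.set("yarn.resourcemanager.store.class",
            "org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore");
        // NodeManager: keep container state on local disk and reconnect after restart
        conf.setBoolean("yarn.nodemanager.recovery.enabled", true);
        conf.set("yarn.nodemanager.recovery.dir", "/var/hadoop/yarn/nm-recovery"); // example path
        System.out.println("RM recovery enabled: "
            + conf.getBoolean("yarn.resourcemanager.recovery.enabled", false));
      }
    }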
Page16 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
YARN Rolling Upgrades: A Cluster Snapshot
Page17 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
Stack Rolling Upgrades
Enterprise grade rolling upgrade of a live Hadoop cluster
Jun 10, 3:25PM - 4:05PM
Sanjay Radia & Vinod K V from Hortonworks
Page18 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
Services on YARN
Page19 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
Long running services
• You could run them already before 2.6!
• Enhancements needed
– Logs
– Security
– Fault tolerance – Application Masters
– Service Discovery - YARN registry service
Page20 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
Project Slider
• Bring your existing services unmodified to YARN: slider.incubator.apache.org/
• HBase, Storm, Kafka already!
[Slide graphic: the YARN ecosystem - MapReduce, Tez, Spark, Storm, Kafka, HBase, Pig, Hive, Cascading and more services, with Apache Slider hosting existing services on YARN]
DeathStar: Easy, Dynamic, Multi-tenant HBase via YARN
June 11, 1:30PM - 2:10PM
Ishan Chhabra & Nitin Aggarwal from Rocket Fuel
Authoring and hosting applications on YARN using Slider
Jun 11, 11:00AM - 11:40AM
Sumit Mohanty & Jonathan Maron from Hortonworks
Page21 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
Operational and Developer tooling
Page22 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
Node Labels
• Today: Partitions
– Admin: “I have machines of different types”
– User: “I want to run my jobs only on one type”
– Impact on Cluster sharing model
• Types
– Exclusive: “This is my Precious!”
– Non-exclusive: “I get binding preference. Use it for others when idle”
[Slide graphic: cluster nodes split into the default partition, Partition B with GPUs and Partition C with Windows nodes, running a mix of JDK 7 and JDK 8; a hedged submission sketch follows the session pointer below]
Node Labels in YARN
Jun 11, 11:00AM - 11:40AM
Mayank Bansal (ebay) & Wangda Tan (Hortonworks)
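As a rough illustration of the user side of partitions, the sketch below uses hypothetical queue, label and application names, and assumes an admin has already added a "gpu" label to the cluster and granted the submitting queue access to it. It asks YARN to place the application's containers only on labelled nodes.

    import org.apache.hadoop.yarn.api.records.ApplicationSubmissionContext;
    import org.apache.hadoop.yarn.client.api.YarnClient;
    import org.apache.hadoop.yarn.client.api.YarnClientApplication;
    import org.apache.hadoop.yarn.conf.YarnConfiguration;

    public class NodeLabelSubmissionSketch {
      public static void main(String[] args) throws Exception {
        YarnClient yarnClient = YarnClient.createYarnClient();
        yarnClient.init(new YarnConfiguration());
        yarnClient.start();

        YarnClientApplication app = yarnClient.createApplication();
        ApplicationSubmissionContext ctx = app.getApplicationSubmissionContext();
        ctx.setApplicationName("gpu-training-job");   // hypothetical application name
        ctx.setQueue("gpu-adhoc");                    // hypothetical queue with access to the label
        ctx.setNodeLabelExpression("gpu");            // containers go only to nodes labelled "gpu"
        // ... set the AM container launch context and resource, then:
        // yarnClient.submitApplication(ctx);
        yarnClient.stop();
      }
    }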
Page23 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
Pluggable ACLs
• Pluggable YARN authorization model
• YARN Apache Ranger integration
[Slide diagram: (1) the admin manages queue ACLs in Apache Ranger; (2) users submit apps to YARN, which enforces those ACLs]
Securing Hadoop with Apache Ranger: Strategies & Best Practices
Jun 11, 3:10PM - 3:50PM
Selvamohan Neethiraj & Velmurugan Periasamy from Hortonworks
Page24 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
Usability
• “Why is my application stuck?”
• “How many rack-local containers did I get?”
• Lots more..
– “Why is my application stuck? What limits did it hit?”
– “What is the number of running containers of my app?”
– “How healthy is the scheduler?” (see the REST sketch below)
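Several of these questions can already be asked of the ResourceManager's web services. The sketch below uses a placeholder RM address and application id; it simply fetches the scheduler report and a per-application report from the public /ws/v1/cluster endpoints and prints the raw JSON rather than parsing it.

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.net.URL;

    public class RmWebServicesProbe {
      static String fetch(String url) throws Exception {
        StringBuilder body = new StringBuilder();
        try (BufferedReader in = new BufferedReader(
            new InputStreamReader(new URL(url).openStream()))) {
          String line;
          while ((line = in.readLine()) != null) {
            body.append(line).append('\n');
          }
        }
        return body.toString();
      }

      public static void main(String[] args) throws Exception {
        String rm = "http://resourcemanager.example.com:8088";   // placeholder RM address
        // "How healthy is the scheduler?" - queue capacities, used resources, etc.
        System.out.println(fetch(rm + "/ws/v1/cluster/scheduler"));
        // "What is the number of running containers of my app?" - per-application report
        System.out.println(fetch(rm + "/ws/v1/cluster/apps/application_1433865000000_0001"));
      }
    }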
Page25 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
Future
Page26 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
Per-queue Policy-driven scheduling
Previously:
• Coarse policies
• One scheduling algorithm in the cluster
• Rigid
• Difficult to experiment
[Slide graphic: a queue hierarchy under root - Ingestion, Adhoc, Batch - all tied to the single cluster-wide FIFO policy]
Now:
• Fine grained policies
• One scheduling algorithm per queue
• Flexible
• Very easy to experiment!
[Slide graphic: the same hierarchy with per-queue policies - Ingestion and Batch on FIFO, Adhoc on user-fairness; a hedged configuration sketch follows the session pointer below]
Enabling diverse workload scheduling in YARN
June 10 1:45PM – 2:25PM
Wangda Tan & Craig Welch (Hortonworks)
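As a hedged sketch of what this direction looks like in configuration terms: the queue names here are hypothetical, and the per-queue ordering-policy property reflects how this capability surfaced in later CapacityScheduler releases rather than anything already shipped in the 2.7 line described above. In practice these entries live in capacity-scheduler.xml.

    import org.apache.hadoop.conf.Configuration;

    public class PerQueuePolicySketch {
      public static void main(String[] args) {
        // Entries that would normally live in capacity-scheduler.xml
        Configuration csConf = new Configuration(false);
        csConf.set("yarn.scheduler.capacity.root.queues", "ingestion,adhoc,batch");
        // Ingestion and batch keep strict FIFO ordering within their queues
        csConf.set("yarn.scheduler.capacity.root.ingestion.ordering-policy", "fifo");
        csConf.set("yarn.scheduler.capacity.root.batch.ordering-policy", "fifo");
        // The ad hoc queue shares its capacity fairly across its users' applications
        csConf.set("yarn.scheduler.capacity.root.adhoc.ordering-policy", "fair");
        System.out.println("adhoc policy = "
            + csConf.get("yarn.scheduler.capacity.root.adhoc.ordering-policy"));
      }
    }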
Page27 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
Reservations
• “Run my workload tomorrow at 6AM”
[Slide graphic: two resources-over-time plots - Block #1 reserved at 6:00AM, then Block #2 fitted in alongside it across the 10:00AM-5:00PM window]
Reservation-based Scheduling: If You’re Late Don’t Blame Us!
June 10 12:05PM – 12:45PM
Carlo Curino & Subru Venkatraman Krishnan (Microsoft)
Page28 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
Containerized Applications
• Running Docker containers on YARN
– As a packaging mechanism
– As a resource-isolation mechanism
• Multiple use-cases
– “Run my existing service on YARN via Slider + Docker”
– “Run my existing MapReduce application on YARN via a Docker image”
Apache Hadoop YARN and the Docker Ecosystem
June 9 1:45PM – 2:25PM
Sidharta Seethana (Hortonworks) & Abin Shahab (Altiscale)
Page29 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
Disk Isolation
• Isolation and scheduling dimensions
– Disk Capacity
– IOPs
– Bandwidth
[Slide graphic: the disks on a node are shared by the DataNode (reads, writes, remote IO), the NodeManager (localization, logs, shuffle), an HBase RegionServer (reads, writes) and Map/Reduce Tasks reading and writing spills and shuffled data]
• Today: Equal allocation to all containers along all dimensions
• Next: Scheduling
Page30 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
Network Isolation
• Isolation and scheduling dimensions
– Incoming bandwidth
– Outgoing bandwidth
[Slide graphic: the node’s network is shared by the DataNode (write pipeline, remote IO), the NodeManager (localization, logs, shuffle), Storm spouts and bolts (reads, writes), Map Tasks (read input) and Reduce Tasks (read shuffled data, write outputs)]
• Today: Equi-share outbound bandwidth
• Next: Scheduling
Page31 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
Timeline Service
• Application History
– “Where did my containers run?”
– MapReduce specific Job History Server
– Need a generic solution beyond ResourceManager restart
• Cluster History
– Run analytics on historical apps!
– “User with most resource utilization”
– “Largest application run”
• Running Application’s Timeline
– Framework specific event collection and UIs
– “Show me the Counters for my running MapReduce task”
– “Show me the slowest Storm stream processing bolt while it is running”
• What exists today
– A LevelDB based implementation
– Integrated into MapReduce, Apache Tez, Apache Hive (a hedged client sketch follows below)
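For frameworks that want to publish their own events, the v1 client API looks roughly like the sketch below (Hadoop 2.6+). The entity and event type names are hypothetical placeholders for whatever schema a framework defines; MapReduce, Tez and Hive each publish their own.

    import org.apache.hadoop.yarn.api.records.timeline.TimelineEntity;
    import org.apache.hadoop.yarn.api.records.timeline.TimelineEvent;
    import org.apache.hadoop.yarn.client.api.TimelineClient;
    import org.apache.hadoop.yarn.conf.YarnConfiguration;

    public class TimelinePublishSketch {
      public static void main(String[] args) throws Exception {
        TimelineClient client = TimelineClient.createTimelineClient();
        client.init(new YarnConfiguration());
        client.start();

        TimelineEntity entity = new TimelineEntity();
        entity.setEntityType("MY_FRAMEWORK_TASK");   // hypothetical entity type
        entity.setEntityId("task_001");              // hypothetical entity id
        entity.setStartTime(System.currentTimeMillis());

        TimelineEvent event = new TimelineEvent();
        event.setEventType("TASK_STARTED");          // hypothetical event type
        event.setTimestamp(System.currentTimeMillis());
        entity.addEvent(event);

        client.putEntities(entity);                  // stored in today's LevelDB-backed store
        client.stop();
      }
    }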
Page32 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
Timeline Service 2.0
• Next generation
– Today’s solution helped us understand the space
– Limited scalability and availability
• “Analyzing Hadoop Clusters is becoming a big-data problem”
– Don’t want to throw away the Hadoop application metadata
– Large scale
– Enable near real-time analysis: “Find me the user who is hammering the FileSystem with rogue applications. Now.”
• Timeline data stored in HBase and accessible to queries
Page33 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
Improved Usability
• With Timeline Service
– “Why is my application slow?”
– “Is it really slow?”
– “Why is my application failing?”
– “What happened with my application? Succeeded?”
– “Why is my cluster slow?”
– “Why is my cluster down?”
– “What happened in my clusters?”
• Collect and use past data
– To schedule “my application” better
– To do better capacity planning
Page34 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
More..
• Application priorities within a queue
• YARN Federation – 100K+ nodes
• Unified statistics collection per node
• Node anti-affinity
– “Do not run two copies of my service daemon on the same machine”
• Gang scheduling
– “Run all of my app at once”
• Log aggregation scalability
• Dynamic scheduling based on actual containers’ utilization
• Unified placement policies
• Prioritized queues
– Admin’s queue takes precedence over everything else
• Lot more ..
– HDFS on YARN
– Global scheduling
– User level preemption
– Container resizing
Page35 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
Thank you!
Page36 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
Addendum
Page37 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
Work preserving ResourceManager restart
• ResourceManager remembers some state
• Reconstructs the remaining from nodes and apps
Page38 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
Work preserving NodeManager restart
• NodeManager remembers state on each machine
• Reconnects to running containers
Page39 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
ResourceManager Fail-over
• Active/Standby based fail-over
• Depends on fast recovery (configuration sketch below)
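A minimal sketch of the settings behind active/standby fail-over. Host names, the cluster id and the ZooKeeper quorum are placeholders; in practice these live in yarn-site.xml on every node.

    import org.apache.hadoop.yarn.conf.YarnConfiguration;

    public class RmHaSettings {
      public static void main(String[] args) {
        YarnConfiguration conf = new YarnConfiguration();
        conf.setBoolean("yarn.resourcemanager.ha.enabled", true);
        conf.set("yarn.resourcemanager.cluster-id", "yarn-cluster");        // placeholder id
        conf.set("yarn.resourcemanager.ha.rm-ids", "rm1,rm2");
        conf.set("yarn.resourcemanager.hostname.rm1", "rm1.example.com");   // placeholder hosts
        conf.set("yarn.resourcemanager.hostname.rm2", "rm2.example.com");
        // ZooKeeper drives leader election and holds the state used for fast recovery
        conf.set("yarn.resourcemanager.zk-address",
            "zk1.example.com:2181,zk2.example.com:2181");
        System.out.println("Active/standby fail-over enabled for ids: "
            + conf.get("yarn.resourcemanager.ha.rm-ids"));
      }
    }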


Editor's Notes

  • #8 Queues reflect org structures. Hierarchical in nature.