Operationalizing YARN Based Hadoop Clusters in the Cloud
YARN and Hadoop Team, Qubole
Hadoop at Qubole
● Over 300 petabytes of data processed per month.
● More than 100 customers with more than 1,000 active users.
● Over 1 million Hadoop jobs completed per month.
● More than 8,000 Hadoop clusters brought up per month.
Ephemeral Hadoop Clusters
Lifecycle: Bring up Cluster → Perform Jobs → Terminate Cluster
Challenges: Ephemeral Hadoop Clusters
• Use cloud storage for job input and output.
• Auto-scale the cluster according to the workload.
• Store job history and logs in a persistent location.
• Adapt YARN/HDFS to take ephemeral cloud nodes into account.
Further Optimizations in Down-scaling
• Containers hold the output of Map tasks.
• A node cannot be terminated until its Map output is consumed.
• Optimization: upload Map output to cloud storage (sketched below).
• Reducers then access Map output directly from the cloud.
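A minimal sketch of the upload step, using Hadoop's generic FileSystem API. The bucket name, local path layout, and the class and method names are invented for illustration; this is not Qubole's actual implementation.

  import java.net.URI;
  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.FileSystem;
  import org.apache.hadoop.fs.Path;

  // Illustrative: copy a finished map attempt's shuffle output to cloud
  // storage so its node can be removed before reducers fetch the data.
  public class MapOutputUploader {
    public static void uploadMapOutput(String attemptId) throws Exception {
      Configuration conf = new Configuration();
      // Hypothetical local shuffle file and cloud destination.
      Path local = new Path("file:///var/hadoop/mapred-local/" + attemptId + "/file.out");
      Path remote = new Path("s3n://shuffle-bucket/" + attemptId + "/file.out");
      FileSystem destFs = FileSystem.get(new URI("s3n://shuffle-bucket/"), conf);
      // copyFromLocalFile(delSrc, overwrite, src, dst)
      destFs.copyFromLocalFile(false, true, local, remote);
      // Reducers can now open `remote` directly instead of pulling the
      // segment from the mapper node's ShuffleHandler.
    }
  }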
HDFS Based Up-Scaling
• DFS usage and the incoming data rate are monitored periodically.
• Upscale if free DFS falls below an absolute threshold.
• Upscale if free DFS is projected to fall below that threshold in the near future (see the sketch below).
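A minimal sketch of the up-scaling check described above; the 50 GB threshold, the method name, and the projection horizon are assumptions for illustration.

  // Illustrative projection: upscale when free DFS is already below, or is
  // trending below, an absolute threshold.
  public class HdfsUpscaleMonitor {
    static final long FREE_DFS_THRESHOLD_BYTES = 50L * 1024 * 1024 * 1024; // assumed 50 GB

    static boolean shouldUpscale(long freeDfsBytes,
                                 double incomingBytesPerSec,
                                 long horizonSec) {
      if (freeDfsBytes < FREE_DFS_THRESHOLD_BYTES) {
        return true; // already below the absolute threshold
      }
      // Project free space forward at the observed ingest rate.
      double projectedFree = freeDfsBytes - incomingBytesPerSec * horizonSec;
      return projectedFree < FREE_DFS_THRESHOLD_BYTES;
    }
  }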
Volatile Nodes at Qubole
• AWS and Google Cloud provide volatile nodes, termed "Spot Nodes" or "Preemptible Nodes".
• Available at a very low price compared to stable nodes.
• Can be lost at any time without prior notification.
• Hadoop's failure resilience makes these nodes good candidates for Hadoop clusters.
• Approx. 77% of all Qubole clusters make use of volatile nodes.
• While starting a cluster, the percentage of volatile nodes can be specified.
• A maximum ‘bid’ price for volatile nodes is also specified.
• Qubole Placement Policy (sketched below):
– Ensures at least one replica of each HDFS block is present on a stable node.
– No Application Master is scheduled on volatile nodes.
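A self-contained sketch of the placement constraint. It does not use HDFS's real BlockPlacementPolicy interface; the node modelling and all names are illustrative. The idea: choose replica targets so that at least one replica of every block lives on a stable node, so losing every volatile node at once cannot lose data.

  import java.util.ArrayList;
  import java.util.List;

  public class StableFirstPlacement {
    static class Node {
      final String host;
      final boolean isVolatile;
      Node(String host, boolean isVolatile) { this.host = host; this.isVolatile = isVolatile; }
    }

    // Assumes the cluster always contains at least one stable node.
    static List<Node> chooseTargets(List<Node> candidates, int replication) {
      List<Node> targets = new ArrayList<Node>();
      // First replica always goes to a stable node.
      for (Node n : candidates) {
        if (!n.isVolatile) { targets.add(n); break; }
      }
      // Remaining replicas may land on any node not already chosen.
      for (Node n : candidates) {
        if (targets.size() >= replication) break;
        if (!targets.contains(n)) targets.add(n);
      }
      return targets;
    }
  }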
Rebalancing – Volatile Nodes
• While up-scaling, the RM tries to maintain the configured volatile node percentage.
• If volatile nodes are not available, it falls back to stable nodes.
• Periodically tries to re-balance the volatile node percentage (see the sketch below).
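A sketch of the rebalancing arithmetic, assuming a configured target fraction of volatile nodes; the method name and swap semantics are illustrative.

  public class VolatileRebalancer {
    // Positive result: this many stable nodes could be replaced by volatile
    // ones; zero or negative: the target is already met or exceeded.
    static int volatileNodesShort(int stableCount, int volatileCount,
                                  double targetVolatileFraction) {
      int total = stableCount + volatileCount;
      int desiredVolatile = (int) Math.floor(total * targetVolatileFraction);
      return desiredVolatile - volatileCount;
    }

    public static void main(String[] args) {
      // e.g. 8 stable + 2 volatile with a 50% target: 3 nodes short.
      System.out.println(volatileNodesShort(8, 2, 0.5));
    }
  }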
Job History
• Show job history for terminated clusters.
• Multi-tenant job history server.
• Clusters generally run in isolated networks – a proxy is needed.
• Job history files need to be stored in cloud storage.
Job History – Running Cluster
(Diagram: the proxy "proxifies" links in the HTML returned by the cluster's job history server; a sketch follows.)
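A hedged sketch of what "proxifying" could look like: rewriting absolute URLs in the returned HTML so they route through a proxy endpoint reachable from outside the isolated network. The regex approach and the ?target= parameter are assumptions, not Qubole's actual proxy.

  import java.net.URLEncoder;
  import java.util.regex.Matcher;
  import java.util.regex.Pattern;

  public class LinkProxifier {
    // Matches absolute http(s) URLs inside href/src attributes.
    private static final Pattern LINK =
        Pattern.compile("(href|src)=\"(https?://[^\"]+)\"");

    static String proxify(String html, String proxyBase) throws Exception {
      Matcher m = LINK.matcher(html);
      StringBuffer out = new StringBuffer();
      while (m.find()) {
        String viaProxy = proxyBase + "?target=" + URLEncoder.encode(m.group(2), "UTF-8");
        m.appendReplacement(out,
            Matcher.quoteReplacement(m.group(1) + "=\"" + viaProxy + "\""));
      }
      m.appendTail(out);
      return out.toString();
    }
  }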
Job History – Terminated Cluster
(Diagram: the job history server fetches the history file from cloud storage; see the sketch below.)
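A minimal sketch of serving history for a terminated cluster: open the job history file directly from cloud storage through Hadoop's FileSystem API. The bucket layout and file naming are invented.

  import java.net.URI;
  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.FSDataInputStream;
  import org.apache.hadoop.fs.FileSystem;
  import org.apache.hadoop.fs.Path;

  public class HistoryFetcher {
    // Returns a stream the history server can parse and render, even though
    // the cluster that produced the file no longer exists.
    static FSDataInputStream openHistory(String clusterId, String jobId) throws Exception {
      Configuration conf = new Configuration();
      Path hist = new Path("s3n://jobhistory-bucket/" + clusterId + "/" + jobId + ".jhist");
      FileSystem fs = FileSystem.get(new URI("s3n://jobhistory-bucket/"), conf);
      return fs.open(hist);
    }
  }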
Cloud Read/Write Optimizations
• Write output directly to the cloud without staging it in a temporary location (sketched below).
• Optimizations in getting file status for a large number of files with a common prefix.
• Added streaming upload support in NativeS3FileSystem.
• Added bulk delete and move support in NativeS3FileSystem.
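A minimal sketch of the direct-write idea: create the output at its final cloud key instead of writing to a temporary path and renaming at commit. On S3 a rename is a server-side copy plus delete, so skipping the staging step avoids a full extra copy of the output. Paths are illustrative.

  import java.net.URI;
  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.FSDataOutputStream;
  import org.apache.hadoop.fs.FileSystem;
  import org.apache.hadoop.fs.Path;

  public class DirectCloudWriter {
    public static void main(String[] args) throws Exception {
      Configuration conf = new Configuration();
      // Final output location; no _temporary directory, no rename at commit.
      Path finalPath = new Path("s3n://output-bucket/job-42/part-00000");
      FileSystem fs = FileSystem.get(new URI("s3n://output-bucket/"), conf);
      FSDataOutputStream out = fs.create(finalPath, true);
      try {
        out.writeBytes("key\tvalue\n"); // task output goes straight to the final key
      } finally {
        out.close();
      }
    }
  }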
Open Source Issues
• Issues with a newer version of JetS3t (0.9.4):
– Seek performance degraded around 10x.
– Empty files.
• Deadlock when the number of threads reading from S3 exceeds JetS3t's maximum number of connections (HADOOP-12739); a configuration note follows.
• Too many queues cause a deadlock in the cluster (YARN-3633).
• Support for SOCKS proxy was missing from HA.
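One plausible mitigation for the JetS3t deadlock above, assuming the standard jets3t.properties mechanism: keep JetS3t's HTTP connection pool at least as large as the number of concurrent S3 reader threads. The value shown is illustrative, not a recommendation.

  # jets3t.properties (on the classpath)
  # Pool must cover the worst-case number of simultaneous S3 readers,
  # otherwise readers can block each other waiting for connections.
  httpclient.max-connections=256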