Spark in YARN-managed multi-tenant clusters
Pravin Mittal (pravinm@Microsoft.com)
Rajesh Iyer (riyer@Microsoft.com)
Spark on Azure HDInsight
Managed Service
100% open source Apache Spark and Hadoop bits
Latest releases of Spark
Fully supported by Microsoft and Hortonworks
99.9% Azure Cloud SLA; 24/7 Managed Service
Certifications: PCI, ISO 27018, SOC, HIPAA, EU-MC
Optimized for experimentation and development
Jupyter Notebooks (Scala, Python, automatic data visualizations)
IntelliJ plugin (job submission, remote debugging)
ODBC connector for Power BI, Tableau, Qlik, SAP, Excel, etc.
Make Spark Simple: Integrated with the Azure Ecosystem
Microsoft R Server: multi-threaded math libraries and transparent parallelization in R Server mean handling up to 1000x more data at faster speeds than open source R. It is based on open source R and does not require any change to R scripts.
Azure Data Lake Store: HDFS for the cloud, optimized for massive throughput, ultra-high capacity, low latency, and secure ACL support.
Azure Data Factory orchestrates Spark ETL pipelines.
Power BI connector for Spark for rich visualization. New in Power BI is a streaming connector allowing you to publish real-time events from Spark Streaming directly to Power BI.
Event Hubs connector as a data source for Spark Streaming.
Azure SQL Data Warehouse & HBase connectors for fast & scalable storage.
Jupyter-Spark Integration via Livy
Sparkmagic is an open source library that Microsoft is incubating under the Jupyter Incubator program.
Thousands of Spark clusters in production provide feedback to further improve the experience.
https://github.com/jupyter-incubator/sparkmagic
Spark Execution Model
Each Spark application is an instance of SparkContext and gets its own executor processes, which live for the lifetime of the application.
Spark is agnostic of the cluster manager, as long as it can acquire executor processes that can communicate with each other.
The driver program must listen for and accept incoming connections from its executors throughout its lifetime.
The driver is responsible for scheduling tasks on the cluster.
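A minimal Scala sketch of this model, assuming Spark 2.x-style client-mode submission to YARN; the app name, executor count, and the toy job are illustrative placeholders, not part of the original deck.

    import org.apache.spark.{SparkConf, SparkContext}

    object ExecutionModelSketch {
      def main(args: Array[String]): Unit = {
        // The driver program owns this SparkContext for the application's lifetime.
        val conf = new SparkConf()
          .setAppName("execution-model-sketch")
          .setMaster("yarn")                     // the cluster manager; Spark only needs it to grant executors
          .set("spark.executor.instances", "4")  // executors are private to this application (illustrative count)
        val sc = new SparkContext(conf)

        // The driver schedules tasks; executors run them and return results.
        val total = sc.parallelize(1 to 1000000, numSlices = 8).map(_.toLong).sum()
        println(s"sum = $total")

        sc.stop()                                // stopping the context releases the executors
      }
    }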
Why YARN as Cluster Manager?
Microsoft, Cloudera, Hortonworks, IBM and many others are all actively working to improve YARN.
YARN allows you to dynamically share and centrally configure the same pool of cluster resources between all frameworks that run on YARN.
YARN is the only cluster manager for Spark that supports security. With YARN, Spark can run against Kerberized Hadoop clusters and uses secure authentication between its processes.
YARN allows richer resource management policies: maximize cluster utilization, fair resource sharing, and dynamic preemption when running multiple concurrent applications, as well as resource guarantees for batch and interactive workloads.
https://blog.cloudera.com/blog/2014/05/apache-spark-resource-management-and-yarn-app-models/
YARN Allocation Model for Spark
https://blog.cloudera.com/blog/2015/09/untangling-apache-hadoop-yarn-part-1/
SparkSubmit starts and talks to the ResourceManager for the cluster.
The ResourceManager makes a single container request on behalf of the SparkSubmit.
The ApplicationMaster starts running within that container.
The ApplicationMaster requests subsequent containers for the Spark executors from the ResourceManager; these are allocated to run tasks for the application.
For Spark batch applications, all the Spark executor containers and the ApplicationMaster are freed when the application completes.
For Spark interactive applications (dynamic executor allocation enabled), Spark executors are freed after an idle timeout, but the ApplicationMaster remains until the Spark driver exits.
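A hedged sketch of the interactive case just described, assuming dynamic allocation is in use; the app name and the idle-timeout value are assumptions, not values from the deck.

    import org.apache.spark.SparkConf

    object InteractiveConfSketch {
      // Interactive case: executors are released after an idle timeout,
      // while the ApplicationMaster stays up until the driver exits.
      def interactiveConf: SparkConf = new SparkConf()
        .setAppName("interactive-session")
        .set("spark.dynamicAllocation.enabled", "true")
        .set("spark.dynamicAllocation.executorIdleTimeout", "60s") // assumed idle timeout
        .set("spark.shuffle.service.enabled", "true")              // external shuffle service is required by dynamic allocation on YARN
    }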
Running Spark on YARN in HDInsight
Requirements:
Maximize cluster utilization, i.e. reduce idle resources
Fair resource sharing between different Spark applications
Resource guarantees
Maximize cluster utilization
Reduce allocation of idle resources.
An application should be able to use the entire cluster if necessary.
Should be able to work with cluster scaling.
What should be the ideal setting for the number of executors for any Spark application?
Spark static allocation: set spark.executor.instances to a large value.
Spark dynamic allocation: set spark.dynamicAllocation.enabled = true and spark.dynamicAllocation.maxExecutors to a large value (see the sketch below).
YARN capacity scheduler queue: set yarn.scheduler.capacity.<parent queue>.<child queue>.maximum-capacity to 100.
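One possible Scala rendering of the dynamic-allocation option on this slide; the concrete numbers and the app name are stand-ins for the slide's "large value", not values from the deck.

    import org.apache.spark.SparkConf

    object ElasticConfSketch {
      // One application can grow toward the whole cluster when it is otherwise idle
      // and shrink back to zero executors when it has no work.
      def elasticConf: SparkConf = new SparkConf()
        .setAppName("elastic-app")
        .set("spark.dynamicAllocation.enabled", "true")
        .set("spark.dynamicAllocation.minExecutors", "0")
        .set("spark.dynamicAllocation.maxExecutors", "10000") // stand-in for "large value"
        .set("spark.shuffle.service.enabled", "true")
    }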
Fair resource sharing
Concurrent applications should be able to share resources.
Use separate YARN capacity scheduler queues for different Spark contexts:
Queues are statically created.
Allocated resources are not shared between different Spark contexts.
Need a way to reclaim allocated resources when another Spark context comes along.
YARN preemption AND Spark dynamic allocation:
Spark dynamic allocation gives up only idle resources.
YARN preemption to reclaim in-use resources (yarn.resourcemanager.scheduler.monitor.enable & yarn.resourcemanager.scheduler.monitor.policies).
YARN preemption is predictable with yarn.scheduler.capacity.resource-calculator = DefaultResourceCalculator (see YARN JIRA YARN-4390).
Fair resource sharing
Use separate Spark resource pools for the same Spark context (see the sketch below):
Resource pools are dynamically created per context.
Allocated resources are shared between different Spark jobs.
No need to reclaim allocated resources when another Spark job comes along.
Combination of the above supports concurrently running Notebook, Batch, and BI workloads in the same cluster.
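A minimal sketch of separate pools inside one SparkContext using Spark's FAIR scheduler; the pool names and the allocation-file path are assumptions.

    import org.apache.spark.{SparkConf, SparkContext}

    object SharedContextPools {
      def main(args: Array[String]): Unit = {
        val conf = new SparkConf()
          .setAppName("shared-context")
          .setMaster("yarn")
          .set("spark.scheduler.mode", "FAIR")
          .set("spark.scheduler.allocation.file", "/etc/spark/conf/fairscheduler.xml") // assumed path; defines per-pool weight/minShare
        val sc = new SparkContext(conf)

        // Jobs tagged from this thread land in the "notebooks" pool (created on first use)...
        sc.setLocalProperty("spark.scheduler.pool", "notebooks")
        sc.parallelize(1 to 1000).count()

        // ...while another workload in the same context can use a different pool.
        sc.setLocalProperty("spark.scheduler.pool", "batch")
        sc.parallelize(1 to 1000).count()

        sc.stop()
      }
    }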
Resource guarantee
Every Spark application should be able to run immediately.
A combination of:
Separate YARN capacity queues, with yarn.scheduler.capacity.<parent queue>.<child queue>.capacity used to guarantee resources for different Spark applications (see the sketch below).
Separate Spark resource pools within the same Spark application.
YARN preemption to ensure that in-use resources can be reclaimed.
Spark dynamic allocation to ensure that idle resources can be reclaimed.
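A hedged sketch of how an application might be routed to its own guaranteed capacity queue via spark.yarn.queue; the queue name is an assumption, and the queue itself is expected to be defined in the cluster's capacity-scheduler configuration.

    import org.apache.spark.SparkConf

    object GuaranteedQueueSketch {
      // Each class of workload submits to its own capacity-scheduler queue,
      // whose `capacity` setting provides the resource guarantee.
      def notebookConf: SparkConf = new SparkConf()
        .setAppName("notebook-session")
        .set("spark.yarn.queue", "notebooks") // assumed child queue under the configured parent queue
        .set("spark.dynamicAllocation.enabled", "true")
        .set("spark.shuffle.service.enabled", "true")
    }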
Working configuration
Spark settings:
spark.executor.instances = <very large value>
OR
spark.dynamicAllocation.enabled = true
spark.dynamicAllocation.initialExecutors = 0
spark.dynamicAllocation.minExecutors = 0
spark.dynamicAllocation.maxExecutors = <very large value>
YARN settings:
yarn.resourcemanager.scheduler.monitor.enable = true
yarn.resourcemanager.scheduler.monitor.policies = org.apache.hadoop.yarn.server.resourcemanager.monitor.capacity.ProportionalCapacityPreemptionPolicy
yarn.scheduler.capacity.resource-calculator = org.apache.hadoop.yarn.util.resource.DefaultResourceCalculator
yarn.scheduler.capacity.root.queues = default,<n queues>
yarn.scheduler.capacity.<parent_queue>.<child_queue>.capacity
yarn.scheduler.capacity.<parent_queue>.<child_queue>.maximum-capacity
DEMO
