Spark Tuning for Enterprise System Administrators

Anya T. Bida, PhD
Rachel B. Warren

Don't worry about missing something...
Video: https://www.youtube.com/watch?v=DNWaMR8uKDc&feature=youtu.be
Presentation: http://www.slideshare.net/anyabida
Cheat-sheet: http://techsuppdiva.github.io/
Anya: https://www.linkedin.com/in/anyabida
Rachel: https://www.linkedin.com/in/rachelbwarren

About Anya: Operations Engineer
About Rachel: Spark & Scala Enthusiast / Data Engineer
Alpine Data
alpinenow.com

About You: Spark practitioners
mySparkApp Success: Intermittent, Reliable, Optimal

Default != Recommended
Example: By default, spark.executor.memory = 1g
1g allows small jobs to finish out of the box.
Spark assumes you'll increase this parameter. 
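
One way to override a default is to pass explicit settings at submit time. The sketch below is a small, hypothetical helper (`to_submit_flags` is not part of Spark) that renders a dict of settings as `--conf key=value` arguments for spark-submit. The 4g value is only an illustration, not a figure from the talk.

```python
def to_submit_flags(conf):
    """Render Spark settings as spark-submit --conf arguments (hypothetical helper)."""
    flags = []
    for key, value in sorted(conf.items()):
        flags.extend(["--conf", f"{key}={value}"])
    return flags


# Illustrative override of the 1g default; choose a value that suits your cluster.
overrides = {"spark.executor.memory": "4g"}
print(" ".join(to_submit_flags(overrides)))
```

The same settings could instead live in spark-defaults.conf; building the flags in code simply keeps them next to the job that needs them.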

Which parameters are important?
How do I configure them?

The Spark Tuning Cheat-Sheet
• Filter data before an expensive reduce or aggregation; consider coalesce(
• Use data structures that require less memory (tuning.html#tuning-data-structures).
• Serialize: PySpark serializing is built-in. Scala/Java? Use persist(StorageLevel.[*]_SER). Recommended: KryoSerializer.
• See "Optimize partitions."
• See "GC investigation."
• See "Checkpointing."

Memory trouble
Initial config

How many in the audience have their own cluster?

Fair Schedulers
YARN
<allocations>
  <queue name="sample_queue">
    <minResources>4000 mb, 0 vcores</minResources>
    <maxResources>8000 mb, 8 vcores</maxResources>
    <maxRunningApps>10</maxRunningApps>
    <weight>2.0</weight>
    <schedulingPolicy>fair</schedulingPolicy>
  </queue>
</allocations>
SPARK
<allocations>
  <pool name="sample_queue">
    <schedulingMode>FAIR</schedulingMode>
    <weight>1</weight>
    <minShare>2</minShare>
  </pool>
</allocations>

Use these parameters!

Fair Schedulers
YARN
<allocations>
  <user name="sample_user">
  </user>
  <userMaxAppsDefault>5</userMaxAppsDefault>
</allocations>

What is the memory limit for mySparkApp?

Max Memory in "pool" x 3/4 = mySparkApp_mem_limit
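
As a worked example of the formula above, the sketch below applies the 3/4 factor to a pool's maximum memory. The function name and the use of megabytes are assumptions made for illustration; the 8000 MB figure is borrowed from the sample queue's maxResources.

```python
def app_mem_limit_mb(pool_max_mb, usable_fraction=0.75):
    """Memory budget for one app: pool maximum times the usable fraction (3/4)."""
    if pool_max_mb <= 0:
        raise ValueError("pool_max_mb must be positive")
    return int(pool_max_mb * usable_fraction)


# Sample queue from the Fair Scheduler slide: maxResources of 8000 mb.
print(app_mem_limit_mb(8000))  # 6000
```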

Limitation: <maxResources>___mb</maxResources>

Reserve 25% for overhead

mySparkApp_mem_limit = driver.memory + (executor.memory x dynamicAllocation.maxExecutors)
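
The equation above can be read as a budget check: the driver plus the largest number of executors the app may ever hold should fit inside the limit. A minimal sketch, assuming all values are in megabytes; the function names and sample numbers are hypothetical, and the arguments stand in for spark.driver.memory, spark.executor.memory and spark.dynamicAllocation.maxExecutors.

```python
def total_request_mb(driver_mb, executor_mb, max_executors):
    """driver.memory + (executor.memory x dynamicAllocation.maxExecutors)."""
    return driver_mb + executor_mb * max_executors


def fits_limit(limit_mb, driver_mb, executor_mb, max_executors):
    """True when the app's worst-case request stays within its memory limit."""
    return total_request_mb(driver_mb, executor_mb, max_executors) <= limit_mb


# Illustrative numbers only: a 6000 MB limit, 1000 MB driver, 1000 MB executors.
print(fits_limit(6000, driver_mb=1000, executor_mb=1000, max_executors=5))  # True
print(fits_limit(6000, driver_mb=1000, executor_mb=1000, max_executors=6))  # False
```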

Limitation: Driver must not be larger than a single node.

Parameter              Default         Recommended
spark.executor.cores   1 (YARN mode)   5 or less

executors per node = (cores per node) / (5 cores per executor)

executor.memory = (memory per node) / (executors per node)

maxExecutors = (executors per node) x (num nodes)
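
The three sizing formulas can be chained into one calculation. The sketch below applies them literally with integer division; the function and field names are hypothetical, memory is assumed to be in megabytes, and no headroom for the OS or for container overhead is modeled here.

```python
from typing import NamedTuple


class ExecutorSizing(NamedTuple):
    executors_per_node: int
    executor_memory_mb: int
    max_executors: int


def size_executors(cores_per_node, mem_per_node_mb, num_nodes, cores_per_executor=5):
    """Apply the slide formulas: executors per node, executor.memory, maxExecutors."""
    executors_per_node = cores_per_node // cores_per_executor
    if executors_per_node < 1:
        raise ValueError("node has fewer cores than one executor needs")
    executor_memory_mb = mem_per_node_mb // executors_per_node
    max_executors = executors_per_node * num_nodes
    return ExecutorSizing(executors_per_node, executor_memory_mb, max_executors)


# Illustrative cluster: 10 nodes, each with 16 cores and 64000 MB of memory.
print(size_executors(cores_per_node=16, mem_per_node_mb=64000, num_nodes=10))
```

For this illustrative cluster the result is 3 executors per node, 21333 MB per executor, and at most 30 executors. These figures still need to be checked against the pool limit and the single-node driver limitation.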

Verify my calculations respect this limitation.

mySparkApp memory issues
• Reduce the memory needed for mySparkApp. How?
• Gracefully handle memory limitations. How?
Recommended: KryoSerializer
Spark SQL's Optimizer
Here, let's talk about one scenario.

Symptoms:
• mySparkApp is running for several hours.
• Container is lost.
• Several Executors are lost.
• Behavior is intermittent (sometimes succeeds, sometimes fails).

Potential Solution: RDD.checkpoint()
Use in these cases:
• high-traffic cluster
• network blips
• preemption
• disk space nearly full
Function:
• saves the RDD to stable storage (e.g. HDFS or S3)
How-to:
• Cache first!
• SparkContext.setCheckpointDir(directory: String)
• RDD.checkpoint()
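
The how-to steps above can be sketched in PySpark as a small wrapper. `checkpoint_rdd` is a hypothetical helper, not a Spark API; it only calls `setCheckpointDir`, `cache` and `checkpoint` on the objects it is given, in the order the slide recommends. The directory in the usage comment is a placeholder.

```python
def checkpoint_rdd(sc, rdd, directory):
    """Cache first, then mark the RDD for checkpointing (hypothetical helper).

    sc is expected to be a SparkContext and rdd one of its RDDs.
    """
    sc.setCheckpointDir(directory)  # stable storage, e.g. an HDFS or S3 path
    rdd.cache()       # cache first, so the checkpoint job can reuse the cached data
    rdd.checkpoint()  # must be called before any action has run on this RDD
    return rdd


# Usage sketch (requires a running Spark installation; path is a placeholder):
#   from pyspark import SparkContext
#   sc = SparkContext(appName="mySparkApp")
#   rdd = checkpoint_rdd(sc, sc.parallelize(range(1000)), "hdfs:///tmp/checkpoints")
#   rdd.count()  # the checkpoint is written when the first action runs
```

Caching first matters because the checkpoint is written by a separate job after the first action; without the cache, the RDD would be computed a second time just to save it.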

Instead of 2.5 hours, myApp completes in 1 hour.

Cheat-sheet: techsuppdiva.github.io/

HighPerformanceSpark.com

Further Reading:
• Spark Tuning Cheat-sheet: techsuppdiva.github.io
• Apache Spark Documentation: https://spark.apache.org/docs/latest
• Checkpointing:
  http://spark.apache.org/docs/latest/streaming-programming-guide.html#checkpointing
  https://github.com/jaceklaskowski/mastering-apache-spark-book/blob/master/spark-rdd-checkpointing.adoc
• Learning Spark, by H. Karau, A. Konwinski, P. Wendell, M. Zaharia, 2015

More Questions?
Thanks!

"Checkpointing"
• Checkpoint reliably using RDD.checkpoint().
• Need better Driver failure recovery? Metadata Checkpoint.
• Using stateful transformations? RDD Checkpoint.
SPARK-9947 Separate Metadata and State Checkpoint
