Getting Apache Spark Customers to Production

© Cloudera, Inc. All rights reserved.
Getting Spark Customers to Production
Kostas Sakellis

Me
• Software Engineer at Cloudera
• Contributor to Apache Spark
• Before that, contributed to Cloudera Manager

Our customers
• Various degrees of sophistication with Spark
• In all stages of development
• From POC to production deployments
• 95% use Spark on YARN*
• Biweekly analysis of tickets

WARNING: This is biased!

Building a proof of concept!
Courtesy of: http://www.nefloridadesign.com/mbimages/6.jpg

“Why is my job failing?”

“Why is my job slow?”

Misconfiguration accounts for 20% of job failures
Courtesy of: http://blog.sdrock.com/pastors/files/2013/06/time-clock.jpg

Resource Declaration
• Not easy knowing what you need and how to specify it
• Compute:
• --num-executors vs. --executor-cores
• Memory
• --executor-memory
• Includes JVM overhead
• Need to do the math yourself
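As a rough illustration of "doing the math yourself", the sketch below totals what a job asks the cluster for. The `Request` case class and its field names are hypothetical helpers, not part of Spark; only the three spark-submit flags it renders are real. It counts heap only and ignores the per-container overhead.

```scala
// Hypothetical helper: models the three resource flags passed to spark-submit.
object ResourceRequest {
  final case class Request(numExecutors: Int, executorCores: Int, executorMemoryMb: Int) {
    // Total task slots the job can run at once.
    def totalCores: Int = numExecutors * executorCores

    // Total executor heap requested, in MB (excludes any per-container overhead).
    def totalHeapMb: Long = numExecutors.toLong * executorMemoryMb

    // The corresponding spark-submit arguments.
    def toSubmitArgs: Seq[String] = Seq(
      "--num-executors", numExecutors.toString,
      "--executor-cores", executorCores.toString,
      "--executor-memory", s"${executorMemoryMb}m"
    )
  }
}
```

For example, `ResourceRequest.Request(10, 4, 4096)` asks for 40 cores and 40960 MB of heap in total.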

Dynamic Allocation
• Let Spark do the work for you
• Available since Spark 1.2*
• No need to specify compute a priori
• Limitation: Still required to specify cores
• In future:
• Allow specification of “task size”
• Dynamically allocate cores
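A minimal sketch of switching dynamic allocation on, assuming a YARN cluster where the external shuffle service is already set up on the NodeManagers. The `DynamicAllocationConf` object is a hypothetical helper; the property names are standard Spark settings. Note that cores per executor are still given explicitly, matching the limitation above.

```scala
// Hypothetical helper that assembles the properties and renders them as --conf flags.
object DynamicAllocationConf {
  def settings(executorCores: Int): Seq[(String, String)] = Seq(
    "spark.dynamicAllocation.enabled" -> "true",
    // Dynamic allocation relies on the external shuffle service.
    "spark.shuffle.service.enabled" -> "true",
    // Executor count is left to Spark, but cores per executor are still fixed up front.
    "spark.executor.cores" -> executorCores.toString
  )

  // Turn key/value pairs into spark-submit arguments.
  def toSubmitArgs(conf: Seq[(String, String)]): Seq[String] =
    conf.flatMap { case (k, v) => Seq("--conf", s"$k=$v") }
}
```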

YARN Configuration mismatch
• Compute:
• yarn.nodemanager.resource.cpu-vcores
• yarn.scheduler.maximum-allocation-vcores
• Memory:
• yarn.nodemanager.resource.memory-mb
• yarn.scheduler.maximum-allocation-mb

YARN Configuration mismatch
• Common to ask for more resources than allowed
• Future work:
• Exposing relevant YARN configurations in Spark UI
• Requires changes to YARN itself
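Until those limits are visible in the UI, a pre-flight check along these lines can catch an oversized request before submission. This is only a sketch: `SchedulerLimits` and `violations` are hypothetical names, and the limit values would have to be read from the cluster's yarn-site.xml by hand.

```scala
// Hypothetical pre-flight check of one executor container against YARN scheduler limits.
object YarnLimits {
  // Values of yarn.scheduler.maximum-allocation-vcores and yarn.scheduler.maximum-allocation-mb.
  final case class SchedulerLimits(maxAllocationVcores: Int, maxAllocationMb: Int)

  // Returns one message per limit the request exceeds; empty means the request fits.
  def violations(executorCores: Int, containerMb: Int, limits: SchedulerLimits): Seq[String] = {
    val coreIssue =
      if (executorCores > limits.maxAllocationVcores)
        Some(s"executor cores $executorCores > max vcores ${limits.maxAllocationVcores}")
      else None
    val memIssue =
      if (containerMb > limits.maxAllocationMb)
        Some(s"container memory $containerMb MB > max ${limits.maxAllocationMb} MB")
      else None
    coreIssue.toSeq ++ memIssue.toSeq
  }
}
```

Here `containerMb` is meant to be the full container size (executor memory plus overhead), not just the heap.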

Another YARN goodie…
Container [pid=63375,containerID=container_1388158490598_0001_01_000003] is running beyond physical memory limits. Current usage: 2.1 GB of 2 GB physical memory used; 2.8 GB of 4.2 GB virtual memory used. Killing container.
[...]

Memory allocation
• yarn.nodemanager.resource.memory-mb
• Executor Container
• spark.yarn.executor.memoryOverhead (7%) (10% in 1.4)
• spark.executor.memory
• spark.shuffle.memoryFraction (0.4)
• spark.storage.memoryFraction (0.6)
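The container size YARN enforces is the executor heap plus the overhead. The sketch below works that out using the 7% and 10% factors from the slide. The 384 MB floor is an assumption about how Spark on YARN computed the default overhead in these versions, and `ContainerMemory` is a hypothetical helper, not a Spark API.

```scala
// Hypothetical helper: container size = executor heap + memory overhead.
object ContainerMemory {
  // Assumed minimum overhead in MB applied when the percentage comes out smaller.
  val MinOverheadMb = 384

  // Overhead is a fraction of the heap, but never less than the minimum.
  def overheadMb(executorMemoryMb: Int, factor: Double): Int =
    math.max((factor * executorMemoryMb).toInt, MinOverheadMb)

  // Size of the container requested from YARN for one executor.
  def containerMb(executorMemoryMb: Int, factor: Double): Int =
    executorMemoryMb + overheadMb(executorMemoryMb, factor)
}
```

With an 8192 MB heap, the 7% factor gives a 573 MB overhead, while a 4096 MB heap falls back to the 384 MB floor. Off-heap usage above that overhead is what leads to the "running beyond physical memory limits" kill.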

YARN Overhead
• Future work:
• Better understanding of off heap allocations
• Improve memory usage visibility

Run program through all our data
Courtesy of: https://conniehallscott.files.wordpress.com/2013/01/411748_538971446114753_1125606225_o.jpg

Data dependent tuning
• As data rates change, re-tuning Spark is usually necessary
• Spark is sensitive to shuffle spills
• The most common knob we modify is…

Partitions, Partitions, Partitions!

GC Stalls

Partitions
• Smaller is often better
• Parameterized partition size
• reduceByKey(…, nPartitions)
• Parameterize application
• Future work:
• Dynamically determine # of partitions (SPARK-4630)
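One way to parameterize the application is to take the partition count from the command line rather than hard-coding it. A minimal sketch, assuming the count is the first argument; `PartitionArgs` and the argument position are illustrative choices.

```scala
import scala.util.Try

// Hypothetical helper: read the partition count from the first program argument.
object PartitionArgs {
  // Falls back to the default when the argument is missing, non-numeric, or not positive.
  def numPartitions(args: Array[String], default: Int): Int =
    args.headOption
      .flatMap(s => Try(s.trim.toInt).toOption)
      .filter(_ > 0)
      .getOrElse(default)
}
```

The result can then be passed as the second argument to calls such as reduceByKey(…, nPartitions), so retuning does not require a rebuild.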

But for now?
• Easy answer:
• Keep multiplying by 1.5 and see what works
• Harder answer:

Shuffle less!
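For the easy answer, the list of partition counts to try can be generated up front. A small sketch: `PartitionSearch` is a hypothetical helper, and rounding up at each step is an arbitrary choice.

```scala
// Hypothetical helper: partition counts to try, growing by a fixed factor each step.
object PartitionSearch {
  def candidates(start: Int, steps: Int, factor: Double = 1.5): List[Int] =
    Iterator
      .iterate(start)(n => math.ceil(n * factor).toInt) // round up so the count always grows
      .take(steps)
      .toList
}
```

Starting from 100 partitions, four steps give 100, 150, 225 and 338.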

Shuffles
• Wide Dependency
• Narrow Dependencies

ReduceByKey when Possible
• reduceByKey allows a map-side combine
  parsed
    .map { line => (line.level, 1) }
    .reduceByKey { (a, b) => a + b }
    .collect()
• groupByKey transfers all the data
  parsed
    .map { line => (line.level, 1) }
    .groupByKey()
    .map { case (level, counts) => (level, counts.sum) }
    .collect()
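To see why the map-side combine matters, the sketch below imitates both strategies with plain Scala collections instead of RDDs. It is a toy model, not Spark's shuffle: each inner sequence stands for one input partition, and `ShuffleSketch` and its methods are hypothetical names.

```scala
// Toy model of a shuffle: counts how many records would leave the map side.
object ShuffleSketch {
  type Record = (String, Int)

  // groupByKey style: every record crosses the shuffle unchanged.
  def shuffleWithoutCombine(partitions: Seq[Seq[Record]]): Seq[Record] =
    partitions.flatten

  // reduceByKey style: each partition sums its values per key before shuffling.
  def shuffleWithCombine(partitions: Seq[Seq[Record]]): Seq[Record] =
    partitions.flatMap { part =>
      part.groupBy(_._1).map { case (key, recs) => (key, recs.map(_._2).sum) }.toSeq
    }

  // Reduce side: final sum per key, identical for both strategies.
  def reduceSide(shuffled: Seq[Record]): Map[String, Int] =
    shuffled.groupBy(_._1).map { case (key, recs) => (key, recs.map(_._2).sum) }
}
```

Both paths produce the same counts per level, but the combined path sends at most one record per key per partition across the network.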

ReduceByKey when Possible
• ReduceByKey
• GroupByKey

Security, now it’s getting serious.
Courtesy of: https://www.iti.illinois.edu/sites/default/files/Cybersecurity_image.jpg

Authentication
• Kerberos – the necessary evil
• Ubiquitous amongst other services
• YARN, HDFS, Hive, HBase, etc.
• Spark utilizes delegation tokens

Encryption
• Control plane: SPARK-6028 (Replace with netty)
• File distribution: Replace with netty
• Block Manager: Spark 1.4
• User UI / REST API: SPARK-2750 (SSL)
• Data-at-rest (shuffle files): SPARK-5682

Authorization
• Enterprises have sensitive data
• Beyond HDFS file permissions
• Partial access to data
• Column level granularity
• Apache Sentry
• HDFS-Sentry synchronization plugin

Customers often have shared infrastructure
Courtesy of: https://radioglobalistic.files.wordpress.com/2011/02/lagos-traffic.jpg

Multi-tenancy
• Cluster utilization is top metric
• Target: 70-80% utilization
• Mixed workloads from mixed customers
• We recommend YARN
• Built in resource manager

Underutilized Clusters
Courtesy of: http://media.nbclosangeles.com/images/1200*675/60-freeway-repair-dec16-2-empty.JPG

Dynamic Allocation
• Allows jobs to scale to size according to load
• Knobs to control min, max and initial size
• Future Work:
• Target: Dynamic allocation enabled by default
• Data locality & Caching
• Open question with Streaming
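The min, max and initial knobs correspond to spark.dynamicAllocation.minExecutors, spark.dynamicAllocation.maxExecutors and spark.dynamicAllocation.initialExecutors. The sketch below shows how such bounds constrain the executor count as load changes. It is a deliberately simplified model, not Spark's actual allocation policy (which also involves backlog and idle timeouts), and `AllocationBounds` is a hypothetical helper.

```scala
// Simplified model of how min/max bounds clamp the executor count under load.
object AllocationBounds {
  final case class Bounds(min: Int, max: Int, initial: Int) {
    require(min <= initial && initial <= max, "expected min <= initial <= max")
  }

  // Executors needed to run all outstanding tasks at once, clamped to [min, max].
  def targetExecutors(outstandingTasks: Int, coresPerExecutor: Int, bounds: Bounds): Int = {
    val needed = (outstandingTasks + coresPerExecutor - 1) / coresPerExecutor // ceiling division
    math.min(bounds.max, math.max(bounds.min, needed))
  }
}
```

With bounds of 2 to 20 executors and 4 cores each, an idle job holds 2 executors, 30 outstanding tasks call for 8, and 1000 tasks are capped at 20.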

Thank you
We’re Hiring!
