Building Robust, Adaptive Streaming Apps with Spark Streaming

Tathagata “TD” Das
@tathadas
Spark Summit East 2016

Who am I?
Project Management Committee (PMC) member of Spark
Started Spark Streaming in grad school - AMPLab, UC Berkeley
Current technical lead of Spark Streaming
Software engineer at Databricks

Streaming Apps in the Real World
Building a high-volume stream processing system in production has many challenges
Fast and Scalable
Spark Streaming is fast, distributed and scalable by design!
Running in production at many places with large clusters and high data volumes

Fast and Scalable
Easy to program
Spark Streaming makes it easy to express complex streaming business logic
Interoperates with Spark RDDs, Spark SQL DataFrames/Datasets and MLlib

Fast and Scalable
Easy to program
Fault-tolerant
Spark Streaming is fully fault-tolerant and can provide end-to-end semantic guarantees
See my previous Spark Summit talk for more details

Fast and Scalable
Easy to program
Fault-tolerant
Adaptive: the focus of this talk

Adaptive Streaming Apps
Processing conditions can change dynamically
Sudden surges in data rates
Diurnal variations in processing load
Unexpected slowdowns in downstream data stores
Streaming apps should be able to adapt accordingly

Backpressure
Make apps robust against data surges
Elastic Scaling
Make apps scale with load variations

Backpressure
Make apps robust against data surges

Motivation
Stability condition for any streaming app
Receive data only as fast as the system can process it
Stability condition for Spark Streaming’s “micro-batch” model
Finish processing previous batch before next one arrives

Stable micro-batch operation
Spark Streaming runs micro-batches at fixed batch intervals
batch processing time <= batch interval
[Figure: timeline from 0s to 8s with a batch every 2s, each batch taking 1s to process]
Previous batch is processed before the next one arrives => stable

Unstable micro-batch operation
Spark Streaming runs micro-batches at fixed batch intervals
batch processing time > batch interval
[Figure: timeline from 0s to 8s with a batch every 2s, each batch taking 2.1s to process; scheduling delay builds up]
Batches continuously get delayed and backlogged => unstable
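
The two timelines above can be reproduced with a small simulation. This is only a sketch of the queueing arithmetic, not Spark code: the function name `scheduling_delays` and the assumption that a batch becomes ready exactly at the end of its interval are illustrative.

```python
def scheduling_delays(batch_interval_s, processing_time_s, num_batches):
    """Return how long each batch waits before processing can start."""
    delays = []
    free_at = 0.0  # time at which the previous batch finishes processing
    for i in range(num_batches):
        # Assumption: batch i becomes ready at the end of its interval.
        ready_at = (i + 1) * batch_interval_s
        # A batch starts once it is ready AND the previous batch is done.
        start_at = max(ready_at, free_at)
        delays.append(round(start_at - ready_at, 6))
        free_at = start_at + processing_time_s
    return delays


# 2s batch interval, as in the timelines (0s, 2s, 4s, ...).
stable = scheduling_delays(2.0, 1.0, 4)    # processing time <= interval
unstable = scheduling_delays(2.0, 2.1, 4)  # processing time > interval
print(stable)
print(unstable)
```

With 1s of processing per 2s batch the delay stays at zero; with 2.1s per batch each batch waits about 0.1s longer than the one before it, which is the backlog shown on the unstable slide.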

Backpressure: Feedback Loop
Backpressure introduces a feedback loop to dynamically adapt the system and avoid instability

Backpressure: Dynamic rate limiting
Batch processing times and scheduling delays are used to continuously estimate current processing rates
[Figure labels: batch processing times, scheduling delays]

Max stable processing rate is estimated with PID controller theory
A well-known feedback mechanism used in industrial control systems
[Figure labels: scheduling delays, PID Controller]

Accordingly, the system dynamically adapts the limits on the data ingestion rates
[Figure labels: scheduling delays, PID Controller, rate limits on ingestion]
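
To make the feedback loop concrete, here is a simplified PID-style rate limiter. It is a sketch under stated assumptions, not Spark's implementation: the class name `PidRateLimiter`, the gain values, the minimum rate, and the way scheduling delay is converted into an error term are all illustrative choices.

```python
from dataclasses import dataclass


@dataclass
class PidRateLimiter:
    """Toy PID-style estimator of a stable ingestion rate (records/sec)."""
    batch_interval_s: float
    kp: float = 1.0          # proportional gain (illustrative value)
    ki: float = 0.2          # integral gain (illustrative value)
    kd: float = 0.0          # derivative gain (illustrative value)
    min_rate: float = 100.0  # never throttle below this (illustrative value)
    rate: float = -1.0       # current limit; negative means "not set yet"
    last_error: float = 0.0
    last_time_s: float = 0.0

    def update(self, time_s, num_records, processing_time_s, scheduling_delay_s):
        """Feed in one completed batch and return the new rate limit."""
        if num_records <= 0 or processing_time_s <= 0:
            return self.rate
        processing_rate = num_records / processing_time_s
        if self.rate < 0:
            # First observation: start from the measured processing rate.
            self.rate = processing_rate
            self.last_time_s = time_s
            return self.rate
        dt = time_s - self.last_time_s
        if dt <= 0:
            return self.rate
        # Proportional term: how far the limit is above what was processed.
        error = self.rate - processing_rate
        # Integral-like term: backlog implied by the scheduling delay,
        # expressed as a rate over one batch interval.
        backlog_error = scheduling_delay_s * processing_rate / self.batch_interval_s
        # Derivative term: how quickly the error is changing.
        d_error = (error - self.last_error) / dt
        new_rate = (self.rate - self.kp * error
                    - self.ki * backlog_error - self.kd * d_error)
        self.rate = max(new_rate, self.min_rate)
        self.last_error = error
        self.last_time_s = time_s
        return self.rate


limiter = PidRateLimiter(batch_interval_s=2.0)
first = limiter.update(2.0, 20000, 1.0, 0.0)   # healthy batch
second = limiter.update(4.0, 20000, 4.0, 2.0)  # slow batch with 2s delay
print(first, second)
```

In this example the second batch is processed four times more slowly and has accumulated scheduling delay, so the sketch lowers the limit well below the first estimate.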

If HDFS ingestion slows down, processing times increase
Spark Streaming lowers rate limits to slow down receiving from Kafka
Data stays buffered in Kafka, which ensures Spark Streaming stays stable

Backpressure: Configuration
Available since Spark 1.5
Enabled through SparkConf, set
spark.streaming.backpressure.enabled=true
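
One way to pass this property at launch time is through `--conf` arguments to `spark-submit`. The helper below is a hypothetical convenience (`to_spark_submit_flags` is not part of Spark); it only renders a dict of properties into those arguments.

```python
def to_spark_submit_flags(settings):
    """Render a dict of Spark properties as spark-submit --conf arguments."""
    flags = []
    for key, value in sorted(settings.items()):
        flags += ["--conf", f"{key}={value}"]
    return flags


# The property name comes from the slide; the dict itself is illustrative.
backpressure_settings = {"spark.streaming.backpressure.enabled": "true"}
backpressure_flags = to_spark_submit_flags(backpressure_settings)
print(" ".join(backpressure_flags))
```

The same key and value can instead be set in application code with `SparkConf.set(key, value)` before the streaming context is created.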

Elastic Scaling
Make apps scale with load variations

Elastic Scaling (aka Dynamic Allocation)
Scaling the number of Spark executors according to the load
Spark already supports Dynamic Allocation for batch jobs
Scale down if executors are idle
Scale up if tasks are queueing up
Streaming “micro-batch” jobs need a different scaling policy
No executor is idle for a long time!

Scaling policy with Streaming
Load is very low
Lots of idle time between batches
Cluster resources wasted
=> stable but inefficient

Scale down the cluster
Scaling down increases batch processing times
Stable operation with fewer resources wasted
stable but inefficient => stable and efficient

Scale up the cluster
Scaling up reduces batch processing times
Unstable system becomes stable again
unstable => stable
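
The policy on these slides can be sketched as a comparison of batch processing time against the batch interval. The function name `scaling_decision` and both thresholds are illustrative assumptions, not values stated in the talk.

```python
def scaling_decision(avg_processing_time_s, batch_interval_s,
                     scale_up_ratio=0.9, scale_down_ratio=0.3):
    """Decide whether to add or remove executors for a streaming app.

    The ratio thresholds are illustrative assumptions.
    """
    ratio = avg_processing_time_s / batch_interval_s
    if ratio >= scale_up_ratio:
        # Batches barely fit (or do not fit) in the interval: unstable.
        return "scale up"
    if ratio <= scale_down_ratio:
        # Mostly idle time between batches: stable but inefficient.
        return "scale down"
    return "hold"


print(scaling_decision(2.1, 2.0))  # overloaded
print(scaling_decision(0.2, 2.0))  # mostly idle
print(scaling_decision(1.0, 2.0))  # comfortable
```

Keeping a gap between the two thresholds avoids flapping between scale-up and scale-down when the load sits near a single cutoff.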

Elastic Scaling
If Kafka gets data faster than what backpressure allows
Spark Streaming scales up the cluster to increase the processing rate
Data buffered in Kafka starts draining, allowing the app to adapt to any data rate

Elastic Scaling: Configuration
Will be available in Spark 2.0
Enabled through SparkConf, set
spark.streaming.dynamicAllocation.enabled = true
More parameters will be in the online programming guide

Elastic Scaling: Configuration
Make sure there is enough parallelism to take advantage of the max cluster size
# of partitions in reduce, join, etc.
# of Kafka partitions
# of receivers
Gives the usual fault-tolerance guarantees with files, Kafka Direct, Kinesis, and receiver-based sources with WAL enabled
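
The parallelism point can be checked with simple arithmetic: a stage cannot run more tasks at once than it has partitions, so extra cores beyond that sit idle. The sketch below assumes one core per task; the function name and the example numbers are illustrative.

```python
def max_concurrent_tasks(num_partitions, num_executors, cores_per_executor):
    """Upper bound on tasks a single stage can run at the same time.

    Assumes each task uses exactly one core.
    """
    total_cores = num_executors * cores_per_executor
    return min(num_partitions, total_cores)


# 8 partitions cannot use a 40-core cluster: only 8 tasks run at once.
print(max_concurrent_tasks(8, 10, 4))
# 64 partitions can keep all 40 cores busy.
print(max_concurrent_tasks(64, 10, 4))
```

If the partition count is below the core count of the largest cluster the app may scale to, scaling up adds executors that the stage cannot use.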

Backpressure + Elastic Scaling

Demo
How the app behaves when the data rate suddenly increases 20x

Demo
Processing time increases with the data rate until it equals the batch interval
Backpressure limits the ingestion rate to below 20k recs/sec to keep the app stable

Demo
Elastic Scaling detects heavy load and increases cluster size
Processing times reduce as more resources become available

Demo
Backpressure relaxes limits to allow a higher ingestion rate
But still less than 20x, as the cluster is fully utilized

Takeaways
Backpressure: makes apps robust to sudden changes
Elastic Scaling: makes apps adapt to slower changes
Backpressure + Elastic Scaling = Awesome Adaptive Apps
Follow me @tathadas
