SlideShare a Scribd company logo
이름 김병진, 권오찬
Extending Spark Streaming to
Support Complex Event
Processing
소속 삼성전자
2015. 10. 27.
Agenda
• Background
• Motivation
• Streaming SQL
• Auto Scaling
• Future Work
Background - Spark
• Apache Spark is a high performance data processing platform
• Use a distributed memory cache (RDD*)
• Process batch and stream data in the common platform
• Be developed by 700+ contributors
3x faster than Storm
(Stream, 30 Nodes)
100x faster than Hadoop
(Batch, 50 Nodes, Clustering Alg.)
*RDD: Resilient Distributed Datasets
Background – Spark RDD
• Resilient Distributed Datasets
• RDDs are immutable
• Transformations are lazy operations which build a RDD’s lineage graph
• E.g.: map, filter, join, union
• Actions launch a computation to return a value or write data
• E.g.: count, collect, reduce, save
• The scheduler builds a DAG of stages to execute
• If a task fails, spark re-runs it on another node as long as its stage’s parent are still
available
Background - Spark Modules
• Spark Streaming
• Divide stream data into micro-batches
• Compute the batches with fault tolerance
• Spark SQL
• Support SQL to handle structured data
• Provide a common way to access a variety of data
sources (Hive, Avro, Parquet, ORC, JSON, and JDBC)
Spark
Spark
Streaming
batches of
X seconds
live data stream
processed
results
DStreams
RDDs
map, reduce, count, …
Motivation – Support CEP in Spark
• Current spark is not enough to support CEP*
• Do not support continuous query language to process stream data
• Do not support auto-scaling to elastically allocate resources
• We have solved the issues:
• Extend Intel’s Streaming SQL package
• Improve performance by optimizing time-based windowed aggregation
• Support query chains by implementing “Insert Into” queries
• Implement elastic-seamless resource allocation
• Can do auto-scale in/out
*Complex Event Processing
Streaming SQL
Streaming SQL - Intel
• Streaming SQL is a third party library of Spark Packages
• Build on top of Spark Streaming and Spark SQL Catalyst
• Manipulate stream data like static structured data in database
• Process queries continuously
• Support time based windowing join/aggregation queries
http://spark-packages.org/package/Intel-bigdata/spark-streamingsql
Streaming
SQL
Streaming
SQL Query
Schema
DStream
Optimized
Logical
Plan
Streaming SQL - Plans
• Logical Plan
• Modify streaming SQL optimizer
• Add windowed aggregate logical plan
• Physical Plan
• Add windowed aggregate physical plan to call new WindowedStateDStream class
• Implement new expression functions (Count, Sum, Average, Min, Max, Distinct)
• Develop FixedSizedAggregator for efficiency which is the concept of IBM InfoSphere Streams
Samsung
Stream
Plan
Intel
Logical
Plan
Physical
Plan
Streaming SQL – Windowed State
• WindowedStateDStream class
• Modify StateDStream class to support windowed computing
• Add inverse update function to evaluate old values
• Add filter function to remove obsolete keys
Intel Samsung
Compute all elements in the window
time 1 time 2 time 3 time 4 time 5
Original
DStream
Windowed
State
DStream
time 1 time 2 time 3 time 4 time 5
Original
DStream
Windowed
DStream
Compute only Delta (old and new elements)
window-based
operation Update (in)
InvUpdate (out)
State: Array[AggregateFunction]
Count
Update
InvUpdate
Update
InvUpdate
State
Sum
Update
InvUpdate
State
Avg
Update
InvUpdate
State
Min
Update
InvUpdate
State
Max
Update
InvUpdate
State
Streaming SQL – Fixed Sized Aggregator
• Fixed-Sized Aggregator* of IBM InfoSphere Streams
• Use fixed sized binary tree
• Maintain leaf nodes as a circular buffer using front and back pointers
• Very efficient for non-invertable expressions (e.g. Min, Max)
• Example - Min Aggregator
4 7 3
4 3
3
4 7 3 2
4 2
2
7 3 2
7 2
2
9 7 3 2
7 2
2
4
4
4
4 7
4
4
*K. Tangwongsan, M. Hirzel, S. Schneider, K-L. Wu, “General incremental sliding-window aggregation,” In VLDB, 2015.
Streaming SQL – Aggregate Functions
• Windowed Aggregate Functions
• Add invUpdate function
• Reduce objects for efficient serialization
• Implement Fixed-Sized Aggregator for Min, Max functions
Aggregate Functions State Update InvUpdate
Count countValue: Long Increase countValue Decrease countValue
Sum sumValue: Any Add to sumValue Subtract to sumValue
Average sumValue: Any
countValue: Long
Increase countValue
Add to sumValue
Decrease countValue
Subtract to sumValue
Min fat: FixedSizedAggregator Insert minimum to fat Remove the oldest from fat
Max fat: FixedSizedAggregator Insert maximum to fat Remove the oldest from fat
CountDistinct distinctMap:
mutable.HashMap[Row, Long]
Increase count of distinct value
in map
Decrease count of distinct value
in map
Streaming SQL – Insert Into
• Support “Insert Into” query
• Implement Data Sources API
• Implement the insert function in Kafka relation
• Modified some physical plans to support streaming
• Modified physical planning strategies to assign a specific plan
• Convert current RDD to a DataFrame and then insert it
• Support query chaining
 The result a query can be reused by multiple queries
Example2
Kafka Kafka
INSERT INTO TABLE
Example1
SELECT duid, time
FROM ParsedTable
INSERT INTO TABLE Example2
SELECT duid, COUNT (*)
FROM Example1
GROUP BY duidParsed
Table
Example1
Kafka
Kafka Relation InsertIntoStreamSource
StreamExecutedCommandInsertIntoTable
Kafka
Logical Plan Physical Plan
Kafka Relation
DataFrame
Streaming SQL – Evaluation
• Experimental Environment
• Processing Node: Intel i5 2.67GHz, 4 Cores, 4GB per node
Spark Cluster: 7 Nodes
Kafka Cluster: 3 Nodes
• Test Query:
SELECT t.word, COUNT(t.word), SUM(t.num), AVG(t.num), MIN(t.num), MAX(t.num)
FROM (SELECT * FROM t_kafka) OVER (WINDOW 'x' SECONDS, SLIDE ‘1' SECONDS) AS t
GROUP BY t.word
• Default Set:
100 EPS, 100 Keys, 20 sec Window, 1 sec Slide, 10 Executors, 10 Reducers
Spark Cluster
Test
Agent
Kafka Cluster
Spark UI
Streaming SQL – Evaluation
• Test Result
• Show low processing delays despite of heavy event loads or large-sized windows
• Need memory optimization
0
200
400
600
800
1000
1200
1400
20 40 60 80 100
Delay(ms)
Window Size (sec)
Intel Samsung
0
1000
2000
3000
4000
5000
6000
7000
8000
9000
10000
20 40 60 80 100
Memory(KB)
Window Size (sec)
Intel Samsung
0
1000
2000
3000
4000
5000
10 100 1000 10000
Delay(ms)
EPS
Intel Samsung
0
2000
4000
6000
8000
10000
12000
10 100 1000 10000
Memory(KB)
EPS
Intel Samsung
Auto Scaling
Time-varying Event Rate in Real World
• Spark Streaming
• Data can be ingested from many sources like Kafka, Flume, Twitter, etc.
• Live input data streams are divided into batches, and they are processed
• In streaming application, event rate may change frequently over time
• How can this be dealt with?
< Event Rate > < Batch Processing Time>
High event rate Can’t be processed in real-time
Spark AS-IS
• Spark currently supports dynamic resource allocation on Yarn
(SPARK-3174) and coarse-grained Mesos (SPARK-6287)
• Existing Dynamic Resource Allocation is not optimized for streaming
• Even though event rate becomes smaller, executors may not be removed due to
scheduling policy
• It is difficult to determine appropriate configuration parameters
• Backpressure (SPARK-7398)
• Enable the Spark streaming to control the receiving rate dynamically for handling bursty input
streams
• Not real-time
Driver
Task queue Task
Executor #1
...
Executor #2
Executor #n
• Upper/lower bound for the number of executors
• spark.dynamicAllocation.minExecutors
• spark.dynamicAllocation.maxExecutors
• Scale-out condition
• schedulerBacklogTimeout < staying time of a task in task queue
• Scale-in condition
• executorIdleTimeout < staying time of an executor in idle state
• Goal
• Allocate resources to streaming applications dynamically as the rate of incoming events varies
over time
• Enable the applications to meet real-time deadlines
• Utilize the resource efficiently
• Cloud Architecture
Elastic-seamless Resource Allocation
Flint*
*Flint: our spark job manager
• Spark currently supports three cluster managers
• Standalone, Apache Mesos, Hadoop Yarn
• Spark on Mesos
• Fine-grained mode
• Each Spark task runs as a separate Mesos task.
• Launching overhead is big, so it is not suitable for streaming.
• Coarse-grained mode
• Launch only one long-running Spark task on each Mesos machine
• Cannot scale-out more than the number of Mesos machines
• Cannot control executor resource
• Only available for total resource
• Both of two modes have some problems to achieve our goal
Spark Deployment Architecture
Flint Architecture Overview
Slave 1 Slave 2 Slave n
. . .
MESOS Cluster
Cluster 1
YARN
Cluster n
YARN
. . .
. . .
NM 1
NM 2
. . .
NM n NM 1
NM 2
NM n
. . .
App 1
SPARK
App n
SPARK
EXEC 1
EXEC 2
EXEC n. . . EXEC 1
EXEC 2
EXEC n. . .
FLINT
REST
Manager
Scheduler
Application
Manager
Application
Handler
Application
Handler
Application
Handler
Deploy
Manager
Log
Manager
• Marathon, Zookeeper, ETCD, HDFS
Spark on on-demand Yarn
• Job submission process
1. Request to deploy dockerized YARN via Marathon
2. Launch ResourceManager, NodeManagers for Spark driver/executors
3. Check ResourceManager status
4. Submit Spark Job and check Job status
5. Watch Spark driver status and get Spark endpoint
Flint Architecture: Job Submission
Slave 1 Slave n
. . .
MESOS Cluster
Slave 1 Slave n
. . .
MESOS Cluster
Cluster
YARN
NM 1
NM 2
. . .
NM n
Slave 1 Slave n
. . .
MESOS Cluster
Cluster
YARN
NM 1
NM 2
. . .
NM n
App
SPARK
EXEC 1
EXEC 2
EXEC n. . .
Create Yarn Cluster
Submit Spark Job
• Scale-out Process
1. Request to increase the instance of NodeManager via Marathon
2. Launch new NodeManager
3. Scale-out Spark executor
Flint Architecture: Scale-out
Slave 1 Slave n
. . .
MESOS Cluster
Cluster
YARN
NM 1
. . .
NM n
App
SPARK
EXEC 1 EXEC n. . .
Slave 1 Slave n
. . .
MESOS Cluster
Cluster
YARN
NM 1
. . .
NM n
App
SPARK
EXEC 1 EXEC n. . .
NM 2
Slave 1 Slave n
. . .
MESOS Cluster
Cluster
YARN
NM 1
. . .
NM n
NM 2
App
SPARK
EXEC 1 EXEC n. . .
EXEC 2
Request new NodeManager Scale-out Spark executor
• Scale-in Process
1. Get Executor’s info and select spark victims
2. Inactivate spark victims and kill them after n x batch interval
3. Get Yarn victims and decommission NodeManager via ResourceManager
4. Get mesos task id of Yarn victims and kill mesos tasks of victims
Flint Architecture: Scale-in
Slave 1 Slave n
. . .
MESOS Cluster
Cluster
YARN
NM 1
. . .
NM n
NM 2
App
SPARK
EXEC 1 EXEC n. . .
EXEC 2
Slave 1 Slave n
. . .
MESOS Cluster
Cluster
YARN
NM 1
. . .
NM n
NM 2
App
SPARK
EXEC 1 EXEC n. . .
Slave 1 Slave n
. . .
MESOS Cluster
Cluster
YARN
NM 1
. . .
NM n
NM 2
App
SPARK
EXEC 1 EXEC n. . .
Scale-in Spark executor Remove NodeManager
• Auto-scaling Mechanism
• Real-time constraint
• Batch processing time < Batch interval
• Scale-out condition
• α x batch interval < batch processing delay
• Scale-in condition
• β x batch interval > batch processing delay
Flint Architecture: Auto-scaling
( 0 < β < α ≤ 1 )
Processing Time
Batch Interval
α x Batch Intervalβ x Batch Interval
Scale-in Scale-out
0 Time
• Timeout & Retry for Killed Executor
• Although Spark executor is killed by admin, Spark re-tries to connect it.
Spark Issues: Scale-in (1/3)
Scale-in
Scheduling Delay
Event Rate
Processing Delay
Driver Exec*
scale-in
Exec Task
RDD req
1st try
2nd try
3rd try
fail
Processing
Delay
• spark.shuffle.io.maxRetries = 3
• spark.shuffle.io.retryWait = 5s
• Timeout & Retry for Killed Executor
• Advertisement for killed executor
• Inactivate executor before scale-in
• Inactivate executor and kill them after n x batch interval
• During inactivating stage, the executor does not receive any task from driver.
Spark Issues: Scale-in (2/3)
Driver Exec*
scale-in
Exec Task
RDD req
fail
advertise
Fast
Fail
• Data Locality
• Spark scheduler consider data locality
• Processing/Node/Rack locality
• spark.locality.wait = 3s
• However, waiting time for locality is big burden to streaming applications
• The waiting time should be much less than batch interval for streaming
• Otherwise, streaming may not be processed in real-time
• Thus, Flint overrides “spark.locality.wait” to very small value if application type is streaming
Spark Issues: Scale-in (3/3)
Scheduler Cluster
TTT T …
3s10ms
(max)
rsc alloc
T
• Data Localization
• During scale-out process, new Yarn container performs to localize some data
(e.g. spark jar, application jar)
• Data localization incurs high disk I/O, so
• Solution
• Prefetch if possible
• Disk isolation, SSD
Spark Issues: Scale-out (1/2)
Existing
App
New
App
Disk
Access disk during localization
(50-70 MB/sec)
Performance
Degradation
• RDD Replication
• When a new executor is added, a receiver requests to replicate received RDD blocks to the
executor
• New executor does not ready to receive RDD blocks during an initialization
• At this situation, the receiver waits until new executor is ready
• Solution
• A receiver does not replicate RDD blocks to new executor during initialization
• E.g., spark.blockmanager.peer.bootstrap = 10s
Spark Issues: Scale-out (2/2)
Kafka
R
RT
R R
Executor R
R
T
R
Executor 1
R
T
R
Executor 2
RT
T
Receiver Task
Task
R RDDReplicate factor = 2
• Demo Environment
• Data source: Kafka
• Auto-scaling parameter
• α=0.4, β=0.9
• SQL query
• SELECT t.word, COUNT(DISTINCT t.num), SUM(t.num), AVG(t.num), MIN(t.num), MAX(t.num) FROM
(SELECT * FROM t_kafka) OVER (WINDOW 300 SECONDS, SLIDE 3 SECONDS) AS t GROUP BY t.word
Demo (1/2)
Task
Task
Producer
Rcvr
8 3 4 2
5 7 9 1
0 3 7 8
Task Driver
0: 23
1: 34
2: 41
…
Flint
Monitoring
Scale-in/Out
Demo (2/2)
Future Work
• Streaming SQL
• Apply Tungsten Framework
• Small-sized object, code gen, GC free memory management
• Share RDDs
• Map stream tables to RDD
• Auto Scaling
• Support to dynamically allocate resources for batch (non-streaming) applications
• Support unified schedulers for heterogeneous applications
• Batch, streaming, ad-hoc
THANK YOU!
https://github.com/samsung/spark-cep

More Related Content

What's hot

Akka in Production - ScalaDays 2015
Akka in Production - ScalaDays 2015Akka in Production - ScalaDays 2015
Akka in Production - ScalaDays 2015
Evan Chan
 
Cassandra and SparkSQL: You Don't Need Functional Programming for Fun with Ru...
Cassandra and SparkSQL: You Don't Need Functional Programming for Fun with Ru...Cassandra and SparkSQL: You Don't Need Functional Programming for Fun with Ru...
Cassandra and SparkSQL: You Don't Need Functional Programming for Fun with Ru...
Databricks
 
Lambda architecture on Spark, Kafka for real-time large scale ML
Lambda architecture on Spark, Kafka for real-time large scale MLLambda architecture on Spark, Kafka for real-time large scale ML
Lambda architecture on Spark, Kafka for real-time large scale ML
huguk
 
Spark Tips & Tricks
Spark Tips & TricksSpark Tips & Tricks
Spark Tips & Tricks
Jason Hubbard
 
Re-Architecting Spark For Performance Understandability
Re-Architecting Spark For Performance UnderstandabilityRe-Architecting Spark For Performance Understandability
Re-Architecting Spark For Performance Understandability
Jen Aman
 
Spark on Mesos
Spark on MesosSpark on Mesos
Spark on Mesos
Jen Aman
 
Spark Summit EU talk by Berni Schiefer
Spark Summit EU talk by Berni SchieferSpark Summit EU talk by Berni Schiefer
Spark Summit EU talk by Berni Schiefer
Spark Summit
 
Deploying Accelerators At Datacenter Scale Using Spark
Deploying Accelerators At Datacenter Scale Using SparkDeploying Accelerators At Datacenter Scale Using Spark
Deploying Accelerators At Datacenter Scale Using Spark
Jen Aman
 
Spark + AI Summit 2019: Headaches and Breakthroughs in Building Continuous Ap...
Spark + AI Summit 2019: Headaches and Breakthroughs in Building Continuous Ap...Spark + AI Summit 2019: Headaches and Breakthroughs in Building Continuous Ap...
Spark + AI Summit 2019: Headaches and Breakthroughs in Building Continuous Ap...
Landon Robinson
 
Investing the Effects of Overcommitting YARN resources
Investing the Effects of Overcommitting YARN resourcesInvesting the Effects of Overcommitting YARN resources
Investing the Effects of Overcommitting YARN resources
DataWorks Summit/Hadoop Summit
 
Experience of Running Spark on Kubernetes on OpenStack for High Energy Physic...
Experience of Running Spark on Kubernetes on OpenStack for High Energy Physic...Experience of Running Spark on Kubernetes on OpenStack for High Energy Physic...
Experience of Running Spark on Kubernetes on OpenStack for High Energy Physic...
Databricks
 
Taming YARN @ Hadoop Conference Japan 2014
Taming YARN @ Hadoop Conference Japan 2014Taming YARN @ Hadoop Conference Japan 2014
Taming YARN @ Hadoop Conference Japan 2014
Tsuyoshi OZAWA
 
Reactive app using actor model & apache spark
Reactive app using actor model & apache sparkReactive app using actor model & apache spark
Reactive app using actor model & apache spark
Rahul Kumar
 
Top 5 Mistakes to Avoid When Writing Apache Spark Applications
Top 5 Mistakes to Avoid When Writing Apache Spark ApplicationsTop 5 Mistakes to Avoid When Writing Apache Spark Applications
Top 5 Mistakes to Avoid When Writing Apache Spark Applications
Cloudera, Inc.
 
2015 01-17 Lambda Architecture with Apache Spark, NextML Conference
2015 01-17 Lambda Architecture with Apache Spark, NextML Conference2015 01-17 Lambda Architecture with Apache Spark, NextML Conference
2015 01-17 Lambda Architecture with Apache Spark, NextML Conference
DB Tsai
 
Realtime olap architecture in apache kylin 3.0
Realtime olap architecture in apache kylin 3.0Realtime olap architecture in apache kylin 3.0
Realtime olap architecture in apache kylin 3.0
Shi Shao Feng
 
Top 5 Mistakes When Writing Spark Applications
Top 5 Mistakes When Writing Spark ApplicationsTop 5 Mistakes When Writing Spark Applications
Top 5 Mistakes When Writing Spark Applications
Spark Summit
 
Spark on YARN
Spark on YARNSpark on YARN
Spark on YARN
Adarsh Pannu
 
700 Updatable Queries Per Second: Spark as a Real-Time Web Service
700 Updatable Queries Per Second: Spark as a Real-Time Web Service700 Updatable Queries Per Second: Spark as a Real-Time Web Service
700 Updatable Queries Per Second: Spark as a Real-Time Web Service
Evan Chan
 
Dynamic Resource Allocation Spark on YARN
Dynamic Resource Allocation Spark on YARNDynamic Resource Allocation Spark on YARN
Dynamic Resource Allocation Spark on YARN
Tsuyoshi OZAWA
 

What's hot (20)

Akka in Production - ScalaDays 2015
Akka in Production - ScalaDays 2015Akka in Production - ScalaDays 2015
Akka in Production - ScalaDays 2015
 
Cassandra and SparkSQL: You Don't Need Functional Programming for Fun with Ru...
Cassandra and SparkSQL: You Don't Need Functional Programming for Fun with Ru...Cassandra and SparkSQL: You Don't Need Functional Programming for Fun with Ru...
Cassandra and SparkSQL: You Don't Need Functional Programming for Fun with Ru...
 
Lambda architecture on Spark, Kafka for real-time large scale ML
Lambda architecture on Spark, Kafka for real-time large scale MLLambda architecture on Spark, Kafka for real-time large scale ML
Lambda architecture on Spark, Kafka for real-time large scale ML
 
Spark Tips & Tricks
Spark Tips & TricksSpark Tips & Tricks
Spark Tips & Tricks
 
Re-Architecting Spark For Performance Understandability
Re-Architecting Spark For Performance UnderstandabilityRe-Architecting Spark For Performance Understandability
Re-Architecting Spark For Performance Understandability
 
Spark on Mesos
Spark on MesosSpark on Mesos
Spark on Mesos
 
Spark Summit EU talk by Berni Schiefer
Spark Summit EU talk by Berni SchieferSpark Summit EU talk by Berni Schiefer
Spark Summit EU talk by Berni Schiefer
 
Deploying Accelerators At Datacenter Scale Using Spark
Deploying Accelerators At Datacenter Scale Using SparkDeploying Accelerators At Datacenter Scale Using Spark
Deploying Accelerators At Datacenter Scale Using Spark
 
Spark + AI Summit 2019: Headaches and Breakthroughs in Building Continuous Ap...
Spark + AI Summit 2019: Headaches and Breakthroughs in Building Continuous Ap...Spark + AI Summit 2019: Headaches and Breakthroughs in Building Continuous Ap...
Spark + AI Summit 2019: Headaches and Breakthroughs in Building Continuous Ap...
 
Investing the Effects of Overcommitting YARN resources
Investing the Effects of Overcommitting YARN resourcesInvesting the Effects of Overcommitting YARN resources
Investing the Effects of Overcommitting YARN resources
 
Experience of Running Spark on Kubernetes on OpenStack for High Energy Physic...
Experience of Running Spark on Kubernetes on OpenStack for High Energy Physic...Experience of Running Spark on Kubernetes on OpenStack for High Energy Physic...
Experience of Running Spark on Kubernetes on OpenStack for High Energy Physic...
 
Taming YARN @ Hadoop Conference Japan 2014
Taming YARN @ Hadoop Conference Japan 2014Taming YARN @ Hadoop Conference Japan 2014
Taming YARN @ Hadoop Conference Japan 2014
 
Reactive app using actor model & apache spark
Reactive app using actor model & apache sparkReactive app using actor model & apache spark
Reactive app using actor model & apache spark
 
Top 5 Mistakes to Avoid When Writing Apache Spark Applications
Top 5 Mistakes to Avoid When Writing Apache Spark ApplicationsTop 5 Mistakes to Avoid When Writing Apache Spark Applications
Top 5 Mistakes to Avoid When Writing Apache Spark Applications
 
2015 01-17 Lambda Architecture with Apache Spark, NextML Conference
2015 01-17 Lambda Architecture with Apache Spark, NextML Conference2015 01-17 Lambda Architecture with Apache Spark, NextML Conference
2015 01-17 Lambda Architecture with Apache Spark, NextML Conference
 
Realtime olap architecture in apache kylin 3.0
Realtime olap architecture in apache kylin 3.0Realtime olap architecture in apache kylin 3.0
Realtime olap architecture in apache kylin 3.0
 
Top 5 Mistakes When Writing Spark Applications
Top 5 Mistakes When Writing Spark ApplicationsTop 5 Mistakes When Writing Spark Applications
Top 5 Mistakes When Writing Spark Applications
 
Spark on YARN
Spark on YARNSpark on YARN
Spark on YARN
 
700 Updatable Queries Per Second: Spark as a Real-Time Web Service
700 Updatable Queries Per Second: Spark as a Real-Time Web Service700 Updatable Queries Per Second: Spark as a Real-Time Web Service
700 Updatable Queries Per Second: Spark as a Real-Time Web Service
 
Dynamic Resource Allocation Spark on YARN
Dynamic Resource Allocation Spark on YARNDynamic Resource Allocation Spark on YARN
Dynamic Resource Allocation Spark on YARN
 

Similar to Spark cep

Apache Spark Components
Apache Spark ComponentsApache Spark Components
Apache Spark Components
Girish Khanzode
 
Headaches and Breakthroughs in Building Continuous Applications
Headaches and Breakthroughs in Building Continuous ApplicationsHeadaches and Breakthroughs in Building Continuous Applications
Headaches and Breakthroughs in Building Continuous Applications
Databricks
 
Infrastructure at Scale: Apache Kafka, Twitter Storm & Elastic Search (ARC303...
Infrastructure at Scale: Apache Kafka, Twitter Storm & Elastic Search (ARC303...Infrastructure at Scale: Apache Kafka, Twitter Storm & Elastic Search (ARC303...
Infrastructure at Scale: Apache Kafka, Twitter Storm & Elastic Search (ARC303...
Amazon Web Services
 
Apache Spark - A High Level overview
Apache Spark - A High Level overviewApache Spark - A High Level overview
Apache Spark - A High Level overview
Karan Alang
 
AWS re:Invent presentation: Unmeltable Infrastructure at Scale by Loggly
AWS re:Invent presentation: Unmeltable Infrastructure at Scale by Loggly AWS re:Invent presentation: Unmeltable Infrastructure at Scale by Loggly
AWS re:Invent presentation: Unmeltable Infrastructure at Scale by Loggly
SolarWinds Loggly
 
Cloud Security Monitoring and Spark Analytics
Cloud Security Monitoring and Spark AnalyticsCloud Security Monitoring and Spark Analytics
Cloud Security Monitoring and Spark Analytics
amesar0
 
Streaming data analytics (Kinesis, EMR/Spark) - Pop-up Loft Tel Aviv
Streaming data analytics (Kinesis, EMR/Spark) - Pop-up Loft Tel Aviv Streaming data analytics (Kinesis, EMR/Spark) - Pop-up Loft Tel Aviv
Streaming data analytics (Kinesis, EMR/Spark) - Pop-up Loft Tel Aviv
Amazon Web Services
 
Apache Spark Core
Apache Spark CoreApache Spark Core
Apache Spark Core
Girish Khanzode
 
Spark Saturday: Spark SQL & DataFrame Workshop with Apache Spark 2.3
Spark Saturday: Spark SQL & DataFrame Workshop with Apache Spark 2.3Spark Saturday: Spark SQL & DataFrame Workshop with Apache Spark 2.3
Spark Saturday: Spark SQL & DataFrame Workshop with Apache Spark 2.3
Databricks
 
Apache Spark
Apache SparkApache Spark
Apache Spark
masifqadri
 
Typesafe spark- Zalando meetup
Typesafe spark- Zalando meetupTypesafe spark- Zalando meetup
Typesafe spark- Zalando meetup
Stavros Kontopoulos
 
Architectures, Frameworks and Infrastructure
Architectures, Frameworks and InfrastructureArchitectures, Frameworks and Infrastructure
Architectures, Frameworks and Infrastructure
harendra_pathak
 
Glint with Apache Spark
Glint with Apache SparkGlint with Apache Spark
Glint with Apache Spark
Venkata Naga Ravi
 
What no one tells you about writing a streaming app
What no one tells you about writing a streaming appWhat no one tells you about writing a streaming app
What no one tells you about writing a streaming app
hadooparchbook
 
What No One Tells You About Writing a Streaming App: Spark Summit East talk b...
What No One Tells You About Writing a Streaming App: Spark Summit East talk b...What No One Tells You About Writing a Streaming App: Spark Summit East talk b...
What No One Tells You About Writing a Streaming App: Spark Summit East talk b...
Spark Summit
 
Global Big Data Conference Sept 2014 AWS Kinesis Spark Streaming Approximatio...
Global Big Data Conference Sept 2014 AWS Kinesis Spark Streaming Approximatio...Global Big Data Conference Sept 2014 AWS Kinesis Spark Streaming Approximatio...
Global Big Data Conference Sept 2014 AWS Kinesis Spark Streaming Approximatio...
Chris Fregly
 
Paris Data Geek - Spark Streaming
Paris Data Geek - Spark Streaming Paris Data Geek - Spark Streaming
Paris Data Geek - Spark Streaming
Djamel Zouaoui
 
Spark and Couchbase: Augmenting the Operational Database with Spark
Spark and Couchbase: Augmenting the Operational Database with SparkSpark and Couchbase: Augmenting the Operational Database with Spark
Spark and Couchbase: Augmenting the Operational Database with Spark
Spark Summit
 
Spark Streaming @ Scale (Clicktale)
Spark Streaming @ Scale (Clicktale)Spark Streaming @ Scale (Clicktale)
Spark Streaming @ Scale (Clicktale)
Yuval Itzchakov
 
Spark streaming high level overview
Spark streaming high level overviewSpark streaming high level overview
Spark streaming high level overview
Avi Levi
 

Similar to Spark cep (20)

Apache Spark Components
Apache Spark ComponentsApache Spark Components
Apache Spark Components
 
Headaches and Breakthroughs in Building Continuous Applications
Headaches and Breakthroughs in Building Continuous ApplicationsHeadaches and Breakthroughs in Building Continuous Applications
Headaches and Breakthroughs in Building Continuous Applications
 
Infrastructure at Scale: Apache Kafka, Twitter Storm & Elastic Search (ARC303...
Infrastructure at Scale: Apache Kafka, Twitter Storm & Elastic Search (ARC303...Infrastructure at Scale: Apache Kafka, Twitter Storm & Elastic Search (ARC303...
Infrastructure at Scale: Apache Kafka, Twitter Storm & Elastic Search (ARC303...
 
Apache Spark - A High Level overview
Apache Spark - A High Level overviewApache Spark - A High Level overview
Apache Spark - A High Level overview
 
AWS re:Invent presentation: Unmeltable Infrastructure at Scale by Loggly
AWS re:Invent presentation: Unmeltable Infrastructure at Scale by Loggly AWS re:Invent presentation: Unmeltable Infrastructure at Scale by Loggly
AWS re:Invent presentation: Unmeltable Infrastructure at Scale by Loggly
 
Cloud Security Monitoring and Spark Analytics
Cloud Security Monitoring and Spark AnalyticsCloud Security Monitoring and Spark Analytics
Cloud Security Monitoring and Spark Analytics
 
Streaming data analytics (Kinesis, EMR/Spark) - Pop-up Loft Tel Aviv
Streaming data analytics (Kinesis, EMR/Spark) - Pop-up Loft Tel Aviv Streaming data analytics (Kinesis, EMR/Spark) - Pop-up Loft Tel Aviv
Streaming data analytics (Kinesis, EMR/Spark) - Pop-up Loft Tel Aviv
 
Apache Spark Core
Apache Spark CoreApache Spark Core
Apache Spark Core
 
Spark Saturday: Spark SQL & DataFrame Workshop with Apache Spark 2.3
Spark Saturday: Spark SQL & DataFrame Workshop with Apache Spark 2.3Spark Saturday: Spark SQL & DataFrame Workshop with Apache Spark 2.3
Spark Saturday: Spark SQL & DataFrame Workshop with Apache Spark 2.3
 
Apache Spark
Apache SparkApache Spark
Apache Spark
 
Typesafe spark- Zalando meetup
Typesafe spark- Zalando meetupTypesafe spark- Zalando meetup
Typesafe spark- Zalando meetup
 
Architectures, Frameworks and Infrastructure
Architectures, Frameworks and InfrastructureArchitectures, Frameworks and Infrastructure
Architectures, Frameworks and Infrastructure
 
Glint with Apache Spark
Glint with Apache SparkGlint with Apache Spark
Glint with Apache Spark
 
What no one tells you about writing a streaming app
What no one tells you about writing a streaming appWhat no one tells you about writing a streaming app
What no one tells you about writing a streaming app
 
What No One Tells You About Writing a Streaming App: Spark Summit East talk b...
What No One Tells You About Writing a Streaming App: Spark Summit East talk b...What No One Tells You About Writing a Streaming App: Spark Summit East talk b...
What No One Tells You About Writing a Streaming App: Spark Summit East talk b...
 
Global Big Data Conference Sept 2014 AWS Kinesis Spark Streaming Approximatio...
Global Big Data Conference Sept 2014 AWS Kinesis Spark Streaming Approximatio...Global Big Data Conference Sept 2014 AWS Kinesis Spark Streaming Approximatio...
Global Big Data Conference Sept 2014 AWS Kinesis Spark Streaming Approximatio...
 
Paris Data Geek - Spark Streaming
Paris Data Geek - Spark Streaming Paris Data Geek - Spark Streaming
Paris Data Geek - Spark Streaming
 
Spark and Couchbase: Augmenting the Operational Database with Spark
Spark and Couchbase: Augmenting the Operational Database with SparkSpark and Couchbase: Augmenting the Operational Database with Spark
Spark and Couchbase: Augmenting the Operational Database with Spark
 
Spark Streaming @ Scale (Clicktale)
Spark Streaming @ Scale (Clicktale)Spark Streaming @ Scale (Clicktale)
Spark Streaming @ Scale (Clicktale)
 
Spark streaming high level overview
Spark streaming high level overviewSpark streaming high level overview
Spark streaming high level overview
 

Recently uploaded

官方认证美国密歇根州立大学毕业证学位证书原版一模一样
官方认证美国密歇根州立大学毕业证学位证书原版一模一样官方认证美国密歇根州立大学毕业证学位证书原版一模一样
官方认证美国密歇根州立大学毕业证学位证书原版一模一样
171ticu
 
Generative AI leverages algorithms to create various forms of content
Generative AI leverages algorithms to create various forms of contentGenerative AI leverages algorithms to create various forms of content
Generative AI leverages algorithms to create various forms of content
Hitesh Mohapatra
 
Engine Lubrication performance System.pdf
Engine Lubrication performance System.pdfEngine Lubrication performance System.pdf
Engine Lubrication performance System.pdf
mamamaam477
 
Use PyCharm for remote debugging of WSL on a Windo cf5c162d672e4e58b4dde5d797...
Use PyCharm for remote debugging of WSL on a Windo cf5c162d672e4e58b4dde5d797...Use PyCharm for remote debugging of WSL on a Windo cf5c162d672e4e58b4dde5d797...
Use PyCharm for remote debugging of WSL on a Windo cf5c162d672e4e58b4dde5d797...
shadow0702a
 
Manufacturing Process of molasses based distillery ppt.pptx
Manufacturing Process of molasses based distillery ppt.pptxManufacturing Process of molasses based distillery ppt.pptx
Manufacturing Process of molasses based distillery ppt.pptx
Madan Karki
 
Textile Chemical Processing and Dyeing.pdf
Textile Chemical Processing and Dyeing.pdfTextile Chemical Processing and Dyeing.pdf
Textile Chemical Processing and Dyeing.pdf
NazakatAliKhoso2
 
Certificates - Mahmoud Mohamed Moursi Ahmed
Certificates - Mahmoud Mohamed Moursi AhmedCertificates - Mahmoud Mohamed Moursi Ahmed
Certificates - Mahmoud Mohamed Moursi Ahmed
Mahmoud Morsy
 
The Python for beginners. This is an advance computer language.
The Python for beginners. This is an advance computer language.The Python for beginners. This is an advance computer language.
The Python for beginners. This is an advance computer language.
sachin chaurasia
 
Optimizing Gradle Builds - Gradle DPE Tour Berlin 2024
Optimizing Gradle Builds - Gradle DPE Tour Berlin 2024Optimizing Gradle Builds - Gradle DPE Tour Berlin 2024
Optimizing Gradle Builds - Gradle DPE Tour Berlin 2024
Sinan KOZAK
 
Comparative analysis between traditional aquaponics and reconstructed aquapon...
Comparative analysis between traditional aquaponics and reconstructed aquapon...Comparative analysis between traditional aquaponics and reconstructed aquapon...
Comparative analysis between traditional aquaponics and reconstructed aquapon...
bijceesjournal
 
Material for memory and display system h
Material for memory and display system hMaterial for memory and display system h
Material for memory and display system h
gowrishankartb2005
 
Embedded machine learning-based road conditions and driving behavior monitoring
Embedded machine learning-based road conditions and driving behavior monitoringEmbedded machine learning-based road conditions and driving behavior monitoring
Embedded machine learning-based road conditions and driving behavior monitoring
IJECEIAES
 
Redefining brain tumor segmentation: a cutting-edge convolutional neural netw...
Redefining brain tumor segmentation: a cutting-edge convolutional neural netw...Redefining brain tumor segmentation: a cutting-edge convolutional neural netw...
Redefining brain tumor segmentation: a cutting-edge convolutional neural netw...
IJECEIAES
 
Hematology Analyzer Machine - Complete Blood Count
Hematology Analyzer Machine - Complete Blood CountHematology Analyzer Machine - Complete Blood Count
Hematology Analyzer Machine - Complete Blood Count
shahdabdulbaset
 
ACEP Magazine edition 4th launched on 05.06.2024
ACEP Magazine edition 4th launched on 05.06.2024ACEP Magazine edition 4th launched on 05.06.2024
ACEP Magazine edition 4th launched on 05.06.2024
Rahul
 
IEEE Aerospace and Electronic Systems Society as a Graduate Student Member
IEEE Aerospace and Electronic Systems Society as a Graduate Student MemberIEEE Aerospace and Electronic Systems Society as a Graduate Student Member
IEEE Aerospace and Electronic Systems Society as a Graduate Student Member
VICTOR MAESTRE RAMIREZ
 
Harnessing WebAssembly for Real-time Stateless Streaming Pipelines
Harnessing WebAssembly for Real-time Stateless Streaming PipelinesHarnessing WebAssembly for Real-time Stateless Streaming Pipelines
Harnessing WebAssembly for Real-time Stateless Streaming Pipelines
Christina Lin
 
DEEP LEARNING FOR SMART GRID INTRUSION DETECTION: A HYBRID CNN-LSTM-BASED MODEL
DEEP LEARNING FOR SMART GRID INTRUSION DETECTION: A HYBRID CNN-LSTM-BASED MODELDEEP LEARNING FOR SMART GRID INTRUSION DETECTION: A HYBRID CNN-LSTM-BASED MODEL
DEEP LEARNING FOR SMART GRID INTRUSION DETECTION: A HYBRID CNN-LSTM-BASED MODEL
gerogepatton
 
132/33KV substation case study Presentation
132/33KV substation case study Presentation132/33KV substation case study Presentation
132/33KV substation case study Presentation
kandramariana6
 
CompEx~Manual~1210 (2).pdf COMPEX GAS AND VAPOURS
CompEx~Manual~1210 (2).pdf COMPEX GAS AND VAPOURSCompEx~Manual~1210 (2).pdf COMPEX GAS AND VAPOURS
CompEx~Manual~1210 (2).pdf COMPEX GAS AND VAPOURS
RamonNovais6
 

Recently uploaded (20)

官方认证美国密歇根州立大学毕业证学位证书原版一模一样
官方认证美国密歇根州立大学毕业证学位证书原版一模一样官方认证美国密歇根州立大学毕业证学位证书原版一模一样
官方认证美国密歇根州立大学毕业证学位证书原版一模一样
 
Generative AI leverages algorithms to create various forms of content
Generative AI leverages algorithms to create various forms of contentGenerative AI leverages algorithms to create various forms of content
Generative AI leverages algorithms to create various forms of content
 
Engine Lubrication performance System.pdf
Engine Lubrication performance System.pdfEngine Lubrication performance System.pdf
Engine Lubrication performance System.pdf
 
Use PyCharm for remote debugging of WSL on a Windo cf5c162d672e4e58b4dde5d797...
Use PyCharm for remote debugging of WSL on a Windo cf5c162d672e4e58b4dde5d797...Use PyCharm for remote debugging of WSL on a Windo cf5c162d672e4e58b4dde5d797...
Use PyCharm for remote debugging of WSL on a Windo cf5c162d672e4e58b4dde5d797...
 
Manufacturing Process of molasses based distillery ppt.pptx
Manufacturing Process of molasses based distillery ppt.pptxManufacturing Process of molasses based distillery ppt.pptx
Manufacturing Process of molasses based distillery ppt.pptx
 
Textile Chemical Processing and Dyeing.pdf
Textile Chemical Processing and Dyeing.pdfTextile Chemical Processing and Dyeing.pdf
Textile Chemical Processing and Dyeing.pdf
 
Certificates - Mahmoud Mohamed Moursi Ahmed
Certificates - Mahmoud Mohamed Moursi AhmedCertificates - Mahmoud Mohamed Moursi Ahmed
Certificates - Mahmoud Mohamed Moursi Ahmed
 
The Python for beginners. This is an advance computer language.
The Python for beginners. This is an advance computer language.The Python for beginners. This is an advance computer language.
The Python for beginners. This is an advance computer language.
 
Optimizing Gradle Builds - Gradle DPE Tour Berlin 2024
Optimizing Gradle Builds - Gradle DPE Tour Berlin 2024Optimizing Gradle Builds - Gradle DPE Tour Berlin 2024
Optimizing Gradle Builds - Gradle DPE Tour Berlin 2024
 
Comparative analysis between traditional aquaponics and reconstructed aquapon...
Comparative analysis between traditional aquaponics and reconstructed aquapon...Comparative analysis between traditional aquaponics and reconstructed aquapon...
Comparative analysis between traditional aquaponics and reconstructed aquapon...
 
Material for memory and display system h
Material for memory and display system hMaterial for memory and display system h
Material for memory and display system h
 
Embedded machine learning-based road conditions and driving behavior monitoring
Embedded machine learning-based road conditions and driving behavior monitoringEmbedded machine learning-based road conditions and driving behavior monitoring
Embedded machine learning-based road conditions and driving behavior monitoring
 
Redefining brain tumor segmentation: a cutting-edge convolutional neural netw...
Redefining brain tumor segmentation: a cutting-edge convolutional neural netw...Redefining brain tumor segmentation: a cutting-edge convolutional neural netw...
Redefining brain tumor segmentation: a cutting-edge convolutional neural netw...
 
Hematology Analyzer Machine - Complete Blood Count
Hematology Analyzer Machine - Complete Blood CountHematology Analyzer Machine - Complete Blood Count
Hematology Analyzer Machine - Complete Blood Count
 
ACEP Magazine edition 4th launched on 05.06.2024
ACEP Magazine edition 4th launched on 05.06.2024ACEP Magazine edition 4th launched on 05.06.2024
ACEP Magazine edition 4th launched on 05.06.2024
 
IEEE Aerospace and Electronic Systems Society as a Graduate Student Member
IEEE Aerospace and Electronic Systems Society as a Graduate Student MemberIEEE Aerospace and Electronic Systems Society as a Graduate Student Member
IEEE Aerospace and Electronic Systems Society as a Graduate Student Member
 
Harnessing WebAssembly for Real-time Stateless Streaming Pipelines
Harnessing WebAssembly for Real-time Stateless Streaming PipelinesHarnessing WebAssembly for Real-time Stateless Streaming Pipelines
Harnessing WebAssembly for Real-time Stateless Streaming Pipelines
 
DEEP LEARNING FOR SMART GRID INTRUSION DETECTION: A HYBRID CNN-LSTM-BASED MODEL
DEEP LEARNING FOR SMART GRID INTRUSION DETECTION: A HYBRID CNN-LSTM-BASED MODELDEEP LEARNING FOR SMART GRID INTRUSION DETECTION: A HYBRID CNN-LSTM-BASED MODEL
DEEP LEARNING FOR SMART GRID INTRUSION DETECTION: A HYBRID CNN-LSTM-BASED MODEL
 
132/33KV substation case study Presentation
132/33KV substation case study Presentation132/33KV substation case study Presentation
132/33KV substation case study Presentation
 
CompEx~Manual~1210 (2).pdf COMPEX GAS AND VAPOURS
CompEx~Manual~1210 (2).pdf COMPEX GAS AND VAPOURSCompEx~Manual~1210 (2).pdf COMPEX GAS AND VAPOURS
CompEx~Manual~1210 (2).pdf COMPEX GAS AND VAPOURS
 

Spark cep

  • 1. 이름 김병진, 권오찬 Extending Spark Streaming to Support Complex Event Processing 소속 삼성전자 2015. 10. 27.
  • 2. Agenda • Background • Motivation • Streaming SQL • Auto Scaling • Future Work
  • 3. Background - Spark • Apache Spark is a high performance data processing platform • Use a distributed memory cache (RDD*) • Process batch and stream data in the common platform • Be developed by 700+ contributors 3x faster than Storm (Stream, 30 Nodes) 100x faster than Hadoop (Batch, 50 Nodes, Clustering Alg.) *RDD: Resilient Distributed Datasets
  • 4. Background – Spark RDD • Resilient Distributed Datasets • RDDs are immutable • Transformations are lazy operations which build a RDD’s lineage graph • E.g.: map, filter, join, union • Actions launch a computation to return a value or write data • E.g.: count, collect, reduce, save • The scheduler builds a DAG of stages to execute • If a task fails, spark re-runs it on another node as long as its stage’s parent are still available
  • 5. Background - Spark Modules • Spark Streaming • Divide stream data into micro-batches • Compute the batches with fault tolerance • Spark SQL • Support SQL to handle structured data • Provide a common way to access a variety of data sources (Hive, Avro, Parquet, ORC, JSON, and JDBC) Spark Spark Streaming batches of X seconds live data stream processed results DStreams RDDs map, reduce, count, …
  • 6. Motivation – Support CEP in Spark • Current spark is not enough to support CEP* • Do not support continuous query language to process stream data • Do not support auto-scaling to elastically allocate resources • We have solved the issues: • Extend Intel’s Streaming SQL package • Improve performance by optimizing time-based windowed aggregation • Support query chains by implementing “Insert Into” queries • Implement elastic-seamless resource allocation • Can do auto-scale in/out *Complex Event Processing
  • 8. Streaming SQL - Intel • Streaming SQL is a third party library of Spark Packages • Build on top of Spark Streaming and Spark SQL Catalyst • Manipulate stream data like static structured data in database • Process queries continuously • Support time based windowing join/aggregation queries http://spark-packages.org/package/Intel-bigdata/spark-streamingsql Streaming SQL
  • 9. Streaming SQL Query Schema DStream Optimized Logical Plan Streaming SQL - Plans • Logical Plan • Modify streaming SQL optimizer • Add windowed aggregate logical plan • Physical Plan • Add windowed aggregate physical plan to call new WindowedStateDStream class • Implement new expression functions (Count, Sum, Average, Min, Max, Distinct) • Develop FixedSizedAggregator for efficiency which is the concept of IBM InfoSphere Streams Samsung Stream Plan Intel Logical Plan Physical Plan
  • 10. Streaming SQL – Windowed State • WindowedStateDStream class • Modify StateDStream class to support windowed computing • Add inverse update function to evaluate old values • Add filter function to remove obsolete keys Intel Samsung Compute all elements in the window time 1 time 2 time 3 time 4 time 5 Original DStream Windowed State DStream time 1 time 2 time 3 time 4 time 5 Original DStream Windowed DStream Compute only Delta (old and new elements) window-based operation Update (in) InvUpdate (out) State: Array[AggregateFunction] Count Update InvUpdate Update InvUpdate State Sum Update InvUpdate State Avg Update InvUpdate State Min Update InvUpdate State Max Update InvUpdate State
  • 11. Streaming SQL – Fixed Sized Aggregator • Fixed-Sized Aggregator* of IBM InfoSphere Streams • Use fixed sized binary tree • Maintain leaf nodes as a circular buffer using front and back pointers • Very efficient for non-invertable expressions (e.g. Min, Max) • Example - Min Aggregator 4 7 3 4 3 3 4 7 3 2 4 2 2 7 3 2 7 2 2 9 7 3 2 7 2 2 4 4 4 4 7 4 4 *K. Tangwongsan, M. Hirzel, S. Schneider, K-L. Wu, “General incremental sliding-window aggregation,” In VLDB, 2015.
  • 12. Streaming SQL – Aggregate Functions • Windowed Aggregate Functions • Add invUpdate function • Reduce objects for efficient serialization • Implement Fixed-Sized Aggregator for Min, Max functions Aggregate Functions State Update InvUpdate Count countValue: Long Increase countValue Decrease countValue Sum sumValue: Any Add to sumValue Subtract to sumValue Average sumValue: Any countValue: Long Increase countValue Add to sumValue Decrease countValue Subtract to sumValue Min fat: FixedSizedAggregator Insert minimum to fat Remove the oldest from fat Max fat: FixedSizedAggregator Insert maximum to fat Remove the oldest from fat CountDistinct distinctMap: mutable.HashMap[Row, Long] Increase count of distinct value in map Decrease count of distinct value in map
  • 13. Streaming SQL – Insert Into • Support “Insert Into” query • Implement Data Sources API • Implement the insert function in Kafka relation • Modified some physical plans to support streaming • Modified physical planning strategies to assign a specific plan • Convert current RDD to a DataFrame and then insert it • Support query chaining  The result a query can be reused by multiple queries Example2 Kafka Kafka INSERT INTO TABLE Example1 SELECT duid, time FROM ParsedTable INSERT INTO TABLE Example2 SELECT duid, COUNT (*) FROM Example1 GROUP BY duidParsed Table Example1 Kafka Kafka Relation InsertIntoStreamSource StreamExecutedCommandInsertIntoTable Kafka Logical Plan Physical Plan Kafka Relation DataFrame
  • 14. Streaming SQL – Evaluation • Experimental Environment • Processing Node: Intel i5 2.67GHz, 4 Cores, 4GB per node Spark Cluster: 7 Nodes Kafka Cluster: 3 Nodes • Test Query: SELECT t.word, COUNT(t.word), SUM(t.num), AVG(t.num), MIN(t.num), MAX(t.num) FROM (SELECT * FROM t_kafka) OVER (WINDOW 'x' SECONDS, SLIDE ‘1' SECONDS) AS t GROUP BY t.word • Default Set: 100 EPS, 100 Keys, 20 sec Window, 1 sec Slide, 10 Executors, 10 Reducers Spark Cluster Test Agent Kafka Cluster Spark UI
  • 15. Streaming SQL – Evaluation • Test Result • Show low processing delays despite of heavy event loads or large-sized windows • Need memory optimization 0 200 400 600 800 1000 1200 1400 20 40 60 80 100 Delay(ms) Window Size (sec) Intel Samsung 0 1000 2000 3000 4000 5000 6000 7000 8000 9000 10000 20 40 60 80 100 Memory(KB) Window Size (sec) Intel Samsung 0 1000 2000 3000 4000 5000 10 100 1000 10000 Delay(ms) EPS Intel Samsung 0 2000 4000 6000 8000 10000 12000 10 100 1000 10000 Memory(KB) EPS Intel Samsung
  • 17. Time-varying Event Rate in Real World • Spark Streaming • Data can be ingested from many sources like Kafka, Flume, Twitter, etc. • Live input data streams are divided into batches, and they are processed • In streaming application, event rate may change frequently over time • How can this be dealt with? < Event Rate > < Batch Processing Time> High event rate Can’t be processed in real-time
  • 18. Spark AS-IS • Spark currently supports dynamic resource allocation on Yarn (SPARK-3174) and coarse-grained Mesos (SPARK-6287) • Existing Dynamic Resource Allocation is not optimized for streaming • Even though event rate becomes smaller, executors may not be removed due to scheduling policy • It is difficult to determine appropriate configuration parameters • Backpressure (SPARK-7398) • Enable the Spark streaming to control the receiving rate dynamically for handling bursty input streams • Not real-time Driver Task queue Task Executor #1 ... Executor #2 Executor #n • Upper/lower bound for the number of executors • spark.dynamicAllocation.minExecutors • spark.dynamicAllocation.maxExecutors • Scale-out condition • schedulerBacklogTimeout < staying time of a task in task queue • Scale-in condition • executorIdleTimeout < staying time of an executor in idle state
  • 19. • Goal • Allocate resources to streaming applications dynamically as the rate of incoming events varies over time • Enable the applications to meet real-time deadlines • Utilize the resource efficiently • Cloud Architecture Elastic-seamless Resource Allocation Flint* *Flint: our spark job manager
  • 20. • Spark currently supports three cluster managers • Standalone, Apache Mesos, Hadoop Yarn • Spark on Mesos • Fine-grained mode • Each Spark task runs as a separate Mesos task. • Launching overhead is big, so it is not suitable for streaming. • Coarse-grained mode • Launch only one long-running Spark task on each Mesos machine • Cannot scale-out more than the number of Mesos machines • Cannot control executor resource • Only available for total resource • Both of two modes have some problems to achieve our goal Spark Deployment Architecture
  • 21. Flint Architecture Overview Slave 1 Slave 2 Slave n . . . MESOS Cluster Cluster 1 YARN Cluster n YARN . . . . . . NM 1 NM 2 . . . NM n NM 1 NM 2 NM n . . . App 1 SPARK App n SPARK EXEC 1 EXEC 2 EXEC n. . . EXEC 1 EXEC 2 EXEC n. . . FLINT REST Manager Scheduler Application Manager Application Handler Application Handler Application Handler Deploy Manager Log Manager • Marathon, Zookeeper, ETCD, HDFS Spark on on-demand Yarn
  • 22. • Job submission process 1. Request to deploy dockerized YARN via Marathon 2. Launch ResourceManager, NodeManagers for Spark driver/executors 3. Check ResourceManager status 4. Submit Spark Job and check Job status 5. Watch Spark driver status and get Spark endpoint Flint Architecture: Job Submission Slave 1 Slave n . . . MESOS Cluster Slave 1 Slave n . . . MESOS Cluster Cluster YARN NM 1 NM 2 . . . NM n Slave 1 Slave n . . . MESOS Cluster Cluster YARN NM 1 NM 2 . . . NM n App SPARK EXEC 1 EXEC 2 EXEC n. . . Create Yarn Cluster Submit Spark Job
  • 23. • Scale-out Process 1. Request to increase the instance of NodeManager via Marathon 2. Launch new NodeManager 3. Scale-out Spark executor Flint Architecture: Scale-out Slave 1 Slave n . . . MESOS Cluster Cluster YARN NM 1 . . . NM n App SPARK EXEC 1 EXEC n. . . Slave 1 Slave n . . . MESOS Cluster Cluster YARN NM 1 . . . NM n App SPARK EXEC 1 EXEC n. . . NM 2 Slave 1 Slave n . . . MESOS Cluster Cluster YARN NM 1 . . . NM n NM 2 App SPARK EXEC 1 EXEC n. . . EXEC 2 Request new NodeManager Scale-out Spark executor
  • 24. • Scale-in Process 1. Get Executor’s info and select spark victims 2. Inactivate spark victims and kill them after n x batch interval 3. Get Yarn victims and decommission NodeManager via ResourceManager 4. Get mesos task id of Yarn victims and kill mesos tasks of victims Flint Architecture: Scale-in Slave 1 Slave n . . . MESOS Cluster Cluster YARN NM 1 . . . NM n NM 2 App SPARK EXEC 1 EXEC n. . . EXEC 2 Slave 1 Slave n . . . MESOS Cluster Cluster YARN NM 1 . . . NM n NM 2 App SPARK EXEC 1 EXEC n. . . Slave 1 Slave n . . . MESOS Cluster Cluster YARN NM 1 . . . NM n NM 2 App SPARK EXEC 1 EXEC n. . . Scale-in Spark executor Remove NodeManager
  • 25. • Auto-scaling Mechanism • Real-time constraint • Batch processing time < Batch interval • Scale-out condition • α x batch interval < batch processing delay • Scale-in condition • β x batch interval > batch processing delay Flint Architecture: Auto-scaling ( 0 < β < α ≤ 1 ) Processing Time Batch Interval α x Batch Intervalβ x Batch Interval Scale-in Scale-out 0 Time
  • 26. • Timeout & Retry for Killed Executor • Although Spark executor is killed by admin, Spark re-tries to connect it. Spark Issues: Scale-in (1/3) Scale-in Scheduling Delay Event Rate Processing Delay Driver Exec* scale-in Exec Task RDD req 1st try 2nd try 3rd try fail Processing Delay • spark.shuffle.io.maxRetries = 3 • spark.shuffle.io.retryWait = 5s
  • 27. • Timeout & Retry for Killed Executor • Advertisement for killed executor • Inactivate executor before scale-in • Inactivate executor and kill them after n x batch interval • During inactivating stage, the executor does not receive any task from driver. Spark Issues: Scale-in (2/3) Driver Exec* scale-in Exec Task RDD req fail advertise Fast Fail
  • 28. • Data Locality • Spark scheduler consider data locality • Processing/Node/Rack locality • spark.locality.wait = 3s • However, waiting time for locality is big burden to streaming applications • The waiting time should be much less than batch interval for streaming • Otherwise, streaming may not be processed in real-time • Thus, Flint overrides “spark.locality.wait” to very small value if application type is streaming Spark Issues: Scale-in (3/3) Scheduler Cluster TTT T … 3s10ms (max) rsc alloc T
  • 29. • Data Localization • During scale-out process, new Yarn container performs to localize some data (e.g. spark jar, application jar) • Data localization incurs high disk I/O, so • Solution • Prefetch if possible • Disk isolation, SSD Spark Issues: Scale-out (1/2) Existing App New App Disk Access disk during localization (50-70 MB/sec) Performance Degradation
  • 30. • RDD Replication • When a new executor is added, a receiver requests to replicate received RDD blocks to the executor • New executor does not ready to receive RDD blocks during an initialization • At this situation, the receiver waits until new executor is ready • Solution • A receiver does not replicate RDD blocks to new executor during initialization • E.g., spark.blockmanager.peer.bootstrap = 10s Spark Issues: Scale-out (2/2) Kafka R RT R R Executor R R T R Executor 1 R T R Executor 2 RT T Receiver Task Task R RDDReplicate factor = 2
  • 31. • Demo Environment • Data source: Kafka • Auto-scaling parameter • α=0.4, β=0.9 • SQL query • SELECT t.word, COUNT(DISTINCT t.num), SUM(t.num), AVG(t.num), MIN(t.num), MAX(t.num) FROM (SELECT * FROM t_kafka) OVER (WINDOW 300 SECONDS, SLIDE 3 SECONDS) AS t GROUP BY t.word Demo (1/2) Task Task Producer Rcvr 8 3 4 2 5 7 9 1 0 3 7 8 Task Driver 0: 23 1: 34 2: 41 … Flint Monitoring Scale-in/Out
  • 33. Future Work • Streaming SQL • Apply Tungsten Framework • Small-sized object, code gen, GC free memory management • Share RDDs • Map stream tables to RDD • Auto Scaling • Support to dynamically allocate resources for batch (non-streaming) applications • Support unified schedulers for heterogeneous applications • Batch, streaming, ad-hoc