Fully Fault-tolerant Streaming Workflows at Scale using Apache Mesos & Spark Streaming
1. Fully Fault-tolerant Streaming Workflows at Scale using Apache Mesos & Spark Streaming
Akhil Das
akhil@sigmoidanalytics.com
2. About Me and Sigmoid
● GitHub: github.com/akhld
● Twitter: @AkhlD
● Email: akhil@sigmoidanalytics.com
3. Overview
● Apache Spark
● Spark Streaming
● High Availability Mesos Cluster
● Running Spark Streaming over a High Availability Mesos Cluster
● Simple Fault-tolerant Streaming Pipeline
● Scaling the pipeline
4. Apache Spark
[Diagram: the Spark stack]
Resilient Distributed Datasets (RDDs): a big collection of data which is:
- Immutable
- Distributed
- Lazily evaluated
- Type Inferred
- Cacheable
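These properties can be seen in a toy example (a sketch; it assumes an existing SparkContext `sc`, e.g. inside spark-shell):

```scala
// Sketch assuming an existing SparkContext `sc` (e.g. from spark-shell).
val rdd = sc.parallelize(1 to 1000000)   // distributed, immutable collection
val evens = rdd.filter(_ % 2 == 0)       // lazily evaluated: nothing runs yet
// Type inferred: `evens` is an RDD[Int] without any annotation.
evens.cache()                            // cacheable: keep in memory once computed
println(evens.count())                   // an action finally triggers the computation
```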
5. Why Spark Streaming?
Many big-data applications need to process large data streams in near-real time:
Monitoring Systems
Alert Systems
Computing Systems
7. What is Spark Streaming?
Framework for large scale stream processing
➔ Created at UC Berkeley by Tathagata Das (TD)
➔ Scales to 100s of nodes
➔ Can achieve second scale latencies
➔ Provides a simple batch-like API for implementing complex algorithms
➔ Can absorb live data streams from Kafka, Flume, ZeroMQ, Kinesis, etc.
8. Spark Streaming
Run a streaming computation as a series of very small, deterministic batch jobs:
- Chop up the live stream into batches of X seconds
- Spark treats each batch of data as RDDs and processes them using RDD operations
- Finally, the processed results of the RDD operations are returned in batches
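The batch-like API above can be sketched with a word count over 1-second micro-batches (a hedged sketch: the socket source, host, and port are placeholders, and an existing SparkContext `sc` is assumed):

```scala
import org.apache.spark.streaming.{Seconds, StreamingContext}

// Sketch: 1-second micro-batches over a hypothetical socket source.
// Each batch becomes an RDD and is processed with RDD-style operations.
val ssc = new StreamingContext(sc, Seconds(1))
val lines = ssc.socketTextStream("localhost", 9999) // placeholder host/port
val counts = lines.flatMap(_.split(" "))
                  .map(word => (word, 1))
                  .reduceByKey(_ + _)
counts.print()          // results are emitted batch by batch
ssc.start()
ssc.awaitTermination()
```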
10. Mesos High Availability Cluster
[Diagram: a quorum of Mesos masters (one Leader plus Standby masters). Slave 1 through Slave N make resource Offers through the leader to each framework's scheduler; the frameworks (a SparkStreamingJob with its driver program and scheduler, and a HadoopJob) launch executors and tasks on the slaves.]
12. Spark Streaming over a HA Mesos Cluster
● To use Mesos from Spark, you need a Spark binary package available at a
location accessible to Mesos (HTTP/S3/HDFS), and a Spark driver program
configured to connect to Mesos.
● Configuring the driver program to connect to Mesos:
val sconf = new SparkConf()
.setMaster("mesos://zk://10.121.93.241:2181,10.181.2.12:2181,10.107.48.112:2181/mesos")
.setAppName("MyStreamingApp")
.set("spark.executor.uri","hdfs://Sigmoid/executors/spark-1.3.0-bin-hadoop2.4.tgz")
.set("spark.mesos.coarse", "true")
.set("spark.cores.max", "30")
.set("spark.executor.memory", "10g")
val sc = new SparkContext(sconf)
val ssc = new StreamingContext(sc, Seconds(1))
...
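From here, the `ssc` configured above can be wired to a source and started. A hedged sketch using the receiver-based Kafka API (`KafkaUtils.createStream` from the spark-streaming-kafka package); the ZooKeeper address, consumer group, and topic name are placeholders:

```scala
import org.apache.spark.streaming.kafka.KafkaUtils

// Sketch: attach a Kafka receiver to the StreamingContext configured above.
// ZooKeeper quorum, consumer group, and topic are placeholders.
val kafkaStream = KafkaUtils.createStream(
  ssc,
  "zk-host:2181",            // ZooKeeper quorum (placeholder)
  "my-consumer-group",       // consumer group id (placeholder)
  Map("events" -> 4))        // topic -> number of receiver threads (placeholder)

kafkaStream.map(_._2)        // drop the Kafka key, keep the message value
           .count()
           .print()

ssc.start()
ssc.awaitTermination()
```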
13. Spark Streaming Fault-tolerance
Real-time stream processing systems must be operational 24/7, which requires
them to recover from all kinds of failures in the system.
● Spark and its RDD abstraction are designed to seamlessly handle failures of any
worker node in the cluster.
● In Streaming, driver failures can be recovered from by checkpointing the application state.
● Write Ahead Logs (WAL) & acknowledgements can ensure zero data loss.
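A sketch of how driver recovery and the WAL can be wired up, reusing the `sconf` from the earlier slide (the checkpoint path and the body of the setup function are illustrative):

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

// Sketch: recoverable driver via checkpointing (paths are placeholders).
val checkpointDir = "hdfs://Sigmoid/checkpoints/myStreamingApp"

def createContext(): StreamingContext = {
  val ssc = new StreamingContext(sconf, Seconds(1))
  ssc.checkpoint(checkpointDir)   // periodically persist the DAG and state
  // ... define the streams and the computation here ...
  ssc
}

// On (re)start: recover from the checkpoint if one exists, else build fresh.
val ssc = StreamingContext.getOrCreate(checkpointDir, createContext _)

// WAL: persist received data to fault-tolerant storage before processing,
// so a receiver or driver crash loses no acknowledged data.
// sconf.set("spark.streaming.receiver.writeAheadLog.enable", "true")

ssc.start()
ssc.awaitTermination()
```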
14. Simple Fault-tolerant Streaming Infra
[Diagram: a Kafka cluster feeding a SparkStreamingJob (executors and tasks) running on the nodes of a High Availability Mesos cluster, with processed results written to storage (HDFS/DB).]
15. Scaling the pipeline
[Diagram: the same pipeline (Kafka cluster, Spark Streaming on a High Availability Mesos cluster, storage in HDFS/DB) scaled out across many nodes.]
Understanding the bottlenecks:
- Network: 1 Gbps
- # Cores/Slave: 4
- Disk IO: 100 MB/s on SSD
Goal:
- Receive & process data at 1M events/second
Choosing the correct # of resources:
- Since a single slave can handle up to 100 MB/s of network and disk IO, a
minimum of 6 slaves could take me to ~600 MB/s
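The sizing above can be written as a back-of-the-envelope calculation (a sketch using only the numbers from the slide; the per-slave cap is the slower of network and disk):

```scala
// Back-of-the-envelope cluster sizing from the slide's numbers.
object ClusterSizing {
  val networkMBps = 125.0  // 1 Gbps is roughly 125 MB/s
  val diskMBps    = 100.0  // SSD throughput from the slide

  // A slave can only sustain the slower of its network and disk paths.
  val perSlaveMBps: Double = math.min(networkMBps, diskMBps) // 100 MB/s

  def slavesFor(targetMBps: Double): Int =
    math.ceil(targetMBps / perSlaveMBps).toInt

  def main(args: Array[String]): Unit = {
    // ~600 MB/s aggregate needs 6 slaves, matching the slide.
    println(slavesFor(600.0))
  }
}
```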