Four Things to Know about Reliable Spark Streaming
Dean Wampler, Typesafe
Tathagata Das, Databricks
Spark Streaming makes it easy to build scalable fault-tolerant streaming applications. In this webinar, developers will learn:

• How Spark Streaming works - a quick review.
• Features in Spark Streaming that help prevent potential data loss.
• Complementary tools in a streaming pipeline - Kafka and Akka.
• Design and tuning tips for Reactive Spark Streaming applications.

1. Four Things to Know about Reliable Spark Streaming
   Dean Wampler, Typesafe
   Tathagata Das, Databricks

2. Agenda for today
   • The Stream Processing Landscape
   • How Spark Streaming Works - A Quick Overview
   • Features in Spark Streaming that Help Prevent Data Loss
   • Design Tips for Successful Streaming Applications
3. The Stream Processing Landscape

4. Stream Processors

5. Stream Storage

6. Stream Sources (e.g., MQTT)

7. How Spark Streaming Works: A Quick Overview
8. Spark Streaming
   Scalable, fault-tolerant stream processing system.
   Inputs: Kafka, Flume, Kinesis, HDFS/S3, Twitter, … Outputs: file systems, databases, dashboards.
   • High-level API: joins, windows, …, often 5x less code
   • Fault-tolerant: exactly-once semantics, even for stateful ops
   • Integration: integrates with MLlib, SQL, DataFrames, GraphX

9. Spark Streaming
   Receivers receive data streams and chop them up into batches.
   Spark processes the batches and pushes out the results.
   (Diagram: data streams → receivers → batches → results)
10. Word Count with Kafka

    val context = new StreamingContext(conf, Seconds(1))  // entry point of streaming functionality
    val lines = KafkaUtils.createStream(context, ...)     // create DStream from Kafka data

11. Word Count with Kafka

    val context = new StreamingContext(conf, Seconds(1))
    val lines = KafkaUtils.createStream(context, ...)
    val words = lines.flatMap(_.split(" "))                // split lines into words

12. Word Count with Kafka

    val context = new StreamingContext(conf, Seconds(1))
    val lines = KafkaUtils.createStream(context, ...)
    val words = lines.flatMap(_.split(" "))
    val wordCounts = words.map(x => (x, 1))                // count the words
                          .reduceByKey(_ + _)
    wordCounts.print()                                     // print some counts on screen
    context.start()                                        // start receiving and transforming the data

13. Word Count with Kafka

    object WordCount {
      def main(args: Array[String]) {
        val context = new StreamingContext(new SparkConf(), Seconds(1))
        val lines = KafkaUtils.createStream(context, ...)
        val words = lines.flatMap(_.split(" "))
        val wordCounts = words.map(x => (x, 1)).reduceByKey(_ + _)
        wordCounts.print()
        context.start()
        context.awaitTermination()
      }
    }
14. Features in Spark Streaming that Help Prevent Data Loss

15. A Deeper View of Spark Streaming

16. Any Spark Application
    User code runs in the Spark Driver process.
    The application runs on a YARN / Mesos / Spark Standalone cluster.

17. Any Spark Application
    The driver launches executors in the cluster.

18. Any Spark Application
    Tasks are sent to the executors for processing data.
19. Spark Streaming Application: Receive data
    The driver runs receivers as long-running tasks.
    (The driver executes the WordCount program from slide 13; each receiver consumes a data stream on an executor.)

20. Spark Streaming Application: Receive data
    The receiver divides the stream into blocks and keeps them in memory (Data Blocks).

21. Spark Streaming Application: Receive data
    Blocks are also replicated to another executor.

22. Spark Streaming Application: Process data
    Every batch interval, the driver launches tasks to process the blocks.

23. Spark Streaming Application: Process data
    Every batch interval, the driver launches tasks to process the blocks and push the results to a data store.
24. Fault Tolerance and Reliability

25. Failures? Why care?
    Many streaming applications need zero-data-loss guarantees despite any kind of failure in the system.
    • At-least-once guarantee: every record processed at least once.
    • Exactly-once guarantee: every record processed exactly once.
    There are different kinds of failures: executor and driver.
    Some failures and guarantee requirements need additional configuration and setup.
26. What if an executor fails?
    If an executor fails, its receiver is lost and all of its blocks are lost.

27. What if an executor fails?
    Tasks and receivers are restarted by Spark automatically; no configuration needed.
    The receiver is restarted on another executor, and tasks are restarted on the block replicas.

28. What if the driver fails?
    When the driver fails, all the executors fail.
    All computation and all received blocks are lost. How do we recover?
29. Recovering the Driver with DStream Checkpointing
    DStream checkpointing: periodically save the DAG of DStreams to fault-tolerant storage (checkpoint info goes to HDFS / S3).

30. Recovering the Driver with DStream Checkpointing
    A failed driver can be restarted from the checkpoint information.

31. Recovering the Driver with DStream Checkpointing
    New executors are launched and receivers are restarted.
32. Recovering the Driver with DStream Checkpointing
    1. Configure automatic driver restart (all cluster managers support this).
    2. Set a checkpoint directory in an HDFS-compatible file system:
       streamingContext.checkpoint(hdfsDirectory)
    3. Slightly restructure the code to use checkpoints for recovery.
33. Configuring Automatic Driver Restart
    • Spark Standalone: use spark-submit in "cluster" mode with "--supervise".
      See http://spark.apache.org/docs/latest/spark-standalone.html
    • YARN: use spark-submit in "cluster" mode.
      See the YARN config "yarn.resourcemanager.am.max-attempts".
    • Mesos: Marathon can restart applications, or use the "--supervise" flag.
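    On Spark Standalone, for example, the submission could look roughly like the sketch below; the master URL, class name, and jar name are placeholders, not values from the deck:

    spark-submit \
      --master spark://master-host:7077 \
      --deploy-mode cluster \
      --supervise \
      --class com.example.WordCount \
      wordcount-assembly.jar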
34. Restructuring Code for Checkpointing
    Before (create + setup, then start):

    val context = new StreamingContext(...)
    val lines = KafkaUtils.createStream(...)
    val words = lines.flatMap(...)
    ...
    context.start()

35. Restructuring Code for Checkpointing
    Put all setup code into a function that returns a new StreamingContext:

    def creatingFunc(): StreamingContext = {
      val context = new StreamingContext(...)
      val lines = KafkaUtils.createStream(...)
      val words = lines.flatMap(...)
      ...
      context.checkpoint(hdfsDir)
      context   // return the newly created context
    }

36. Restructuring Code for Checkpointing
    Get the context from the checkpoint directory on HDFS, OR create a new one with the function:

    val context = StreamingContext.getOrCreate(hdfsDir, creatingFunc)
    context.start()

37. Restructuring Code for Checkpointing
    StreamingContext.getOrCreate():
    • If the HDFS directory has checkpoint info, recover the context from that info.
    • Else, call creatingFunc() to create and set up a new context.
    The restarted process can figure out whether to recover from checkpoint info or not.
38. Received blocks are lost on restart!
    In-memory blocks of buffered data are lost on driver restart; the new executors have no blocks. How do we recover them?

39. Recovering Data with Write Ahead Logs
    Write Ahead Log (WAL): synchronously save received data to fault-tolerant storage.
    Received blocks are saved to HDFS as they arrive.

40. Recovering Data with Write Ahead Logs
    After a driver restart, blocks are recovered from the Write Ahead Log.
41. Recovering Data with Write Ahead Logs
    1. Enable checkpointing; the logs are written in the checkpoint directory.
    2. Enable the WAL in the SparkConf configuration:
       sparkConf.set("spark.streaming.receiver.writeAheadLog.enable", "true")
    3. The receiver should also be reliable:
       acknowledge the source only after the data is saved to the WAL;
       unacknowledged data will be replayed from the source by the restarted receiver.
    4. Disable in-memory replication (the data is already replicated by HDFS):
       use StorageLevel.MEMORY_AND_DISK_SER for input DStreams.
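    Putting those steps together, a receiver-based setup with the WAL enabled might look like the following sketch. The checkpoint path, ZooKeeper quorum, consumer group, and topic map are illustrative placeholders, not values from the deck:

    import org.apache.spark.SparkConf
    import org.apache.spark.storage.StorageLevel
    import org.apache.spark.streaming.{Seconds, StreamingContext}
    import org.apache.spark.streaming.kafka.KafkaUtils

    val sparkConf = new SparkConf().setAppName("ReliableWordCount")
    // Step 2: enable the write ahead log for receivers
    sparkConf.set("spark.streaming.receiver.writeAheadLog.enable", "true")

    val context = new StreamingContext(sparkConf, Seconds(1))
    // Step 1: checkpointing must be enabled; WAL files live under this directory
    context.checkpoint("hdfs://namenode:8020/checkpoints/wordcount")

    // Steps 3-4: Kafka's receiver acknowledges only after the data is stored;
    // skip in-memory replication because HDFS already replicates the WAL
    val lines = KafkaUtils.createStream(
      context,
      "zk-host:2181",            // ZooKeeper quorum (placeholder)
      "wordcount-group",         // consumer group (placeholder)
      Map("events" -> 1),        // topic -> number of receiver threads (placeholder)
      StorageLevel.MEMORY_AND_DISK_SER)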
42. RDD Checkpointing
    • Stateful stream processing can lead to long RDD lineages.
    • Long lineage = bad for fault tolerance: too much recomputation.
    • RDD checkpointing saves RDD data to fault-tolerant storage to limit lineage and recomputation.
    • More: http://spark.apache.org/docs/latest/streaming-programming-guide.html#checkpointing
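    To see where this matters, a stateful word count keeps a running total across batches, so its lineage grows with every batch; checkpointing truncates it. This sketch is not from the deck, and the 10-second checkpoint interval is an arbitrary choice:

    // Running count per word across all batches (stateful)
    val updateCounts = (newValues: Seq[Int], runningCount: Option[Int]) =>
      Some(newValues.sum + runningCount.getOrElse(0))

    val totals = wordCounts.updateStateByKey(updateCounts)
    // Periodically materialize the state RDDs to the checkpoint directory,
    // cutting the lineage (requires context.checkpoint(...) to be set)
    totals.checkpoint(Seconds(10))
    totals.print()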
43. Fault-tolerance Semantics
    Zero data loss = every stage processes each event at least once despite any failure.
    (Stages: Sources → Receiving → Transforming → Outputting → Sinks)

44. Fault-tolerance Semantics
    • Transforming: exactly once, as long as the received data is not lost.
    End-to-end semantics: at-least once.

45. Fault-tolerance Semantics
    • Transforming: exactly once, as long as the received data is not lost.
    • Outputting: exactly once, if the outputs are idempotent or transactional.
    End-to-end semantics: at-least once.

46. Fault-tolerance Semantics
    • Receiving: at least once, with checkpointing + WAL + reliable receivers.
    • Transforming: exactly once, as long as the received data is not lost.
    • Outputting: exactly once, if the outputs are idempotent or transactional.
    End-to-end semantics: at-least once.
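    To make the outputting stage effectively exactly-once, writes have to be idempotent (or transactional) so that a replayed batch overwrites rather than duplicates. The sketch below is not from the deck; createConnection and upsert are hypothetical helpers for a store keyed on the word, so reprocessing a batch rewrites the same rows:

    wordCounts.foreachRDD { rdd =>
      rdd.foreachPartition { partition =>
        // One connection per partition, reused for the whole iterator
        val store = createConnection()            // hypothetical helper
        partition.foreach { case (word, count) =>
          // Upsert keyed on `word`: replaying the batch produces the same result
          store.upsert(word, count)               // hypothetical idempotent write
        }
        store.close()
      }
    }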
47. Fault-tolerance Semantics
    Exactly-once receiving with the new Kafka Direct approach:
    • Treats Kafka like a replicated log and reads it like a file.
    • Does not use receivers.
    • No need to create multiple DStreams and union them.
    • No need to enable Write Ahead Logs.

    val directKafkaStream = KafkaUtils.createDirectStream(...)

    https://databricks.com/blog/2015/03/30/improvements-to-kafka-integration-of-spark-streaming.html
    http://spark.apache.org/docs/latest/streaming-kafka-integration.html
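    Filled out for the Kafka 0.8 integration of that era (spark-streaming-kafka), a direct stream might look like this sketch; the broker list and topic name are placeholders:

    import kafka.serializer.StringDecoder
    import org.apache.spark.streaming.kafka.KafkaUtils

    val kafkaParams = Map("metadata.broker.list" -> "broker1:9092,broker2:9092")
    val topics = Set("events")

    // Keys and values are Strings, decoded with Kafka's StringDecoder;
    // offsets are tracked by Spark Streaming itself, not by a receiver
    val directKafkaStream =
      KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
        context, kafkaParams, topics)

    val lines = directKafkaStream.map { case (_, value) => value }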
48. Fault-tolerance Semantics
    With exactly-once receiving from the Kafka Direct approach:
    • Transforming: exactly once, as long as the received data is not lost.
    • Outputting: exactly once, if the outputs are idempotent or transactional.
    End-to-end semantics: exactly once!
49. Design Tips for Successful Streaming Applications

50. Areas for consideration
    • Enhance resilience with additional components.
    • Mini-batch vs. per-message handling.
    • Exploit Reactive Streams.

51. Mini-batch vs. per-message handling
    • Use Storm, Akka, Samza, etc. for handling individual messages, especially with sub-second latency requirements.
    • Use Spark Streaming's mini-batch model for the Lambda architecture and highly scalable analytics.

52. Enhance Resiliency with Additional Components
    • Consider Kafka or Kinesis for resilient buffering in front of Spark Streaming:
      - buffer for traffic spikes;
      - re-retrieval of data if an RDD partition is lost and must be reconstructed from the source.
    • Going to store the raw data anyway? Do it first, then ingest into Spark from that storage.
53. Exploit Reactive Streams
    • Spark Streaming v1.5 will have support for back pressure, to more easily build end-to-end reactive applications.

54. Exploit Reactive Streams
    • Spark Streaming v1.5 will have support for back pressure, to more easily build end-to-end reactive applications.
    • Backpressure from consumer to producer:
      - prevents buffer overflows;
      - avoids unnecessary throttling.

55. Exploit Reactive Streams
    • Spark Streaming v1.4? Buffer with Akka Streams:
      Event Source → Akka App → Spark Streaming App, with feedback (back pressure) flowing from the Akka app to the source.
56. Exploit Reactive Streams
    • Spark Streaming v1.4 has a rate limit property: spark.streaming.receiver.maxRate.
      Consider setting it for long-running streaming apps with a variable input flow rate.
    • Have a graph of Reactive Streams? Consider using an Akka app to buffer the data fed to Spark Streaming over a socket (until 1.5…).
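    A sketch of both suggestions on the Spark side, reusing the SparkConf and StreamingContext from the earlier examples; the rate value, host, and port are placeholders, and the Akka buffering app itself is not shown:

    // Cap each receiver at ~10,000 records/sec so traffic spikes queue
    // upstream (in Kafka or the Akka buffer) instead of overwhelming Spark
    sparkConf.set("spark.streaming.receiver.maxRate", "10000")

    // Read the buffered stream that the Akka app writes to a socket
    val buffered = context.socketTextStream("akka-buffer-host", 9999)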
57. Thank you!
    Dean Wampler, Typesafe
    Tathagata Das, Databricks

58. What to do next?
    Databricks is available as a hosted platform on AWS with a monthly subscription; start with a free trial.
    Typesafe now offers certified support for Spark, Mesos & DCOS; read more about it.
59. Reactive Platform
    Full lifecycle support for Play, Akka, Scala and Spark. Give your project a boost with Reactive Platform:
    • Monitor message-driven apps
    • Resolve network partitions decisively
    • Integrate easily with legacy systems
    • Eliminate incompatibility & security risks
    • Protect apps against abuse
    • Expert support from dedicated product teams
    Enjoy learning? Ask about the availability of on-site training for Scala, Akka, Play and Spark!