Fast Data: Selecting The Right Streaming Technologies For Data Sets That Never End

Why have stream-oriented data systems become so popular, when batch-oriented systems have served big data needs for many years? Batch-mode processing isn’t going away, but exclusive use of these systems is now a competitive disadvantage. You’ll learn that, while fast data architectures are much harder to build, they represent the state of the art for dealing with mountains of data that require immediate attention.

In this webinar, Lightbend’s Big Data Architect, Dr. Dean Wampler, examines the rise of streaming systems for handling time-sensitive problems. We’ll explore the characteristics of fast data architectures, and the open source tools for implementing them.

We’ll also take a brief look at Lightbend’s upcoming Fast Data Platform (FDP, http://lightbend.com/fast-data-platform), a comprehensive solution of OSS and commercial technologies. FDP includes installation, integration, and monitoring tools tuned for various deployment scenarios, plus sample applications to help you sort out which tools to use for which purposes.

In this webinar, you will:

* Learn step-by-step how a basic fast data architecture works
* Understand why event logs are the core abstraction for streaming architectures, while message queues are the core integration tool
* Use methods for analyzing infinite data sets, where you don’t have all the data and never will
* Take a tour of open source streaming engines, and discover which ones work best for different use cases
* Get recommendations for making real-world streaming systems responsive, resilient, elastic, and message-driven
* Explore an example streaming application for the IoT: telemetry ingestion and anomaly detection for home automation systems
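
One common method for analyzing a data set that never ends is to cut it into fixed time windows and compute a result per window, rather than waiting for all the data. The sketch below is an illustration only and is not taken from the webinar: the `Reading` type, the `tumbling_window_means` function, and the simulated `endless_readings` source are all hypothetical names, and the code assumes events arrive in timestamp order, which real streaming engines cannot assume.

```python
from collections import namedtuple
from itertools import count, islice

# Hypothetical event shape: a timestamp in seconds and a numeric reading.
Reading = namedtuple("Reading", ["timestamp", "value"])


def tumbling_window_means(events, window_seconds):
    """Yield (window_start, mean) for each fixed, non-overlapping window.

    Assumes events arrive in timestamp order. Late or out-of-order data,
    which production engines must handle, is ignored in this sketch.
    """
    window_start = None
    total = 0.0
    n = 0
    for event in events:
        # Start of the window this event falls into.
        start = event.timestamp - (event.timestamp % window_seconds)
        if window_start is None:
            window_start = start
        elif start != window_start:
            # The event belongs to a later window: emit the finished one.
            yield window_start, total / n
            window_start, total, n = start, 0.0, 0
        total += event.value
        n += 1
    # Only reached if the input ends: flush the last, partial window.
    if n:
        yield window_start, total / n


def endless_readings():
    """Simulated source that never ends: one reading per second."""
    for t in count():
        yield Reading(timestamp=t, value=float(t % 10))


# Results are produced lazily, so we can take the first three windows
# without ever reaching the "end" of the stream.
first_three = list(islice(tumbling_window_means(endless_readings(), 10), 3))
print(first_three)
```

Because the function is a generator, each window's result is available as soon as the first event of the next window arrives, which is the basic trade-off windowing makes: bounded latency and memory in exchange for answers about a slice of the data rather than all of it.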

LEARN MORE: lightbend.com/fast-data-platform

Published in: Software

  1. WEBINAR Fast Data: Selecting The Right Streaming Technologies For Data Sets That Never End. Dr. Dean Wampler (@deanwampler)
  2. Upgrade your grey matter. Get Dean’s free O’Reilly book from Lightbend: bit.ly/fastdata-ORbook
  3. Streaming Engines in Context…
  4. Classic Batch Architecture: Hadoop
  5. [Diagram: classic batch architecture. Labels: Logs, Flume; DBs, Sqoop; YARN (Resource Manager, Node Manager); Batch: MapReduce, Spark, …; Persistence: S3, HDFS, disks, SQL/NoSQL, Search]
  10. New Streaming, “Fast Data” Architecture (but it also supports batch)
  11. [Diagram: fast data architecture, with numbered callouts 1 to 11. Labels: Mesos, YARN, Cloud, …; Logs, Sockets, REST; ZooKeeper Cluster (ZK); Kafka Cluster (Kafka); Microservices: RP, Go, Node.js, …; Mini-batch: Spark Streaming; Batch: Spark, …; Low Latency: Flink, Kafka Streams, Akka Streams, Gearpump, Beam, …; Persistence: S3, HDFS, disks, SQL/NoSQL, Search]
  19. Streaming Engines
  20. Features to Consider
  24. • Low latency? How low?
      • High volume? How high?
      • Integration with other tools? Which ones?
      • Kinds of data processing, analytics? Which ones?
        • Bulk processing of records?
        • Individual processing of events?
  25. Example Streaming Engines
  26. [Diagram: the streaming engines portion of the fast data architecture, with callouts 1, 2, and 5 to 10. Labels: Kafka Cluster (Kafka); Mini-batch: Spark Streaming; Batch: Spark, …; Low Latency: Flink, Kafka Streams, Akka Streams, Gearpump, Beam, …; Persistence: S3, HDFS, disks, SQL/NoSQL, Search]
  31. Upgrade your grey matter. Get Dean’s free O’Reilly book from Lightbend: http://bit.ly/fastdata-ORbook
  32. For more information on Lightbend Fast Data Platform: lightbend.com/fast-data-platform