Storm over Gearpump

How to support Storm binary compatibility on Gearpump

Published in: Technology

  1. Storm over Gearpump, Manu Zhang
  2. About Me
     ● Software Engineer at Big Data Technology Group, Intel
     ● contributor to hadoop-nativetask (https://github.com/apache/hadoop/tree/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-nativetask)
     ● author of storm-benchmark (https://github.com/intel-hadoop/storm-benchmark)
     ● working on Gearpump (Kafka connector, Storm compatibility, state)
     ● maintainer of the awesome-streaming list (https://github.com/manuzhang/awesome-streaming)
  3. Gearpump - Distributed Real-time Streaming Engine
     ● Akka / Actor Model
     ● Dynamic DAG
     ● Flow Control / Backpressure
     ● Low watermark
     ● At-least-once / Exactly-once
     ● High Availability
     ● Interactive Web UI
  4. Gearpump Updates
     ● released 0.6.1 and 0.7.0
     ● new documentation site
     ● secure YARN and HBase support
     ● Akka-stream compatibility
     ● Storm binary compatibility
  5. Storm over Gearpump - Why
     ● Storm is widely used in the industry
     ● but it has its limitations
     ● Gearpump is designed to overcome those limitations
     ● We want Storm users to benefit from Gearpump’s advanced features at no cost
  6. Storm over Gearpump - Features
     ● binary compatible with Storm 0.9.x (no recompilation required)
     ● multi-lang
     ● DRPC
     ● KafkaSpout / KafkaBolt
     ● Trident (WIP)
  7. Similarities of Gearpump and Storm
     ● per-message processing
     ● Topology / DAG
     ● similar user interface
  8. Storm over Gearpump - Overview
  9. Storm over Gearpump - DAG Translation
  10. Storm over Gearpump - Task Execution
  11. Storm over Gearpump - Flow Control
     ● The Acker is removed
     ● Flow control with back pressure for both acked and unacked Storm topologies
  12. Storm over Gearpump - At Least Once
     ● each message is tagged with the system time
     ● asynchronous, non-blocking acking by tracking the global minimum ack time
     ● currently supported for KafkaSpout only
  13. Performance
     ● SOL workload from storm-benchmark
     ● Storm 0.9.6
     ● 4-node 10GbE cluster
     ● 16 workers
     ● 48 Spouts and 48 Bolts
  14. Future work
     ● submit Storm jobs through the Web UI
     ● Storm 0.10 support
     ● At Least Once support for more spouts
     ● Trident support
  15. References
     1. https://github.com/gearpump/gearpump
     2. https://gearpump.io
     3. https://storm.apache.org
     4. How to use Akka to make a PERFECT Streaming system
     5. https://www.typesafe.com/blog/gearpump-real-time-streaming-engine-using-akka
     6. http://akka.io/docs/
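
The At Least Once slide describes acking by tracking a global minimum ack time instead of acking each message through an Acker. The sketch below is one possible reading of that idea, not Gearpump's actual implementation: the class names (`MinAckTracker`, `ReplayableSpout`), the method names, and the rule that each task reports the oldest timestamp it may still be processing are all assumptions made for illustration.

```python
class MinAckTracker:
    """Hypothetical tracker of the global minimum ack time (not a Gearpump API)."""

    def __init__(self):
        # task id -> oldest message timestamp that task may still be processing
        self.task_min_time = {}

    def report(self, task_id, min_pending_time):
        # Tasks report asynchronously; no per-message ack is sent upstream.
        self.task_min_time[task_id] = min_pending_time

    def global_min(self):
        # Everything older than this has been fully processed by every task.
        # Returns None until at least one task has reported.
        return min(self.task_min_time.values(), default=None)


class ReplayableSpout:
    """Hypothetical spout that keeps emitted messages until they can be acked."""

    def __init__(self):
        self.pending = []  # (timestamp, message_id) in emission order

    def emit(self, message_id, timestamp):
        # Each message is tagged with the system time at emission.
        self.pending.append((timestamp, message_id))

    def ack_before(self, min_time):
        # Ack every pending message strictly older than the global minimum.
        acked = [m for t, m in self.pending if t < min_time]
        self.pending = [(t, m) for t, m in self.pending if t >= min_time]
        return acked


tracker = MinAckTracker()
spout = ReplayableSpout()
for message_id, timestamp in [("a", 100), ("b", 105), ("c", 110)]:
    spout.emit(message_id, timestamp)

tracker.report("bolt-1", 105)  # bolt-1 is still working on "b"
tracker.report("bolt-2", 110)  # bolt-2 is still working on "c"
first_acked = spout.ack_before(tracker.global_min())

tracker.report("bolt-1", 120)  # bolt-1 has caught up
second_acked = spout.ack_before(tracker.global_min())
```

In this sketch a slow task holds back the global minimum, so messages it has not finished stay pending at the spout and could be replayed after a failure, which is what at-least-once delivery needs.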
