Storm over Gearpump

How to support binary Storm compatibility on Gearpump

Published in: Technology

  1. Storm over Gearpump
     Manu Zhang
  2. About Me
     ● Software Engineer at Big Data Technology Group, Intel
     ● contributor to hadoop-nativetask (https://github.com/apache/hadoop/tree/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-nativetask)
     ● author of storm-benchmark (https://github.com/intel-hadoop/storm-benchmark)
     ● working on Gearpump (Kafka connector, Storm compatibility, state)
     ● maintainer of the awesome-streaming list (https://github.com/manuzhang/awesome-streaming)
  3. Gearpump - Distributed Real-time Streaming Engine
     ● Akka / Actor Model
     ● Dynamic DAG
     ● Flow Control / Backpressure
     ● Low watermark
     ● At least once / Exactly-once
     ● High Availability
     ● Interactive Web UI
  4. Gearpump Updates
     ● released 0.6.1 & 0.7.0
     ● new documentation site
     ● secure YARN and HBase support
     ● Akka-stream compatibility
     ● Storm binary compatibility
  5. Storm over Gearpump - Why
     ● Storm is widely used in the industry
     ● but it has its limitations
     ● Gearpump is designed to overcome those limitations
     ● We want Storm users to benefit from Gearpump’s advanced features at no cost
  6. Storm over Gearpump - Features
     ● binary compatible with Storm 0.9.x (no recompilation required)
     ● multi-lang
     ● DRPC
     ● KafkaSpout / KafkaBolt
     ● Trident (WIP)
  7. Similarities of Gearpump and Storm
     ● per-message processing
     ● Topology / DAG
     ● similar user interface
  8. Storm over Gearpump - Overview
  9. Storm over Gearpump - DAG Translation
  10. Storm over Gearpump - Task Execution
  11. Storm over Gearpump - Flow Control
     ● The Acker is removed
     ● Flow control with back pressure applies to both acked and unacked Storm topologies
  12. Storm over Gearpump - At Least Once
     ● each message is tagged with the system time
     ● asynchronous, non-blocking ack by tracking the global minimum ack time
     ● only KafkaSpout is supported for now
  13. Performance
     ● SOL from storm-benchmark
     ● Storm 0.9.6
     ● 4-node 10GbE cluster
     ● 16 workers
     ● 48 Spouts and 48 Bolts
  14. Future work
     ● submit Storm Job through Web UI
     ● Storm 0.10 support
     ● At Least Once support for more spouts
     ● Trident support
  15. References
     1. https://github.com/gearpump/gearpump
     2. https://gearpump.io
     3. https://storm.apache.org
     4. How to use Akka to make a PERFECT Streaming system
     5. https://www.typesafe.com/blog/gearpump-real-time-streaming-engine-using-akka
     6. http://akka.io/docs/
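
The "At Least Once" slide describes tagging each message with the system time and acking asynchronously by tracking a global minimum ack time. The slides do not show how this is implemented, so the following is only a minimal Python sketch of the general idea, not Gearpump's actual code. Every name in it (TaskAckTracker, global_min_ack_time, ReplayableSource) is hypothetical. It assumes each task reports the smallest timestamp it has not yet acked, and that a replayable source (standing in for something like KafkaSpout) may commit any message whose timestamp is below the minimum across all tasks.

```python
from collections import Counter


class TaskAckTracker:
    """Hypothetical per-task tracker of received-but-not-yet-acked timestamps."""

    def __init__(self):
        self._pending = Counter()  # timestamp -> number of unacked messages

    def on_receive(self, ts):
        self._pending[ts] += 1

    def on_ack(self, ts):
        if self._pending[ts] <= 0:
            raise ValueError(f"no pending message with timestamp {ts}")
        self._pending[ts] -= 1
        if self._pending[ts] == 0:
            del self._pending[ts]

    def min_pending(self):
        """Smallest unacked timestamp, or None if nothing is pending."""
        return min(self._pending) if self._pending else None


def global_min_ack_time(trackers, now):
    """Minimum unacked timestamp over all tasks; `now` if every task is idle."""
    mins = [m for m in (t.min_pending() for t in trackers) if m is not None]
    return min(mins) if mins else now


class ReplayableSource:
    """Hypothetical stand-in for a replayable spout that commits offsets."""

    def __init__(self):
        self._emitted = []  # (timestamp, offset) pairs in emit order
        self.committed_offset = -1

    def emit(self, ts, offset):
        self._emitted.append((ts, offset))

    def commit_up_to(self, min_ack_time):
        # Messages older than the global minimum have been acked everywhere,
        # so their offsets no longer need to be replayed after a failure.
        while self._emitted and self._emitted[0][0] < min_ack_time:
            _, offset = self._emitted.pop(0)
            self.committed_offset = offset


if __name__ == "__main__":
    bolt, source = TaskAckTracker(), ReplayableSource()
    for offset, ts in enumerate([100, 101, 102]):
        source.emit(ts, offset)
        bolt.on_receive(ts)

    bolt.on_ack(101)  # an out-of-order ack does not advance the minimum
    source.commit_up_to(global_min_ack_time([bolt], now=200))
    print(source.committed_offset)

    bolt.on_ack(100)  # the oldest message is done, so the minimum moves on
    source.commit_up_to(global_min_ack_time([bolt], now=200))
    print(source.committed_offset)
```

In this sketch no task ever blocks waiting for an ack: the source simply polls the minimum from time to time and commits behind it, which is one way to read "asynchronous non-blocking ack" on the slide.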
