SlideShare a Scribd company logo
Streaming & Messaging
- Xin Wang
July 2017, Beijing, Apache RocketMQ Meetup
Xin Wang
• Apache Storm Committer & PMC member
• Five years distributed system exprience
• Love open source
• Focus on distributed technologies, especially stream processing
• https://github.com/vesense
Index
Part-1: Distributed Computing and Storage
Part-2: Streaming and Messaging
Part-3: Streaming in User Behavior Analysis
01
Distributed Computing & Storage
「分而治之」
Distributed Computing
Distributed Storage
02
Streaming & Messaging
「取之有道」
Streaming
What's new in Storm 1.0
• Nimbus HA - Failover
• Pacemaker - Replaces Zookeeper for Heartbeats
• Streaming Window API
• Distributed Cache API - Allows sharing of files among topologies
• State Management - Statefule Bolts with Automatic Checkpointing
• Automatic Backpressure
• Resource Aware Scheduler
• Dynamic Log Levels
• Dynamic Worker Profiling
• Streaming SQL(since 1.1)
• ......
What's new in Storm 2.0
• Port Clojure to Java
• Lambda Expression Support - bolt: `tuple -> System.out.println(tuple)`
• Apache Beam Runner
• Storm-SQL Improvements
• Metrics V2 (or release in 1.2)
• Worker/Threading model Redesign
• Worker Classloader Isolation
• Dynamic Topology Updates
• ......
Messaging
03
Streaming in User Behavior Analysis
「用之有方」
User Behavior Analysis 1.0
User Behavior Analysis 2.0
3G/4G
HBase
StormKafka
HDFS/Hive
MySQL
Solr
WLAN
Cable
Topic-In-1
Topic-In-N
Topic-Out-1
Topic-Out-N
Topology-1
Topology-2
Topology-3
Topology-N
Other
User Behavior Analysis 3.0 - Neymar
• Unified streaming API
• Pluggable Tagging module
• Dynamic Rule/Schema update
• UDF support for Schema fields
• Built-in main Serializer/Deserializer
• 16 vcore+32GB RAM *12node
• 60 billion/day, about 700K tps
• No data lose
Issues&Practices
• STORM-643 KafkaUtils repeatedly fetches messages whose offset is out of range
• STORM-329 Fix cascading Storm failure by improving reconnection strategy and buffering messages
• STORM-763 Nimbus reassigned worker A to another machine, but other worker's netty client can't
connect to the new worker A
• Worker down because of some Kafka ZK related bugs KAFKA-1382,KAFKA-1387,KAFKA-1451,KAFKA-
1585
• Be careful incompatible changes when upgrading Storm from 0.9 to 1.0. Scheme changed from
`deserialize(byte[] ser)` to `deserialize(ByteBuffer ser)`
• Worker heavy GCs due to large in-memory cache
• Put the lightweight logics into the same bolt
• Do NOT set too many executors/tasks in one worker, higher parallelism sometimes causes lower
throughput
• MQ->Streaming->MQ is better than MQ->Streaming->HDFS(throughput, delay, stability)
• Never log the logs unnecessay
Future
• Storm 2.0 & Kafka 0.11
• Apache Beam & Open-Messaging
• Investigating new Streaming Engine: Apache Flink, Spark-Streaming
• Investigating new MQ: Apache RocketMQ
• ROCKETMQ-81: RocketMQ-Spark Integration
• ROCKETMQ-82: RocketMQ-Flink Integration
• STORM-2349/ROCKETMQ-79: RocketMQ-Storm Integration
Storm-RocketMQ
• RocketMQSpout - Now only RocketMQ push mode supported, pull mode is in the plan.
The default Deserializer is StringScheme, you can override the value by setting
`RocketMQConfig.SCHEME`.
• RocketMQBolt - Async sending by default, or you can change the value by invoking
`withAsync(boolean async)`
• RocketMQState - For users using Storm Trident API
• TopicSelector - Selecting a topic based on the input Storm tuple
• TupleToMessageMapper - Mapping a Storm tuple to a RocketMQ message, you can
implement the MessageBodySerializer interface to serialize the message body. The
default implementation of MessageBodySerializer is
`body.toString().getBytes(StandardCharsets.UTF_8)`
• MessageRetryManager - Retry policy for failed messages
More details please refer to: https://github.com/apache/storm/tree/master/external/storm-rocketmq
Storm-RocketMQ
Thanks everyone!

More Related Content

What's hot

The Future of Apache Storm
The Future of Apache StormThe Future of Apache Storm
The Future of Apache Storm
P. Taylor Goetz
 
Spark streaming and Kafka
Spark streaming and KafkaSpark streaming and Kafka
Spark streaming and Kafka
Iraj Hedayati
 
Kafka Summit NYC 2017 - Running Hundreds of Kafka Clusters with 5 People
Kafka Summit NYC 2017 - Running Hundreds of Kafka Clusters with 5 PeopleKafka Summit NYC 2017 - Running Hundreds of Kafka Clusters with 5 People
Kafka Summit NYC 2017 - Running Hundreds of Kafka Clusters with 5 People
confluent
 
Exactly-once Stream Processing with Kafka Streams
Exactly-once Stream Processing with Kafka StreamsExactly-once Stream Processing with Kafka Streams
Exactly-once Stream Processing with Kafka Streams
Guozhang Wang
 
kafka
kafkakafka
Integrating Apache Pulsar with Big Data Ecosystem
Integrating Apache Pulsar with Big Data EcosystemIntegrating Apache Pulsar with Big Data Ecosystem
Integrating Apache Pulsar with Big Data Ecosystem
StreamNative
 
Lessons from managing a Pulsar cluster (Nutanix)
Lessons from managing a Pulsar cluster (Nutanix)Lessons from managing a Pulsar cluster (Nutanix)
Lessons from managing a Pulsar cluster (Nutanix)
StreamNative
 
Introducing Exactly Once Semantics To Apache Kafka
Introducing Exactly Once Semantics To Apache KafkaIntroducing Exactly Once Semantics To Apache Kafka
Introducing Exactly Once Semantics To Apache Kafka
Apurva Mehta
 
Multi cluster, multitenant and hierarchical kafka messaging service slideshare
Multi cluster, multitenant and hierarchical kafka messaging service   slideshareMulti cluster, multitenant and hierarchical kafka messaging service   slideshare
Multi cluster, multitenant and hierarchical kafka messaging service slideshare
Allen (Xiaozhong) Wang
 
Spark streaming + kafka 0.10
Spark streaming + kafka 0.10Spark streaming + kafka 0.10
Spark streaming + kafka 0.10
Joan Viladrosa Riera
 
Query Pulsar Streams using Apache Flink
Query Pulsar Streams using Apache FlinkQuery Pulsar Streams using Apache Flink
Query Pulsar Streams using Apache Flink
StreamNative
 
What's the time? ...and why? (Mattias Sax, Confluent) Kafka Summit SF 2019
What's the time? ...and why? (Mattias Sax, Confluent) Kafka Summit SF 2019What's the time? ...and why? (Mattias Sax, Confluent) Kafka Summit SF 2019
What's the time? ...and why? (Mattias Sax, Confluent) Kafka Summit SF 2019
confluent
 
Kafka Summit NYC 2017 - Introducing Exactly Once Semantics in Apache Kafka
Kafka Summit NYC 2017 - Introducing Exactly Once Semantics in Apache KafkaKafka Summit NYC 2017 - Introducing Exactly Once Semantics in Apache Kafka
Kafka Summit NYC 2017 - Introducing Exactly Once Semantics in Apache Kafka
confluent
 
Real Time Streaming Data with Kafka and TensorFlow (Yong Tang, MobileIron) Ka...
Real Time Streaming Data with Kafka and TensorFlow (Yong Tang, MobileIron) Ka...Real Time Streaming Data with Kafka and TensorFlow (Yong Tang, MobileIron) Ka...
Real Time Streaming Data with Kafka and TensorFlow (Yong Tang, MobileIron) Ka...
confluent
 
Kafka meetup JP #3 - Engineering Apache Kafka at LINE
Kafka meetup JP #3 - Engineering Apache Kafka at LINEKafka meetup JP #3 - Engineering Apache Kafka at LINE
Kafka meetup JP #3 - Engineering Apache Kafka at LINE
kawamuray
 
When it Absolutely, Positively, Has to be There: Reliability Guarantees in Ka...
When it Absolutely, Positively, Has to be There: Reliability Guarantees in Ka...When it Absolutely, Positively, Has to be There: Reliability Guarantees in Ka...
When it Absolutely, Positively, Has to be There: Reliability Guarantees in Ka...
confluent
 
Introducing Exactly Once Semantics in Apache Kafka with Matthias J. Sax
Introducing Exactly Once Semantics in Apache Kafka with Matthias J. SaxIntroducing Exactly Once Semantics in Apache Kafka with Matthias J. Sax
Introducing Exactly Once Semantics in Apache Kafka with Matthias J. Sax
Databricks
 
HBaseCon 2015: Blackbird Collections - In-situ Stream Processing in HBase
HBaseCon 2015: Blackbird Collections - In-situ  Stream Processing in HBaseHBaseCon 2015: Blackbird Collections - In-situ  Stream Processing in HBase
HBaseCon 2015: Blackbird Collections - In-situ Stream Processing in HBase
HBaseCon
 
Have your cake and eat it too
Have your cake and eat it tooHave your cake and eat it too
Have your cake and eat it too
Gwen (Chen) Shapira
 
Current and Future of Apache Kafka
Current and Future of Apache KafkaCurrent and Future of Apache Kafka
Current and Future of Apache Kafka
Joe Stein
 

What's hot (20)

The Future of Apache Storm
The Future of Apache StormThe Future of Apache Storm
The Future of Apache Storm
 
Spark streaming and Kafka
Spark streaming and KafkaSpark streaming and Kafka
Spark streaming and Kafka
 
Kafka Summit NYC 2017 - Running Hundreds of Kafka Clusters with 5 People
Kafka Summit NYC 2017 - Running Hundreds of Kafka Clusters with 5 PeopleKafka Summit NYC 2017 - Running Hundreds of Kafka Clusters with 5 People
Kafka Summit NYC 2017 - Running Hundreds of Kafka Clusters with 5 People
 
Exactly-once Stream Processing with Kafka Streams
Exactly-once Stream Processing with Kafka StreamsExactly-once Stream Processing with Kafka Streams
Exactly-once Stream Processing with Kafka Streams
 
kafka
kafkakafka
kafka
 
Integrating Apache Pulsar with Big Data Ecosystem
Integrating Apache Pulsar with Big Data EcosystemIntegrating Apache Pulsar with Big Data Ecosystem
Integrating Apache Pulsar with Big Data Ecosystem
 
Lessons from managing a Pulsar cluster (Nutanix)
Lessons from managing a Pulsar cluster (Nutanix)Lessons from managing a Pulsar cluster (Nutanix)
Lessons from managing a Pulsar cluster (Nutanix)
 
Introducing Exactly Once Semantics To Apache Kafka
Introducing Exactly Once Semantics To Apache KafkaIntroducing Exactly Once Semantics To Apache Kafka
Introducing Exactly Once Semantics To Apache Kafka
 
Multi cluster, multitenant and hierarchical kafka messaging service slideshare
Multi cluster, multitenant and hierarchical kafka messaging service   slideshareMulti cluster, multitenant and hierarchical kafka messaging service   slideshare
Multi cluster, multitenant and hierarchical kafka messaging service slideshare
 
Spark streaming + kafka 0.10
Spark streaming + kafka 0.10Spark streaming + kafka 0.10
Spark streaming + kafka 0.10
 
Query Pulsar Streams using Apache Flink
Query Pulsar Streams using Apache FlinkQuery Pulsar Streams using Apache Flink
Query Pulsar Streams using Apache Flink
 
What's the time? ...and why? (Mattias Sax, Confluent) Kafka Summit SF 2019
What's the time? ...and why? (Mattias Sax, Confluent) Kafka Summit SF 2019What's the time? ...and why? (Mattias Sax, Confluent) Kafka Summit SF 2019
What's the time? ...and why? (Mattias Sax, Confluent) Kafka Summit SF 2019
 
Kafka Summit NYC 2017 - Introducing Exactly Once Semantics in Apache Kafka
Kafka Summit NYC 2017 - Introducing Exactly Once Semantics in Apache KafkaKafka Summit NYC 2017 - Introducing Exactly Once Semantics in Apache Kafka
Kafka Summit NYC 2017 - Introducing Exactly Once Semantics in Apache Kafka
 
Real Time Streaming Data with Kafka and TensorFlow (Yong Tang, MobileIron) Ka...
Real Time Streaming Data with Kafka and TensorFlow (Yong Tang, MobileIron) Ka...Real Time Streaming Data with Kafka and TensorFlow (Yong Tang, MobileIron) Ka...
Real Time Streaming Data with Kafka and TensorFlow (Yong Tang, MobileIron) Ka...
 
Kafka meetup JP #3 - Engineering Apache Kafka at LINE
Kafka meetup JP #3 - Engineering Apache Kafka at LINEKafka meetup JP #3 - Engineering Apache Kafka at LINE
Kafka meetup JP #3 - Engineering Apache Kafka at LINE
 
When it Absolutely, Positively, Has to be There: Reliability Guarantees in Ka...
When it Absolutely, Positively, Has to be There: Reliability Guarantees in Ka...When it Absolutely, Positively, Has to be There: Reliability Guarantees in Ka...
When it Absolutely, Positively, Has to be There: Reliability Guarantees in Ka...
 
Introducing Exactly Once Semantics in Apache Kafka with Matthias J. Sax
Introducing Exactly Once Semantics in Apache Kafka with Matthias J. SaxIntroducing Exactly Once Semantics in Apache Kafka with Matthias J. Sax
Introducing Exactly Once Semantics in Apache Kafka with Matthias J. Sax
 
HBaseCon 2015: Blackbird Collections - In-situ Stream Processing in HBase
HBaseCon 2015: Blackbird Collections - In-situ  Stream Processing in HBaseHBaseCon 2015: Blackbird Collections - In-situ  Stream Processing in HBase
HBaseCon 2015: Blackbird Collections - In-situ Stream Processing in HBase
 
Have your cake and eat it too
Have your cake and eat it tooHave your cake and eat it too
Have your cake and eat it too
 
Current and Future of Apache Kafka
Current and Future of Apache KafkaCurrent and Future of Apache Kafka
Current and Future of Apache Kafka
 

Similar to Streaming and Messaging

3.2 Streaming and Messaging
3.2 Streaming and Messaging3.2 Streaming and Messaging
3.2 Streaming and Messaging
振东 刘
 
Kafka Explainaton
Kafka ExplainatonKafka Explainaton
Kafka Explainaton
NguyenChiHoangMinh
 
Past, Present, and Future of Apache Storm
Past, Present, and Future of Apache StormPast, Present, and Future of Apache Storm
Past, Present, and Future of Apache Storm
P. Taylor Goetz
 
HTTP/2 Comes to Java: Servlet 4.0 and what it means for the Java/Jakarta EE e...
HTTP/2 Comes to Java: Servlet 4.0 and what it means for the Java/Jakarta EE e...HTTP/2 Comes to Java: Servlet 4.0 and what it means for the Java/Jakarta EE e...
HTTP/2 Comes to Java: Servlet 4.0 and what it means for the Java/Jakarta EE e...
Edward Burns
 
World of Tanks Experience of Using Kafka
World of Tanks Experience of Using KafkaWorld of Tanks Experience of Using Kafka
World of Tanks Experience of Using Kafka
Levon Avakyan
 
DataConf.TW2018: Develop Kafka Streams Application on Your Laptop
DataConf.TW2018: Develop Kafka Streams Application on Your LaptopDataConf.TW2018: Develop Kafka Streams Application on Your Laptop
DataConf.TW2018: Develop Kafka Streams Application on Your Laptop
Yu-Jhe Li
 
Lessons Learned: Using Spark and Microservices
Lessons Learned: Using Spark and MicroservicesLessons Learned: Using Spark and Microservices
Lessons Learned: Using Spark and Microservices
Alexis Seigneurin
 
Apache Kafka - Scalable Message-Processing and more !
Apache Kafka - Scalable Message-Processing and more !Apache Kafka - Scalable Message-Processing and more !
Apache Kafka - Scalable Message-Processing and more !
Guido Schmutz
 
Storm worker redesign
Storm worker redesignStorm worker redesign
Storm worker redesign
Roshan Naik
 
A Deeper Look Into Reactive Streams with Akka Streams 1.0 and Slick 3.0
A Deeper Look Into Reactive Streams with Akka Streams 1.0 and Slick 3.0A Deeper Look Into Reactive Streams with Akka Streams 1.0 and Slick 3.0
A Deeper Look Into Reactive Streams with Akka Streams 1.0 and Slick 3.0
Legacy Typesafe (now Lightbend)
 
Kafka Summit NYC 2017 Introduction to Kafka Streams with a Real-life Example
Kafka Summit NYC 2017 Introduction to Kafka Streams with a Real-life ExampleKafka Summit NYC 2017 Introduction to Kafka Streams with a Real-life Example
Kafka Summit NYC 2017 Introduction to Kafka Streams with a Real-life Example
confluent
 
Stream processing using Kafka
Stream processing using KafkaStream processing using Kafka
Stream processing using Kafka
Knoldus Inc.
 
Cloud lunch and learn real-time streaming in azure
Cloud lunch and learn real-time streaming in azureCloud lunch and learn real-time streaming in azure
Cloud lunch and learn real-time streaming in azure
Timothy Spann
 
Apache Kafka
Apache KafkaApache Kafka
Apache Kafka
emreakis
 
Stream processing in python with Apache Samza and Beam
Stream processing in python with Apache Samza and BeamStream processing in python with Apache Samza and Beam
Stream processing in python with Apache Samza and Beam
Hai Lu
 
14th Athens Big Data Meetup - Landoop Workshop - Apache Kafka Entering The St...
14th Athens Big Data Meetup - Landoop Workshop - Apache Kafka Entering The St...14th Athens Big Data Meetup - Landoop Workshop - Apache Kafka Entering The St...
14th Athens Big Data Meetup - Landoop Workshop - Apache Kafka Entering The St...
Athens Big Data
 
OpenStack and Windows
OpenStack and WindowsOpenStack and Windows
OpenStack and Windows
Alessandro Pilotti
 
Confluent REST Proxy and Schema Registry (Concepts, Architecture, Features)
Confluent REST Proxy and Schema Registry (Concepts, Architecture, Features)Confluent REST Proxy and Schema Registry (Concepts, Architecture, Features)
Confluent REST Proxy and Schema Registry (Concepts, Architecture, Features)
Kai Wähner
 
[Big Data Spain] Apache Spark Streaming + Kafka 0.10: an Integration Story
[Big Data Spain] Apache Spark Streaming + Kafka 0.10:  an Integration Story[Big Data Spain] Apache Spark Streaming + Kafka 0.10:  an Integration Story
[Big Data Spain] Apache Spark Streaming + Kafka 0.10: an Integration Story
Joan Viladrosa Riera
 
Openstack Cactus Survey
Openstack Cactus SurveyOpenstack Cactus Survey
Openstack Cactus Survey
Pjack Chen
 

Similar to Streaming and Messaging (20)

3.2 Streaming and Messaging
3.2 Streaming and Messaging3.2 Streaming and Messaging
3.2 Streaming and Messaging
 
Kafka Explainaton
Kafka ExplainatonKafka Explainaton
Kafka Explainaton
 
Past, Present, and Future of Apache Storm
Past, Present, and Future of Apache StormPast, Present, and Future of Apache Storm
Past, Present, and Future of Apache Storm
 
HTTP/2 Comes to Java: Servlet 4.0 and what it means for the Java/Jakarta EE e...
HTTP/2 Comes to Java: Servlet 4.0 and what it means for the Java/Jakarta EE e...HTTP/2 Comes to Java: Servlet 4.0 and what it means for the Java/Jakarta EE e...
HTTP/2 Comes to Java: Servlet 4.0 and what it means for the Java/Jakarta EE e...
 
World of Tanks Experience of Using Kafka
World of Tanks Experience of Using KafkaWorld of Tanks Experience of Using Kafka
World of Tanks Experience of Using Kafka
 
DataConf.TW2018: Develop Kafka Streams Application on Your Laptop
DataConf.TW2018: Develop Kafka Streams Application on Your LaptopDataConf.TW2018: Develop Kafka Streams Application on Your Laptop
DataConf.TW2018: Develop Kafka Streams Application on Your Laptop
 
Lessons Learned: Using Spark and Microservices
Lessons Learned: Using Spark and MicroservicesLessons Learned: Using Spark and Microservices
Lessons Learned: Using Spark and Microservices
 
Apache Kafka - Scalable Message-Processing and more !
Apache Kafka - Scalable Message-Processing and more !Apache Kafka - Scalable Message-Processing and more !
Apache Kafka - Scalable Message-Processing and more !
 
Storm worker redesign
Storm worker redesignStorm worker redesign
Storm worker redesign
 
A Deeper Look Into Reactive Streams with Akka Streams 1.0 and Slick 3.0
A Deeper Look Into Reactive Streams with Akka Streams 1.0 and Slick 3.0A Deeper Look Into Reactive Streams with Akka Streams 1.0 and Slick 3.0
A Deeper Look Into Reactive Streams with Akka Streams 1.0 and Slick 3.0
 
Kafka Summit NYC 2017 Introduction to Kafka Streams with a Real-life Example
Kafka Summit NYC 2017 Introduction to Kafka Streams with a Real-life ExampleKafka Summit NYC 2017 Introduction to Kafka Streams with a Real-life Example
Kafka Summit NYC 2017 Introduction to Kafka Streams with a Real-life Example
 
Stream processing using Kafka
Stream processing using KafkaStream processing using Kafka
Stream processing using Kafka
 
Cloud lunch and learn real-time streaming in azure
Cloud lunch and learn real-time streaming in azureCloud lunch and learn real-time streaming in azure
Cloud lunch and learn real-time streaming in azure
 
Apache Kafka
Apache KafkaApache Kafka
Apache Kafka
 
Stream processing in python with Apache Samza and Beam
Stream processing in python with Apache Samza and BeamStream processing in python with Apache Samza and Beam
Stream processing in python with Apache Samza and Beam
 
14th Athens Big Data Meetup - Landoop Workshop - Apache Kafka Entering The St...
14th Athens Big Data Meetup - Landoop Workshop - Apache Kafka Entering The St...14th Athens Big Data Meetup - Landoop Workshop - Apache Kafka Entering The St...
14th Athens Big Data Meetup - Landoop Workshop - Apache Kafka Entering The St...
 
OpenStack and Windows
OpenStack and WindowsOpenStack and Windows
OpenStack and Windows
 
Confluent REST Proxy and Schema Registry (Concepts, Architecture, Features)
Confluent REST Proxy and Schema Registry (Concepts, Architecture, Features)Confluent REST Proxy and Schema Registry (Concepts, Architecture, Features)
Confluent REST Proxy and Schema Registry (Concepts, Architecture, Features)
 
[Big Data Spain] Apache Spark Streaming + Kafka 0.10: an Integration Story
[Big Data Spain] Apache Spark Streaming + Kafka 0.10:  an Integration Story[Big Data Spain] Apache Spark Streaming + Kafka 0.10:  an Integration Story
[Big Data Spain] Apache Spark Streaming + Kafka 0.10: an Integration Story
 
Openstack Cactus Survey
Openstack Cactus SurveyOpenstack Cactus Survey
Openstack Cactus Survey
 

Recently uploaded

GenAI Pilot Implementation in the organizations
GenAI Pilot Implementation in the organizationsGenAI Pilot Implementation in the organizations
GenAI Pilot Implementation in the organizations
kumardaparthi1024
 
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with SlackLet's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
shyamraj55
 
Monitoring and Managing Anomaly Detection on OpenShift.pdf
Monitoring and Managing Anomaly Detection on OpenShift.pdfMonitoring and Managing Anomaly Detection on OpenShift.pdf
Monitoring and Managing Anomaly Detection on OpenShift.pdf
Tosin Akinosho
 
Taking AI to the Next Level in Manufacturing.pdf
Taking AI to the Next Level in Manufacturing.pdfTaking AI to the Next Level in Manufacturing.pdf
Taking AI to the Next Level in Manufacturing.pdf
ssuserfac0301
 
How to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptxHow to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptx
danishmna97
 
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
名前 です男
 
Video Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the FutureVideo Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the Future
Alpen-Adria-Universität
 
Climate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing DaysClimate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing Days
Kari Kakkonen
 
HCL Notes and Domino License Cost Reduction in the World of DLAU
HCL Notes and Domino License Cost Reduction in the World of DLAUHCL Notes and Domino License Cost Reduction in the World of DLAU
HCL Notes and Domino License Cost Reduction in the World of DLAU
panagenda
 
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdfUnlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Malak Abu Hammad
 
Columbus Data & Analytics Wednesdays - June 2024
Columbus Data & Analytics Wednesdays - June 2024Columbus Data & Analytics Wednesdays - June 2024
Columbus Data & Analytics Wednesdays - June 2024
Jason Packer
 
Full-RAG: A modern architecture for hyper-personalization
Full-RAG: A modern architecture for hyper-personalizationFull-RAG: A modern architecture for hyper-personalization
Full-RAG: A modern architecture for hyper-personalization
Zilliz
 
Choosing The Best AWS Service For Your Website + API.pptx
Choosing The Best AWS Service For Your Website + API.pptxChoosing The Best AWS Service For Your Website + API.pptx
Choosing The Best AWS Service For Your Website + API.pptx
Brandon Minnick, MBA
 
How to use Firebase Data Connect For Flutter
How to use Firebase Data Connect For FlutterHow to use Firebase Data Connect For Flutter
How to use Firebase Data Connect For Flutter
Daiki Mogmet Ito
 
Things to Consider When Choosing a Website Developer for your Website | FODUU
Things to Consider When Choosing a Website Developer for your Website | FODUUThings to Consider When Choosing a Website Developer for your Website | FODUU
Things to Consider When Choosing a Website Developer for your Website | FODUU
FODUU
 
Programming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup SlidesProgramming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup Slides
Zilliz
 
20240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 202420240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 2024
Matthew Sinclair
 
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
Speck&Tech
 
Infrastructure Challenges in Scaling RAG with Custom AI models
Infrastructure Challenges in Scaling RAG with Custom AI modelsInfrastructure Challenges in Scaling RAG with Custom AI models
Infrastructure Challenges in Scaling RAG with Custom AI models
Zilliz
 
20240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 202420240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 2024
Matthew Sinclair
 

Recently uploaded (20)

GenAI Pilot Implementation in the organizations
GenAI Pilot Implementation in the organizationsGenAI Pilot Implementation in the organizations
GenAI Pilot Implementation in the organizations
 
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with SlackLet's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
 
Monitoring and Managing Anomaly Detection on OpenShift.pdf
Monitoring and Managing Anomaly Detection on OpenShift.pdfMonitoring and Managing Anomaly Detection on OpenShift.pdf
Monitoring and Managing Anomaly Detection on OpenShift.pdf
 
Taking AI to the Next Level in Manufacturing.pdf
Taking AI to the Next Level in Manufacturing.pdfTaking AI to the Next Level in Manufacturing.pdf
Taking AI to the Next Level in Manufacturing.pdf
 
How to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptxHow to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptx
 
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
 
Video Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the FutureVideo Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the Future
 
Climate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing DaysClimate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing Days
 
HCL Notes and Domino License Cost Reduction in the World of DLAU
HCL Notes and Domino License Cost Reduction in the World of DLAUHCL Notes and Domino License Cost Reduction in the World of DLAU
HCL Notes and Domino License Cost Reduction in the World of DLAU
 
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdfUnlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
 
Columbus Data & Analytics Wednesdays - June 2024
Columbus Data & Analytics Wednesdays - June 2024Columbus Data & Analytics Wednesdays - June 2024
Columbus Data & Analytics Wednesdays - June 2024
 
Full-RAG: A modern architecture for hyper-personalization
Full-RAG: A modern architecture for hyper-personalizationFull-RAG: A modern architecture for hyper-personalization
Full-RAG: A modern architecture for hyper-personalization
 
Choosing The Best AWS Service For Your Website + API.pptx
Choosing The Best AWS Service For Your Website + API.pptxChoosing The Best AWS Service For Your Website + API.pptx
Choosing The Best AWS Service For Your Website + API.pptx
 
How to use Firebase Data Connect For Flutter
How to use Firebase Data Connect For FlutterHow to use Firebase Data Connect For Flutter
How to use Firebase Data Connect For Flutter
 
Things to Consider When Choosing a Website Developer for your Website | FODUU
Things to Consider When Choosing a Website Developer for your Website | FODUUThings to Consider When Choosing a Website Developer for your Website | FODUU
Things to Consider When Choosing a Website Developer for your Website | FODUU
 
Programming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup SlidesProgramming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup Slides
 
20240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 202420240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 2024
 
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
 
Infrastructure Challenges in Scaling RAG with Custom AI models
Infrastructure Challenges in Scaling RAG with Custom AI modelsInfrastructure Challenges in Scaling RAG with Custom AI models
Infrastructure Challenges in Scaling RAG with Custom AI models
 
20240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 202420240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 2024
 

Streaming and Messaging

  • 1. Streaming & Messaging - Xin Wang July 2017, Beijing, Apache RocketMQ Meetup
  • 2. Xin Wang • Apache Storm Committer & PMC member • Five years distributed system exprience • Love open source • Focus on distributed technologies, especially stream processing • https://github.com/vesense
  • 3. Index Part-1: Distributed Computing and Storage Part-2: Streaming and Messaging Part-3: Streaming in User Behavior Analysis
  • 4. 01 Distributed Computing & Storage 「分而治之」
  • 9. What's new in Storm 1.0 • Nimbus HA - Failover • Pacemaker - Replaces Zookeeper for Heartbeats • Streaming Window API • Distributed Cache API - Allows sharing of files among topologies • State Management - Statefule Bolts with Automatic Checkpointing • Automatic Backpressure • Resource Aware Scheduler • Dynamic Log Levels • Dynamic Worker Profiling • Streaming SQL(since 1.1) • ......
  • 10. What's new in Storm 2.0 • Port Clojure to Java • Lambda Expression Support - bolt: `tuple -> System.out.println(tuple)` • Apache Beam Runner • Storm-SQL Improvements • Metrics V2 (or release in 1.2) • Worker/Threading model Redesign • Worker Classloader Isolation • Dynamic Topology Updates • ......
  • 12. 03 Streaming in User Behavior Analysis 「用之有方」
  • 14. User Behavior Analysis 2.0 3G/4G HBase StormKafka HDFS/Hive MySQL Solr WLAN Cable Topic-In-1 Topic-In-N Topic-Out-1 Topic-Out-N Topology-1 Topology-2 Topology-3 Topology-N Other
  • 15. User Behavior Analysis 3.0 - Neymar • Unified streaming API • Pluggable Tagging module • Dynamic Rule/Schema update • UDF support for Schema fields • Built-in main Serializer/Deserializer • 16 vcore+32GB RAM *12node • 60 billion/day, about 700K tps • No data lose
  • 16. Issues&Practices • STORM-643 KafkaUtils repeatedly fetches messages whose offset is out of range • STORM-329 Fix cascading Storm failure by improving reconnection strategy and buffering messages • STORM-763 Nimbus reassigned worker A to another machine, but other worker's netty client can't connect to the new worker A • Worker down because of some Kafka ZK related bugs KAFKA-1382,KAFKA-1387,KAFKA-1451,KAFKA- 1585 • Be careful incompatible changes when upgrading Storm from 0.9 to 1.0. Scheme changed from `deserialize(byte[] ser)` to `deserialize(ByteBuffer ser)` • Worker heavy GCs due to large in-memory cache • Put the lightweight logics into the same bolt • Do NOT set too many executors/tasks in one worker, higher parallelism sometimes causes lower throughput • MQ->Streaming->MQ is better than MQ->Streaming->HDFS(throughput, delay, stability) • Never log the logs unnecessay
  • 17. Future • Storm 2.0 & Kafka 0.11 • Apache Beam & Open-Messaging • Investigating new Streaming Engine: Apache Flink, Spark-Streaming • Investigating new MQ: Apache RocketMQ • ROCKETMQ-81: RocketMQ-Spark Integration • ROCKETMQ-82: RocketMQ-Flink Integration • STORM-2349/ROCKETMQ-79: RocketMQ-Storm Integration
  • 18. Storm-RocketMQ • RocketMQSpout - Now only RocketMQ push mode supported, pull mode is in the plan. The default Deserializer is StringScheme, you can override the value by setting `RocketMQConfig.SCHEME`. • RocketMQBolt - Async sending by default, or you can change the value by invoking `withAsync(boolean async)` • RocketMQState - For users using Storm Trident API • TopicSelector - Selecting a topic based on the input Storm tuple • TupleToMessageMapper - Mapping a Storm tuple to a RocketMQ message, you can implement the MessageBodySerializer interface to serialize the message body. The default implementation of MessageBodySerializer is `body.toString().getBytes(StandardCharsets.UTF_8)` • MessageRetryManager - Retry policy for failed messages More details please refer to: https://github.com/apache/storm/tree/master/external/storm-rocketmq