3.1. Performance and BigData Ecosystem

1.
Performance and BigData dongeforever@apache.org

2.
dongeforever: Apache RocketMQ PMC/Committer. Interested in algorithms and performance. Now spending more time on open source, including MQ and BigData.

3.
Content Map ❖ Challenge of BigData Stream ❖ Kafka Overview ❖ Batch Through ❖ Compression Through ❖ Structural Compression ❖ BigData Ecosystem

4.
Challenge of BigData Stream

5.
Challenge of BigData Stream ❖ High Throughput: millions of TPS ❖ I/O Bandwidth: Network & Disk ❖ Storage Cost

6.
Kafka Overview ❖ Open-sourced in early 2011, graduated from the Apache Incubator on 23 October 2012 ❖ Log Aggregation, Messaging, Stream Processing, Event Sourcing ❖ Widely used in BigData processing; integrates with Storm, Spark, Flink, Samza, Hadoop, Flume, etc.

7.
Kafka Domain Model

8.
Kafka Store Structure

9.

10.
Batch Through

11.
Kafka Producer TPS vs Msg Size

12.
Proxy Producer TPS vs Msg Size

13.
RocketMQ Batch ❖ send(Collection<Message> msgs) ❖ Atomic ❖ Reaches about 1.5 million TPS ❖ https://rocketmq.apache.org/docs/batch-example/
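
Because a batch is sent as one unit, a producer usually has to keep each batch under some size limit before handing it to a batch send call. The following is a minimal Python sketch of that grouping step, not RocketMQ code: `split_into_batches`, `max_batch_bytes`, and `per_message_overhead` are hypothetical names, and the actual limit depends on the broker and client configuration.

```python
from typing import Iterable, Iterator, List


def split_into_batches(
    messages: Iterable[bytes],
    max_batch_bytes: int,
    per_message_overhead: int = 0,
) -> Iterator[List[bytes]]:
    """Group messages into batches whose estimated size stays within a limit.

    max_batch_bytes and per_message_overhead are illustrative assumptions,
    not values taken from any real client. A single message larger than the
    limit is emitted as a batch of its own.
    """
    batch: List[bytes] = []
    batch_size = 0
    for msg in messages:
        msg_size = len(msg) + per_message_overhead
        # Close the current batch if adding this message would exceed the limit.
        if batch and batch_size + msg_size > max_batch_bytes:
            yield batch
            batch, batch_size = [], 0
        batch.append(msg)
        batch_size += msg_size
    if batch:
        yield batch


if __name__ == "__main__":
    msgs = [b"a" * 40 for _ in range(5)]
    for i, b in enumerate(split_into_batches(msgs, max_batch_bytes=100)):
        print(i, len(b), sum(len(m) for m in b))
```

Each yielded list would then be passed to one batch send call, so the per-request cost is paid once per batch rather than once per message.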

14.

15.
Compression Through • Clients handle the Compress/Decompress • Only Need One Operation in Broker

16.
Compression Through • For log collection, the compression ratio is about 5x to 10x
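
The client-side compression idea can be sketched in Python as follows. This is an illustration under stated assumptions, not the wire format of RocketMQ or Kafka: the 4-byte length prefix, the use of `zlib`, and the `broker_append` helper are all hypothetical. It shows one reading of the slide, where the broker stores the compressed batch as opaque bytes and only the clients compress and decompress.

```python
import zlib
from typing import List


def compress_batch(messages: List[bytes]) -> bytes:
    """Producer side: frame each message with a 4-byte length, then compress."""
    payload = b"".join(len(m).to_bytes(4, "big") + m for m in messages)
    return zlib.compress(payload)


def broker_append(log: List[bytes], blob: bytes) -> None:
    """Broker side (hypothetical): store the compressed batch unchanged."""
    log.append(blob)


def decompress_batch(blob: bytes) -> List[bytes]:
    """Consumer side: decompress, then split on the length prefixes."""
    payload = zlib.decompress(blob)
    messages, offset = [], 0
    while offset < len(payload):
        size = int.from_bytes(payload[offset:offset + 4], "big")
        offset += 4
        messages.append(payload[offset:offset + size])
        offset += size
    return messages


if __name__ == "__main__":
    logs = [f"2024-01-01 INFO request id={i} status=200".encode()
            for i in range(1000)]
    store: List[bytes] = []
    broker_append(store, compress_batch(logs))
    raw = sum(len(m) for m in logs)
    print("ratio:", raw / len(store[0]))
    assert decompress_batch(store[0]) == logs
```

The ratio printed depends heavily on how repetitive the data is; synthetic log lines like these compress far better than typical production traffic.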

17.

18.
Structural Compression

19.
Structural Compression

20.
Structural Compression ❖ Assume each msg has N bytes and each batch has B msgs ❖ Size in 0.10: (34 + N) * B ❖ Size in 0.11: 61 + (7 + N) * B ❖ For N <= 100, saves roughly 20%~50% of storage
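
The two size formulas on this slide can be checked with a few lines of Python. The function names below are illustrative; the constants (34, 61, 7) are taken directly from the slide, and 0.10 and 0.11 are presumably the Kafka message-format versions.

```python
def size_v010(n: int, b: int) -> int:
    """Batch size under the 0.10 formula: 34 bytes of overhead per message."""
    return (34 + n) * b


def size_v011(n: int, b: int) -> int:
    """Batch size under the 0.11 formula: 61 bytes per batch, 7 per message."""
    return 61 + (7 + n) * b


def saving(n: int, b: int) -> float:
    """Fraction of storage saved by the 0.11 layout relative to 0.10."""
    return 1 - size_v011(n, b) / size_v010(n, b)


if __name__ == "__main__":
    for n in (20, 50, 100):
        for b in (1, 2, 3, 10, 1000):
            print(f"N={n:4d} B={b:5d} saving={saving(n, b):+.1%}")
```

With large batches the saving approaches 27 / (34 + N): about 20% at N = 100 and about 50% at N = 20, which matches the range quoted on the slide. The fixed 61-byte batch header means the 0.11 layout only pays off once B is at least 3; for B = 1 or 2 it is larger than the 0.10 layout.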

21.

22.
External ecosystem

23.
External ecosystem https://github.com/apache/rocketmq-externals
