Kafka Practical Experience
RiCo Chen
Agenda
◎Kafka Overview
◎AFT Kafka Architecture
.Net Client -Pub/Sub
Key Terms Introduction
Message Delivery
◎Monitor Kafka Architecture
◎Kafka Performance Tuning
Producer
Broker
Consumer
JVM
Kafka Overview
• High-performance distributed streaming platform
• Popular project on GitHub: stars = 6389, forks = 3909
• Simple installation, easy scale-out, more resources (Linux)
• High availability, failover, high reliability, automatic balancing, message persistence
• Use cases: log aggregation, metrics, event sourcing, messaging…
• Powered by: LinkedIn, Airbnb, Mozilla, Twitter, LINE, Skyscanner, trivago, Hotel.com, PayPal, Uber, Yahoo…
AFT Kafka Architecture
[Architecture diagram; recoverable labels:]
• Producers: UGS API, UGS WEB, UGS WinSvr. Serialization; publish message (batch, fire and forget).
• Kafka cluster (P2P): broker1, broker2, broker3, with "Replicate" links between brokers. Labels: message (binary) queue, partition, offset manager, leader cache, topic, ReplicaManager, GroupCoordinator (rebalance).
• ZooKeeper cluster (master-slave): node1, node2, node3. Labels: topic configuration, broker status, cluster membership, controller process, coordinator process.
• Consumers: middleware consumer group of Logger Services instances. Deserialization; subscribe message (message set, async).
• Link labels: TCP, heartbeat.
.Net Client -Pub/Sub
Key Terms Introduction
• Broker: the message-queue server process (the smallest unit of a Kafka cluster)
• Topic: a category of messages (where the data is stored)
• Producer: pushes messages to a broker (writes data)
• Consumer: pulls messages from a broker (reads data)
• ConsumerGroup: provides fault tolerance, scalability, and parallelism for consumers
• Partition: provides fault tolerance, scalability, and parallelism
• Offset: the position of a message within a partition
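The terms above can be tied together with a toy in-memory model. This is only a sketch for illustration, not how a broker is implemented: the `Topic` class, its method names, and the byte-sum partitioner are all assumptions made up for this example (real clients use their own hash function).

```python
class Topic:
    """Toy in-memory topic: a fixed number of append-only partition logs."""

    def __init__(self, name, num_partitions):
        self.name = name
        self.partitions = [[] for _ in range(num_partitions)]

    def produce(self, key, value):
        # Hypothetical partitioner: the same key always maps to the same partition.
        partition = sum(key.encode()) % len(self.partitions)
        self.partitions[partition].append(value)
        # The offset is the position of the record inside that partition.
        offset = len(self.partitions[partition]) - 1
        return partition, offset

    def consume(self, partition, offset, max_records=10):
        # A consumer pulls records starting from an offset it keeps track of.
        return self.partitions[partition][offset:offset + max_records]


topic = Topic("logs", num_partitions=3)
first = topic.produce("user-1", "login")
second = topic.produce("user-1", "logout")
```

Both records share a key, so they land in the same partition with consecutive offsets, which is why ordering holds per partition and not across the whole topic.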
Message Delivery
At most once: messages may be lost but are never redelivered.
At least once: messages are never lost but may be redelivered.
Exactly once: each message is delivered once and only once (supported since 0.11.x).
Messages sent by a producer to a particular topic partition are appended in the order they are sent.
A consumer instance sees records in the order they are stored in the log.
With replication factor N, a topic tolerates up to N-1 server failures.
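On the consumer side, the difference between at-most-once and at-least-once comes down to whether the offset is committed before or after the message is processed. The simulation below is a minimal sketch under that assumption; `run_consumer` and its parameters are hypothetical names, and the single crash is injected between the two steps.

```python
def run_consumer(messages, commit_first, crash_at):
    """Simulate one consumer crash and restart; return the processed messages."""
    processed = []
    committed = 0      # offset the consumer resumes from after a restart
    offset = 0
    crashed = False
    while offset < len(messages):
        crash_now = offset == crash_at and not crashed
        if commit_first:
            committed = offset + 1                # step 1: commit the offset
            if crash_now:                         # crash before processing
                crashed = True
                offset = committed                # restart: message is skipped
                continue
            processed.append(messages[offset])    # step 2: process
        else:
            processed.append(messages[offset])    # step 1: process
            if crash_now:                         # crash before committing
                crashed = True
                offset = committed                # restart: message is re-read
                continue
            committed = offset + 1                # step 2: commit the offset
        offset += 1
    return processed


at_most_once = run_consumer(["m0", "m1", "m2"], commit_first=True, crash_at=1)
at_least_once = run_consumer(["m0", "m1", "m2"], commit_first=False, crash_at=1)
```

With commit-first the crashed message `m1` is lost; with process-first it is delivered twice, so the processing step should be idempotent.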
Monitor Kafka Architecture
[Monitoring diagram; recoverable labels:]
• Metrics path: JMX, Telegraf (1. collect, 2. store), InfluxDB, Grafana (1. QL, 2. result)
• Management tools: Kafka-Manager, Kafka Eagle, MySQL (2. result)
• Link labels: http (every 10 sec), TCP
High level architecture blueprint
[Blueprint diagram; recoverable labels: UGS Platform /Producer, Logger Services /Consumer, Channel, SQL Server, Kafka eagle /Consumer]
Kafka Performance Tuning
• Producer
• Broker
• Consumer
• JVM
Producer
• Load balancing: the producer sends data directly to the broker that is the leader for the partition.
• acks=0: the producer does not wait for any acknowledgment from the broker. Lowest latency, at the cost of durability: messages can be lost.
• acks=1: the producer gets an acknowledgment once the leader has written the record to its local log; the leader responds without waiting for acknowledgment from the followers. If the leader fails before the followers have copied the record, the record is lost.
• acks=-1: the producer gets an acknowledgment only after all in-sync replicas have received the record. Strongest guarantee that data is not lost.
• batch.size=100 (.NET client)
• send.buffer.bytes=100*1024
• producer.type=async
• compression.type=none
• max.in.flight.requests.per.connection=3
Note: min.insync.replicas>=2
ACKs   Throughput   Latency   Durability
0      High         Low       No guarantee
1      Medium       Medium    Leader
-1     Low          High      ISR
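The interaction between acks and min.insync.replicas can be sketched as a small decision function. This is a simplified model, not broker code: the function names are made up for this example, and the only rule it relies on is that min.insync.replicas is enforced for acks=-1 writes and not for acks=0 or acks=1.

```python
def acks_waited_for(acks, isr_size):
    """Number of replica acknowledgments the producer waits for."""
    if acks == 0:
        return 0             # fire and forget
    if acks == 1:
        return 1             # leader only
    if acks == -1:
        return isr_size      # every in-sync replica
    raise ValueError("acks must be 0, 1, or -1")


def write_accepted(acks, isr_size, min_insync_replicas):
    """Whether a write goes through for the current in-sync replica count."""
    if acks == -1:
        # With acks=-1 the write is rejected when too few replicas are in sync.
        return isr_size >= min_insync_replicas
    # acks=0 and acks=1 only need a live leader in this simplified model.
    return isr_size >= 1
```

For example, with min.insync.replicas=2 and only the leader left in sync, `write_accepted(-1, 1, 2)` is false while `write_accepted(1, 1, 2)` is true: acks=-1 trades availability for durability.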
Broker
• More partitions = more concurrent processing = more memory = more I/O = higher throughput, but also higher latency (the brokers have to spread their work over every partition). P.S. Keep a single topic under 1024 partitions.
• Replication factor: requires at least two brokers.
num.io.threads=8
num.network.threads=3
background.threads=10
queued.max.requests=500
socket.receive.buffer.bytes=102400
socket.request.max.bytes=104857600
socket.send.buffer.bytes=102400
num.recovery.threads.per.data.dir=2
log.retention.hours=24
log.flush.interval.messages=10000
log.flush.interval.ms=1000
log.cleanup.policy=delete
log.cleaner.enable=true
log.cleaner.threads=1
log.cleaner.backoff.ms=30000
log.segment.bytes=1073741824
replica.fetch.min.bytes=1
replica.high.watermark.checkpoint.interval.ms=5000
replica.fetch.wait.max.ms=500
min.insync.replicas=2
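The byte-valued settings above are easier to review in human-readable units. The helper below is a small sketch for doing that; `parse_properties`, `human_bytes`, and the `SAMPLE` text (a subset of the values listed above) are names introduced for this example only.

```python
SAMPLE = """
socket.receive.buffer.bytes=102400
socket.request.max.bytes=104857600
log.segment.bytes=1073741824
"""


def parse_properties(text):
    """Parse key=value lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def human_bytes(n):
    """Format a byte count using binary units (1 KiB = 1024 B)."""
    for unit in ("B", "KiB", "MiB", "GiB"):
        if n < 1024 or unit == "GiB":
            return f"{n:g} {unit}"
        n /= 1024


props = parse_properties(SAMPLE)
sizes = {key: human_bytes(int(value)) for key, value in props.items()}
```

Here the socket buffers come out as 100 KiB, the maximum request size as 100 MiB, and the log segment size as 1 GiB.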
Consumer
• Provide enough partitions to handle the messages coming from producers
• Maximum number of consumers = a multiple of the number of brokers (an even balance is better)
• max.poll.records=5000
• enable.auto.commit=true
• auto.commit.interval.ms=5000
• fetch.max.wait.ms=500
• fetch.min.bytes=1
• Keep the batch size small in our .NET client (for real-time data consumption)
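Why the partition count matters for consumers can be seen with a simplified assignment sketch. Within one consumer group each partition is read by at most one consumer, so consumers beyond the partition count sit idle. The round-robin function below is an illustrative assumption, not one of Kafka's actual assignor implementations.

```python
def assign_round_robin(num_partitions, consumers):
    """Spread partition ids over the consumers of one group, round-robin."""
    assignment = {consumer: [] for consumer in consumers}
    for partition in range(num_partitions):
        owner = consumers[partition % len(consumers)]
        assignment[owner].append(partition)
    return assignment


balanced = assign_round_robin(6, ["c1", "c2", "c3"])
overprovisioned = assign_round_robin(3, ["c1", "c2", "c3", "c4"])
```

In the second case `c4` receives no partitions, so adding it brings no extra throughput until the topic has more partitions.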
JVM
• Avoid running out of memory
• Avoid triggering GC too frequently
-Xmx8g -Xms8g -XX:MetaspaceSize=96m -XX:+UseG1GC -XX:MaxGCPauseMillis=20
-XX:MaxMetaspaceFreeRatio=80 -XX:MinMetaspaceFreeRatio=50
-XX:G1HeapRegionSize=16M -XX:InitiatingHeapOccupancyPercent=35
-Xms: set the initial Java heap size
-Xmx: set the maximum Java heap size
+UseG1GC: enable the G1 garbage collector
MaxGCPauseMillis: set the target maximum GC pause time, in milliseconds
MaxMetaspaceFreeRatio: set the maximum Metaspace free ratio
MinMetaspaceFreeRatio: set the minimum Metaspace free ratio
G1HeapRegionSize: set the size of each G1 heap region
InitiatingHeapOccupancyPercent: set the heap occupancy percentage at which G1 starts a concurrent marking cycle
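As a rough worked example of what these values imply, the arithmetic below derives the G1 region count and the occupancy level that starts marking from the flags listed above. It is only back-of-the-envelope arithmetic; the function names are made up for this sketch, and newer JVMs may adjust the occupancy threshold adaptively at run time.

```python
HEAP_MIB = 8 * 1024      # -Xmx8g / -Xms8g
REGION_MIB = 16          # -XX:G1HeapRegionSize=16M
IHOP_PERCENT = 35        # -XX:InitiatingHeapOccupancyPercent=35


def g1_region_count(heap_mib, region_mib):
    """Number of equally sized G1 regions the heap is divided into."""
    return heap_mib // region_mib


def marking_threshold_mib(heap_mib, ihop_percent):
    """Heap occupancy, in MiB, at which a concurrent marking cycle starts."""
    return heap_mib * ihop_percent / 100


regions = g1_region_count(HEAP_MIB, REGION_MIB)
threshold = marking_threshold_mib(HEAP_MIB, IHOP_PERCENT)
```

With an 8 GiB heap this gives 512 regions of 16 MiB each, and marking starts at roughly 2.8 GiB of occupancy.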
Q & A
Reference
• https://kafka.apache.org/
• https://github.com/apache/kafka
• http://www.oracle.com/technetwork/articles/java/g1gc-1984535.html
• https://docs.oracle.com/cd/E40972_01/doc.70/e40973/cnf_jvmgc.htm
• RiCo’s blog

Thermal Engineering -unit - III & IV.ppt
 
Thermal Engineering-R & A / C - unit - V
Thermal Engineering-R & A / C - unit - VThermal Engineering-R & A / C - unit - V
Thermal Engineering-R & A / C - unit - V
 
Hostel management system project report..pdf
Hostel management system project report..pdfHostel management system project report..pdf
Hostel management system project report..pdf
 
Computer Networks Basics of Network Devices
Computer Networks  Basics of Network DevicesComputer Networks  Basics of Network Devices
Computer Networks Basics of Network Devices
 
Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...
Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...
Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...
 
Theory of Time 2024 (Universal Theory for Everything)
Theory of Time 2024 (Universal Theory for Everything)Theory of Time 2024 (Universal Theory for Everything)
Theory of Time 2024 (Universal Theory for Everything)
 
Unit 4_Part 1 CSE2001 Exception Handling and Function Template and Class Temp...
Unit 4_Part 1 CSE2001 Exception Handling and Function Template and Class Temp...Unit 4_Part 1 CSE2001 Exception Handling and Function Template and Class Temp...
Unit 4_Part 1 CSE2001 Exception Handling and Function Template and Class Temp...
 
1_Introduction + EAM Vocabulary + how to navigate in EAM.pdf
1_Introduction + EAM Vocabulary + how to navigate in EAM.pdf1_Introduction + EAM Vocabulary + how to navigate in EAM.pdf
1_Introduction + EAM Vocabulary + how to navigate in EAM.pdf
 
fitting shop and tools used in fitting shop .ppt
fitting shop and tools used in fitting shop .pptfitting shop and tools used in fitting shop .ppt
fitting shop and tools used in fitting shop .ppt
 

Kafka practical experience

  • 2. Agenda
      ◎ Kafka Overview
      ◎ AFT Kafka Architecture
          .NET Client - Pub/Sub
          Key Terms Introduction
          Message Delivery
      ◎ Monitor Kafka Architecture
      ◎ Kafka Performance Tuning
          Producer
          Broker
          Consumer
          JVM
  • 3. Kafka Overview
      • High-performance distributed streaming platform
      • Popular project on GitHub: 6389 stars, 3909 forks
      • Simple installation, easy scale-out, more resources (Linux)
      • High availability, failover, high reliability, automatic balancing, message persistence
      • Use cases: log aggregation, metrics, event sourcing, messaging
      • Powered by: LinkedIn, Airbnb, Mozilla, Twitter, LINE, Skyscanner, trivago, Hotels.com, PayPal, Uber, Yahoo
  • 4. AFT Kafka Architecture (diagram)
      • Producers: UGS API, UGS WEB, UGS WinSvr. Serialization, then publish message (batch, fire and forget) over TCP.
      • Kafka cluster (P2P): broker1, broker2, broker3, replicating between brokers. Holds the message (binary) queue, partition, offset manager, leader cache, topic, ReplicaManager, GroupCoordinator (rebalance).
      • ZooKeeper cluster (master-slave): node1, node2, node3. Holds topic configuration, broker status, cluster membership, controller process, coordinator process.
      • Consumers: Logger Services (three instances) in a consumer group. Subscribe message (message set, async) over TCP, then deserialization.
      • Other labels: Middleware, Heartbeat
  • 6. Key Terms Introduction
      • Broker: MQ process (the minimum unit in a Kafka cluster)
      • Topic: category of messages (where data is stored)
      • Producer: pushes messages to a broker (writes data)
      • Consumer: pulls messages from a broker (reads data)
      • ConsumerGroup: provides fault tolerance, scalability, and parallelism for consumers
      • Partition: provides fault tolerance, scalability, and parallelism
      • Offset: position of a message within each partition
  • 7. Message Delivery
      • At most once: messages may be lost but are never redelivered.
      • At least once: messages are never lost but may be redelivered.
      • Exactly once: each message is delivered once and only once (available since 0.11.x).
      • Messages sent by a producer to a particular topic partition are appended in the order they are sent.
      • A consumer instance sees records in the order they are stored in the log.
      • A topic tolerates up to N-1 server failures, where N is its replication factor.
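The difference between at-most-once and at-least-once comes down to whether the consumer commits its offset before or after processing a record. The following is a minimal simulation of that ordering, not Kafka client code; the function name `consume` and its parameters are hypothetical and exist only for illustration.

```python
def consume(messages, commit_before_processing, crash_at):
    """Simulate one crash-and-restart of a consumer on a single partition.

    messages: records in log order.
    crash_at: offset at which the consumer crashes once, between its two
              steps (commit and process).
    Returns every record that was processed, including reprocessing.
    """
    processed = []
    committed = 0      # offset the consumer resumes from after a restart
    crashed = False
    offset = 0
    while offset < len(messages):
        if commit_before_processing:
            committed = offset + 1              # commit first
            if offset == crash_at and not crashed:
                crashed = True
                offset = committed              # restart: record is skipped
                continue
            processed.append(messages[offset])
        else:
            processed.append(messages[offset])  # process first
            if offset == crash_at and not crashed:
                crashed = True
                offset = committed              # restart: record is re-read
                continue
            committed = offset + 1
        offset += 1
    return processed


at_most_once = consume(["a", "b", "c"], commit_before_processing=True, crash_at=1)
at_least_once = consume(["a", "b", "c"], commit_before_processing=False, crash_at=1)
```

In this sketch, committing first loses the record that was in flight at the crash, while processing first delivers it twice.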
  • 8. Monitor Kafka Architecture (diagram)
      • Components: Telegraf, InfluxDB, Grafana, JMX, Kafka-Manager, Kafka Eagle, MySQL
      • Labeled flows: HTTP (every 10 sec), TCP, 1. QL / 2. Result, 1. Collect / 2. Store
  • 9. High-level architecture blueprint (diagram)
      • UGS Platform (producer)
      • Channel
      • Logger Services (consumer)
      • Kafka Eagle (consumer)
      • SQL Server
  • 10. Kafka Performance Tuning
      • Producer
      • Broker
      • Consumer
      • JVM
  • 11. Producer
      • Load balancing: the producer sends data directly to the broker that is the leader for the partition.
      • acks=0: the producer does not wait for any acknowledgment from the broker. Lowest latency, at the cost of durability (highest risk of data loss).
      • acks=1: the producer gets an acknowledgment once the leader has written the record to its local log, without waiting for acknowledgment from the followers. The record can be lost if the leader fails before the followers have replicated it.
      • acks=-1: the producer gets an acknowledgment only after all in-sync replicas (ISR) have received the record. Strongest guarantee against data loss.
      • batch.size=100 (.NET client)
      • send.buffer.bytes=100*1024
      • producer.type=async
      • compression.type=none
      • max.in.flight.requests.per.connection=3
      Note: min.insync.replicas>=2

      ACKs   Throughput   Latency   Durability
      0      High         Low       No guarantee
      1      Medium       Medium    Leader
      -1     Low          High      ISR
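The note about min.insync.replicas matters because acks=-1 only waits for the replicas that are currently in sync; if the ISR has shrunk to the leader alone, the guarantee is no stronger than acks=1 unless the broker enforces a minimum. Below is a small model of that rule, assuming the documented broker behavior of rejecting the write when the ISR is too small. The function `replicas_holding_record` is a hypothetical helper, not part of any Kafka client.

```python
def replicas_holding_record(acks, isr_size, min_insync_replicas):
    """Return how many replicas are known to hold a record at ack time.

    acks=0  -> 0 (the producer does not wait for any acknowledgment)
    acks=1  -> 1 (leader only)
    acks=-1 -> every replica currently in the ISR, provided the ISR is at
               least min.insync.replicas; otherwise the write is rejected.
    """
    if acks == 0:
        return 0
    if acks == 1:
        return 1
    if acks == -1:
        if isr_size < min_insync_replicas:
            raise RuntimeError("not enough in-sync replicas; write rejected")
        return isr_size
    raise ValueError("acks must be 0, 1, or -1")


healthy = replicas_holding_record(acks=-1, isr_size=3, min_insync_replicas=2)
```

With min.insync.replicas=1, the same call with isr_size=1 would succeed and return 1, which is why the slide recommends a value of at least 2.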
  • 12. Broker
      • More partitions = more concurrent processing = more memory = more I/O access = higher throughput = higher latency (the brokers have to distribute work across every partition). P.S. keep a single topic under 1024 partitions.
      • Replication factor: needs at least two brokers
      num.io.threads=8
      num.network.threads=3
      background.threads=10
      queued.max.requests=500
      socket.receive.buffer.bytes=102400
      socket.request.max.bytes=104857600
      socket.send.buffer.bytes=102400
      num.recovery.threads.per.data.dir=2
      log.retention.hours=24
      log.flush.interval.messages=10000
      log.flush.interval.ms=1000
      log.cleanup.policy=delete
      log.cleaner.enable=true
      log.cleaner.threads=1
      log.cleaner.backoff.ms=30000
      log.segment.bytes=1073741824
      replica.fetch.min.bytes=1
      replica.high.watermark.checkpoint.interval.ms=5000
      replica.fetch.wait.max.ms=500
      min.insync.replicas=2
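Several of the broker settings above are raw byte counts, which are easy to misread. A short helper, written here only for illustration (`human_bytes` is a hypothetical name), converts them to binary units:

```python
def human_bytes(n):
    """Format a byte count using binary units (KiB, MiB, GiB)."""
    for unit in ("B", "KiB", "MiB", "GiB"):
        if n < 1024 or unit == "GiB":
            return f"{n:g} {unit}"
        n /= 1024


# Values taken from the broker settings on this slide.
sizes = {
    "socket.send.buffer.bytes": human_bytes(102400),
    "socket.request.max.bytes": human_bytes(104857600),
    "log.segment.bytes": human_bytes(1073741824),
}
```

So the socket buffers are 100 KiB, the largest accepted request is 100 MiB, and each log segment rolls at 1 GiB.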
  • 13. Consumer
      • Need enough partitions to handle the messages from the producers
      • Maximum number of consumers = a multiple of the number of brokers (for better balance)
      • max.poll.records=5000
      • enable.auto.commit=true
      • auto.commit.interval.ms=5000
      • fetch.max.wait.ms=500
      • fetch.min.bytes=1
      • Keep a small batch size in our .NET client (so consumers receive data in real time)
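The need for enough partitions follows from how a consumer group works: each partition is read by exactly one consumer in the group, so consumers beyond the partition count sit idle. The sketch below uses a simplified round-robin spread to show this; it is not the assignment strategy of any particular client, and `assign` is a hypothetical helper.

```python
def assign(partitions, consumers):
    """Spread partition ids over the consumers of one group, round-robin."""
    assignment = {consumer: [] for consumer in consumers}
    for index, partition in enumerate(partitions):
        owner = consumers[index % len(consumers)]
        assignment[owner].append(partition)
    return assignment


# Four partitions shared by six consumers: two consumers get nothing.
result = assign([0, 1, 2, 3], ["c1", "c2", "c3", "c4", "c5", "c6"])
idle = [consumer for consumer, owned in result.items() if not owned]
```

Adding consumers only increases parallelism up to the number of partitions, which is why the partition count has to be sized for the expected producer load.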
  • 14. JVM
      • Avoid running out of memory
      • Avoid triggering GC too frequently
      -Xmx8g -Xms8g
      -XX:MetaspaceSize=96m
      -XX:+UseG1GC
      -XX:MaxGCPauseMillis=20
      -XX:MaxMetaspaceFreeRatio=80
      -XX:MinMetaspaceFreeRatio=50
      -XX:G1HeapRegionSize=16M
      -XX:InitiatingHeapOccupancyPercent=35

      -Xms: set the initial Java heap size
      -Xmx: set the maximum Java heap size
      +UseG1GC: enable the G1 garbage collector
      MaxGCPauseMillis: set the target maximum GC pause time (ms)
      MaxMetaspaceFreeRatio: set the maximum metaspace free ratio
      MinMetaspaceFreeRatio: set the minimum metaspace free ratio
      G1HeapRegionSize: set the size of each G1 heap region
      InitiatingHeapOccupancyPercent: heap occupancy threshold that starts a concurrent GC cycle
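As a rough worked example of what these flags imply, the arithmetic below derives the number of G1 regions and the occupancy at which a concurrent cycle would start. It assumes the threshold is measured against the whole heap, which can differ between JDK versions; the variable names are illustrative only.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2

heap_bytes = 8 * GIB       # -Xmx8g / -Xms8g
region_bytes = 16 * MIB    # -XX:G1HeapRegionSize=16M
ihop_percent = 35          # -XX:InitiatingHeapOccupancyPercent=35

# The heap is divided into equally sized regions.
region_count = heap_bytes // region_bytes

# Occupancy at which a concurrent marking cycle would begin,
# assuming the percentage applies to the entire heap.
ihop_bytes = heap_bytes * ihop_percent // 100
ihop_gib = round(ihop_bytes / GIB, 1)
```

With these settings the heap is split into 512 regions, and a concurrent cycle begins at about 2.8 GiB of occupancy.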
  • 15. Q & A
  • 16. Reference
      • https://kafka.apache.org/
      • https://github.com/apache/kafka
      • http://www.oracle.com/technetwork/articles/java/g1gc-1984535.html
      • https://docs.oracle.com/cd/E40972_01/doc.70/e40973/cnf_jvmgc.htm
      • RiCo’s blog