SignalFx
SignalFx
Why and how we wrote a Kafka consumer
Rajiv Kurian, Software Engineer
rajiv@signalfx.com
@rzidane360
Agenda
1. Why we wrote a Kafka consumer
2. Properties and limitations of modern hardware
3. Optimizations
4. Results
SignalFx
Why we wrote a Kafka consumer
• High resolution:
• Any mix of resolutions up to 1 sec
• Streaming analytics:
• Custom analytics pipelines at any scale that output in seconds
• Streaming dashboards update in seconds
• Multidimensional metrics:
• Dimensions allow arbitrary modeling, pivoting, filtering, and
grouping of both raw and derived (from analytics) metrics
interactively on streaming data
• E.g. 99th-percentile-of-latency-by-service-by-customer
SignalFx is built for monitoring modern infrastructure
• Designed to replace SimpleConsumer, not the 0.9 consumer
• Needed a non-blocking single threaded consumer
• Wanted it to be low overhead
• 100s of thousands of messages/second
• Sensitive to GC
• The Kafka 0.9 consumer wasn’t ready yet
Why write a new Kafka consumer
SignalFx
Kafka consumer - a brief introduction
SignalFx
[Diagram: one topic with 12 partitions (Topic:0 – Topic:11) spread across BROKER 1, BROKER 2, BROKER 3]
Brokers, topics and partitions
SignalFx
[Diagram: same partition/broker layout as above]
Metadata request and response
Client
Metadata Request
SignalFx
[Diagram: same partition/broker layout as above]
Metadata request and response
Client
Metadata Request
Metadata Response
SignalFx
[Diagram: same partition/broker layout as above]
Metadata request and response
Client
Metadata Request
Metadata Response
Partition Broker ID
0 1
1 2
…. ….
n 3
SignalFx
[Diagram: same partition/broker layout as above]
Offset request and response
Client
Partition offset
0 9024
1 1245
…. ….
n 11645
Partition Broker ID
0 1
1 2
…. ….
n 3
Offsets
(Consumer group/ external source)
SignalFx
[Diagram: same partition/broker layout as above]
Fetch request and response
Client
Fetch request
Partition offset
0 9024
1 1245
…. ….
n 11645
Partition Broker ID
0 1
1 2
…. ….
n 3
SignalFx
[Diagram: same partition/broker layout as above]
Fetch request and response
Client
Fetch response
Partition offset
0 9026
1 1247
…. ….
n 11649
Partition Broker ID
0 1
1 2
…. ….
n 3
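The flow above boils down to two small per-partition tables that the client keeps: partition → leader broker (from the metadata response) and partition → next offset to fetch (from the consumer group or an external source). A minimal sketch of that state, with illustrative names only — this is not the real client API:

import java.util.HashMap;
import java.util.Map;

// Sketch only: the two per-partition tables built from the metadata
// response and from the consumer group / external offset source.
final class ClientStateSketch {
    // Partition -> leader broker id (filled from the metadata response).
    final Map<Integer, Integer> partitionToBroker = new HashMap<>();
    // Partition -> next offset to fetch (consumer group or external source).
    final Map<Integer, Long> partitionToOffset = new HashMap<>();

    // After a fetch response, advance the partition's offset past the
    // messages just consumed so the next fetch request asks only for new data.
    void onMessagesConsumed(int partition, long nextOffsetToFetch) {
        partitionToOffset.put(partition, nextOffsetToFetch);
    }
}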
SignalFx
Properties and limitations
of modern hardware
SignalFx
[Diagram: two cores, each with private L1 (data + instruction) and L2 caches, a shared L3, and main memory]
Cache Lines
• Data is transferred between memory and cache in
blocks of fixed size, called cache lines (typically 64
bytes)
• The memory subsystem makes a few bets to help us:
• Temporal locality
• Spatial locality
• Prefetching
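A toy illustration of these effects, and of the diagrams that follow (not code from the client): summing a pointer-chasing list can miss the cache on every node, while summing a contiguous array streams through whole cache lines and lets spatial locality and prefetching do their job.

// Illustrative only: the same sum over a linked list vs. a contiguous array.
final class LocalitySketch {
    static final class Node {
        final int value;
        Node next;
        Node(int value) { this.value = value; }
    }

    static long sumList(Node head) {
        long sum = 0;
        for (Node n = head; n != null; n = n.next) {
            sum += n.value;   // each node may sit on a different cache line: up to one miss per node
        }
        return sum;
    }

    static long sumArray(int[] values) {
        long sum = 0;
        for (int v : values) {
            sum += v;         // sequential access: 16 ints per 64-byte cache line, prefetcher-friendly
        }
        return sum;
    }
}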
SignalFx
[Diagram: cache hierarchy again — summing a linked list, where each node can land on a different cache line and cost a miss]
SignalFx
[Diagram: cache hierarchy again — summing an array, where elements 1–8 share contiguous cache lines, so spatial locality and prefetching help]
Reference latency numbers for comparison
By Jeff Dean: http://research.google.com/people/jeff/
L1 Cache 0.5 ns
Branch mispredict 5 ns
L2 Cache 7 ns 14x L1 Cache
Mutex lock/unlock 25 ns
Main memory 100 ns 20x L2 Cache, 200x L1 Cache
Compress 1K bytes (Zippy) 3,000 ns
Send 1K bytes over 1Gbps 10,000 ns 0.01 ms
Read 4K randomly from SSD 150,000 ns 0.15 ms
Read 1MB sequentially from memory 250,000 ns 0.25 ms
Round trip within same DC 500,000 ns 0.5 ms
Read 1MB sequentially from SSD 1,000,000 ns 1 ms 4x memory
Disk seek 10,000,000 ns 10 ms 20x DC roundtrip
Read 1MB sequentially from disk 20,000,000 ns 20 ms 80x memory, 20x SSD
Send packet CA->Netherlands->CA 150,000,000 ns 150 ms
SignalFx
[Diagram: L1 and the core]
SignalFx
[Diagram: L2 and the core]
SignalFx
[Diagram: main memory and the core]
SignalFx
Optimizations
Optimization aims
• We are NOT aiming for more data/second
• Even a very inefficient implementation
will be bottlenecked by the network
• We are aiming to make the client get out of
the way
• The client is not the only thing running on
the system
• Leave all resources for the actual
application
Efficiency vs. raw speed
• We value efficiency more than raw speed
for the client
• Fewer cycles
• Less cache usage and fewer cache
misses
• Less memory?
• Efficiency for the client == raw speed for
the application
Efficiency from constraints
• No consumer group functionality needed
• A single topic
• Finite number of integer partitions
• Partition reassignment is rare and happens
during startup and shutdown
• We are in control of the code that consumes
the messages
SignalFx
Use cache conscious data structures
Use arrays and open addressing hash maps
• A single topic, fewer than 1024 partitions
• Instead of maps we can use arrays
• Or use primitive specialized open
addressing hash maps
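Under these constraints (one topic, integer partition ids below a small bound), the partition → offset table can be a plain long[] indexed by partition rather than a generic map. A minimal sketch under that assumption — the 1024 bound comes from the slide, everything else is illustrative:

import java.util.Arrays;

// Partition -> offset as a flat array: one load per lookup,
// no hashing, no boxing, no Entry objects.
final class OffsetTableSketch {
    static final int MAX_PARTITIONS = 1024;
    static final long NO_OFFSET = -1L;

    private final long[] partitionToOffset = new long[MAX_PARTITIONS];

    OffsetTableSketch() {
        Arrays.fill(partitionToOffset, NO_OFFSET);
    }

    long get(int partition) {
        return partitionToOffset[partition];
    }

    void set(int partition, long offset) {
        partitionToOffset[partition] = offset;
    }
}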
Topic:Partition -> Offset
Topic:Partition offset
Foo:0 9026
Foo:1 1247
…. ….
Foo:n 11649
Partition offset
0 9026
1 1247
…. ….
n 11649
Foo offsets as a plain array (index = partition):
9026
1247
…
11649
[Diagram: Entry* slots pointing to separately allocated (partition*, offset*) pairs scattered around the heap]
Hash map implemented as an array of lists of key* | value*
[Diagram: the same pointer-chasing hash map — every lookup walks a list of entries]
Dependable cache miss generator
Sparse array
[Diagram: one flat array of offsets, slots 0–13, indexed directly by partition]
SignalFx
In memory
Hash map path (entry, key, and value objects plus pointers): (4 * 2 + 4 + 8) + (4 + 4 + 8) + (4 + 8) + (8 + 8) = 64 bytes
Sparse array: 1024 * 8 + 4 + 8 = 8204 bytes
In cache, per lookup
Hash map: 4 cache lines touched, 4 * 64 = 256 bytes
Sparse array: 1 cache line touched, 1 * 64 = 64 bytes
Low memory and cache friendly data structures
• Queues built from integer arrays; a negative entry means the partition was lost
• Zero allocation hashed-wheel timer to close
stuck connections
• Open addressing hash maps
• BitSets coded on top of long arrays whenever a
set of partitions is required
• Can be traversed in O(num set bits)
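As a sketch of the last bullet: a partition set stored in a long[] and walked in O(number of set bits) with the lowest-set-bit trick — the same trick the fetch-request code later in the deck uses. Illustrative only:

import java.util.function.IntConsumer;

// Partition set as 16 long words (up to 1024 partitions),
// traversed in O(number of set bits).
final class PartitionBitSetSketch {
    private final long[] words = new long[1024 / 64];

    void add(int partition) {
        words[partition >>> 6] |= 1L << (partition & 63);
    }

    void forEach(IntConsumer visit) {
        for (int i = 0; i < words.length; i++) {
            long bits = words[i];
            while (bits != 0) {
                long lowest = bits & -bits;                            // isolate the lowest set bit
                visit.accept(i * 64 + Long.numberOfTrailingZeros(lowest));
                bits ^= lowest;                                        // clear it and continue
            }
        }
    }
}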
Applicability and benefit to Kafka consumer 0.9
• Benefits - medium
• Lots of hash map lookups
• Applicability - low
• Multiple topics - sparse arrays not a great
match
• Open addressing hash maps - preserve
most of the benefits
SignalFx
Create buffers once, reuse
Eliminate redundant work
• A single topic. Finite number of partitions:
• Topic and client string immutable
• The metadata request buffer can be created just once and
kept around forever
• Other requests can have their fixed part written out and only
write the variable part on each request
• Offset request
= fixed_part + per_partition_part
• Fetch request
= fixed_part + per_partition_part
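A simplified sketch of that idea — not the exact Kafka wire format, and all field values below are placeholders: the fixed part is written once into a reusable ByteBuffer, the absolute position of each partition's offset field is remembered, and each new request only patches those longs.

import java.nio.ByteBuffer;

// Reusable request buffer: fixed part written once, variable offsets patched in place.
final class ReusableFetchRequestSketch {
    private final ByteBuffer buffer;
    private final int[] offsetPositionForPartition;
    private final int requestEnd;

    ReusableFetchRequestSketch(int numPartitions) {
        buffer = ByteBuffer.allocateDirect(64 + 16 * numPartitions);
        offsetPositionForPartition = new int[numPartitions];
        // Fixed part, written exactly once (placeholder values, simplified layout).
        buffer.putShort((short) 1);      // api key
        buffer.putShort((short) 0);      // api version
        buffer.putInt(0);                // correlation id; client id string elided
        buffer.putInt(numPartitions);
        for (int p = 0; p < numPartitions; p++) {
            buffer.putInt(p);                                   // partition id
            offsetPositionForPartition[p] = buffer.position();  // where this partition's offset lives
            buffer.putLong(0L);                                 // offset, patched on every request
            buffer.putInt(1024);                                // max bytes for this partition
        }
        requestEnd = buffer.position();
    }

    // Per request only the offsets change; the rest of the buffer is reused as-is.
    ByteBuffer prepareForSend(long[] partitionToOffset) {
        for (int p = 0; p < partitionToOffset.length; p++) {
            buffer.putLong(offsetPositionForPartition[p], partitionToOffset[p]);
        }
        buffer.limit(requestEnd);
        buffer.position(0);
        return buffer;
    }
}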
SignalFx
SIZE API_KEY
API_VERSION CORRELATION_ID
CLIENT_ID_STRING REPLICA_ID
MAX_WAIT_TIME MIN_BYTES
NUM_TOPICS TOPIC_STRING
NUM_PARTITIONS
0 1266 1024
1 1164 1024
2 1900 1024
Fixed
Variable
FETCH REQUEST BUFFER
SignalFx
FETCH REQUEST BUFFER
SIZE API_KEY
API_VERSION CORRELATION_ID
CLIENT_ID_STRING REPLICA_ID
MAX_WAIT_TIME MIN_BYTES
NUM_TOPICS TOPIC_STRING
NUM_PARTITIONS
0 1266 1024
1 1164 1024
2 1900 1024
Index
1200
1216
1232
SignalFx
SIZE API_KEY
API_VERSION CORRELATION_ID
CLIENT_ID_STRING REPLICA_ID
MAX_WAIT_TIME MIN_BYTES
NUM_TOPICS TOPIC_STRING
NUM_PARTITIONS
Offsets
1289
1172
1990
0 1266 1024
1 1164 1024
2 1900 1024
Index
1200
1216
1232
FETCH REQUEST BUFFER
SignalFx
SIZE API_KEY
API_VERSION CORRELATION_ID
CLIENT_ID_STRING REPLICA_ID
MAX_WAIT_TIME MIN_BYTES
NUM_TOPICS TOPIC_STRING
NUM_PARTITIONS
Offsets
1289
1172
1990
0 1289 1024
1 1172 1024
2 1990 1024
Index
1200
1216
1232
FETCH REQUEST BUFFER
Code
private void setNewOffsetsForFetchRequest() {
final ByteBuffer buffer = this.fetchRequestBuffer;
// Iterate through the partitions assigned to this broker
// and write the offset directly on the buffer.
for (int i = 0; i < partitionAssignment.length; i++) {
// This loop runs in O(partitions assigned).
long bitSet = partitionAssignment[i];
while (bitSet != 0) {
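// bitSet & -bitSet isolates the lowest set bit; Long.bitCount(t - 1)
// turns that single bit into its index within this 64-bit word.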
final long t = bitSet & -bitSet;
final int partitionId = i * 64 + Long.bitCount(t - 1);
// The position in the buffer that points to the
// beginning of the offset for this partition.
final int bufferPositionForOffset = fetchRequestIndex[partitionId];
final long offset = partitionToOffset[partitionId];
// Write the offset directly.
buffer.putLong(bufferPositionForOffset, offset);
bitSet ^= t;
}
}
}
SignalFx
SIZE API_KEY
API_VERSION CORRELATION_ID
CLIENT_ID_STRING NUM_TOPICS
TOPIC_STRING
METADATA REQUEST BUFFER
Fixed
SignalFx
SIZE API_KEY
API_VERSION CORRELATION_ID
CLIENT_ID_STRING REPLICA_ID
NUM_TOPICS TOPIC_STRING
NUM_PARTITIONS
0 1 2 3 4 5
OFFSET REQUEST BUFFER
NUM_PARTITIONS_POSITION
Fixed
Applicability and benefit to Kafka consumer 0.9
• Benefits - high
• Reuse instead of allocating - temporal locality
• Streaming through 3 arrays - prefetching
• One fetch request per fetch response - common
• Metadata or offset requests - rare
• Applicability - high
• Internal detail so API doesn’t change
• Even for consumer groups, partition reassignment
and partition migration events are rare
SignalFx
Zero allocation response processing
Stream responses to application
• Pass each message to the application
when it is ready
• Consume messages synchronously
without a copy or allocation
• No deserialization required
• Benefits add up when processing 100s
of thousands of messages per second
Low level interface
public interface KafkaMessageHandler {
void handleMessage(ByteBuffer buffer, int position, int length);
}
public interface KafkaConsumer {
void poll(KafkaMessageHandler handler, long timeoutMs);
. . .
. . .
}
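A hedged usage example of the interface above: a handler that decodes fields in place with absolute gets, so no bytes are copied and nothing is allocated per message. The message layout assumed here (an 8-byte metric id followed by an 8-byte timestamp) is made up for the example; only the interfaces come from the slides.

import java.nio.ByteBuffer;

final class InPlaceHandler implements KafkaMessageHandler {
    private long messageCount;
    private long idChecksum;
    private long lastTimestamp;

    @Override
    public void handleMessage(ByteBuffer buffer, int position, int length) {
        // Absolute gets: the buffer's position is untouched and nothing is copied.
        long metricId = buffer.getLong(position);
        long timestamp = buffer.getLong(position + 8);
        // ... interpret the remaining (length - 16) payload bytes in place ...
        messageCount++;
        idChecksum ^= metricId;
        lastTimestamp = timestamp;
    }
}

// A typical poll loop would then look like:
//   KafkaMessageHandler handler = new InPlaceHandler();
//   while (running) {
//       consumer.poll(handler, 100 /* timeoutMs */);
//   }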
SignalFx
Partition Message 1 Message 2 Message .. Message n
1 … … … …
2 … … … …
3 … … … …
4 … … … …
Topic string, client string etc
FETCH RESPONSE PARSING
SignalFx
Partition Message 1 Message 2 Message .. Message n
1 … … … …
2 … … … …
3 … … … …
4 … … … …
Topic string, client string etc
FETCH RESPONSE PARSING
public interface KafkaMessageHandler {
void handleMessage(ByteBuffer buffer, int position, int length);
}
Applicability and benefit to Kafka consumer 0.9
• Benefits - very high
• Reuse response buffer, no allocations - temporal locality
• Data is processed right after being read from the socket -
temporal locality
• Streaming through a buffer - spatial locality + prefetching
• Combine with DirectByteBuffers for zero copy
• Applicability - low
• API too low level
• Integrity of internal buffers compromised by bugs in
application
• Maybe a low level “with great power comes great
responsibility” API
SignalFx
Some numbers
Caveats
• These are from running a very specific
workload similar to our application
• There are many Pareto-optimal choices
for a client. Ours is not better in any
way - it’s just tuned for our workload
• It can and will prove bad for other
workloads
Benchmark
• Single topic-partition
• Settings of fetch_max_wait, fetch_min_bytes,
max_bytes_per_partition were identical
• Only 5000 messages per second produced by
a single producer
• Each message is 23 bytes
• Warm up -> profile for 5 mins
• 5000/sec * 5 mins = 1.5 million
• Profiler = Java Mission Control
SignalFx
0.9 Consumer allocation profile : TLAB
SignalFx
SignalFx Consumer allocation profile : TLAB
SignalFx
0.9 Consumer code profile
SignalFx
SignalFx Consumer code profile
SignalFx
With 5,000 messages/second
Implementation CPU Allocation (TLAB)
0.9 consumer 6% 422.8 MB
SignalFx consumer 1.3% 217 KB
Improvement 4.6x 1944x
SignalFx
With 10,000 messages/second
Implementation CPU Allocation (TLAB)
0.9 consumer 6.122% 858 MB
SignalFx consumer 1.456% 400 KB
Improvement 4.2x 2145x
SignalFx
Thank You!
Rajiv Kurian
rajiv@signalfx.com
@rzidane360
WE’RE HIRING
jobs@signalfx.com
@SignalFx - signalfx.com/careers
SignalFx
Q&A
SignalFx Kafka Consumer Optimization

  • 2. SignalFx Why and how we wrote a Kafka consumer Rajiv Kurian, Software Engineer rajiv@signalfx.com @rzidane360
  • 3. Agenda 1. Why we wrote a Kafka consumer 2. Properties and limitations of modern hardware 3. Optimizations 4. Results
  • 4. SignalFx Why we wrote a Kafka consumer
  • 5. • High resolution: • Any mix of resolutions up to 1 sec • Streaming analytics: • Custom analytics pipelines at any scale that output in seconds • Streaming dashboards update in seconds • Multidimensional metrics: • Dimensions allow arbitrary modeling, pivoting, filtering, and grouping of both raw and derived (from analytics) metrics interactively on streaming data • E.g. 99th-percentile-of-latency-by-service-by-customer SignalFx is built for monitoring modern infrastructure
  • 6. • Designed to replace SimpleConsumer not the 0.9 consumer • Needed a non-blocking single threaded consumer • Wanted it to be low over head • 100s of thousands of messages/second • Sensitive to GC • The Kafka 0.9 consumer wasn’t ready yet Why write a new Kafka consumer
  • 7. SignalFx Kafka consumer - a brief introduction
  • 8. SignalFx Topic : 0 Topic : 3 Topic : 6 Topic : 9 Topic : 2 Topic : 5 Topic : 8 Topic : 11 Topic : 1 Topic : 4 Topic : 7 Topic : 10 BROKER 1 BROKER 2 BROKER 3 Brokers, topics and partitions
  • 9. SignalFx Topic : 0 Topic : 3 Topic : 6 Topic : 9 Topic : 2 Topic : 5 Topic : 8 Topic : 11 Topic : 1 Topic : 4 Topic : 7 Topic : 10 BROKER 1 BROKER 2 BROKER 3 Metadata request and response Client Metadata Request
  • 10. SignalFx Topic : 0 Topic : 3 Topic : 6 Topic : 9 Topic : 2 Topic : 5 Topic : 8 Topic : 11 Topic : 1 Topic : 4 Topic : 7 Topic : 10 BROKER 1 BROKER 2 BROKER 3 Metadata request and response Client Metadata Request Metadata Response
  • 11. SignalFx Topic : 0 Topic : 3 Topic : 6 Topic : 9 Topic : 2 Topic : 5 Topic : 8 Topic : 11 Topic : 1 Topic : 4 Topic : 7 Topic : 10 BROKER 1 BROKER 2 BROKER 3 Metadata request and response Client Metadata Request Metadata Response Partition Broker ID 0 1 1 2 …. …. n 3
  • 12. SignalFx Topic : 0 Topic : 3 Topic : 6 Topic : 9 Topic : 2 Topic : 5 Topic : 8 Topic : 11 Topic : 1 Topic : 4 Topic : 7 Topic : 10 BROKER 1 BROKER 2 BROKER 3 Offset request and response Client Partition offset 0 9024 1 1245 …. …. n 11645 Partition Broker ID 0 1 1 2 …. …. n 3 Offsets (Consumer group/ external source)
  • 13. SignalFx Topic : 0 Topic : 3 Topic : 6 Topic : 9 Topic : 2 Topic : 5 Topic : 8 Topic : 11 Topic : 1 Topic : 4 Topic : 7 Topic : 10 BROKER 1 BROKER 2 BROKER 3 Fetch request and response Client Fetch request Partition offset 0 9024 1 1245 …. …. n 11645 Partition Broker ID 0 1 1 2 …. …. n 3
  • 14. SignalFx Topic : 0 Topic : 3 Topic : 6 Topic : 9 Topic : 2 Topic : 5 Topic : 8 Topic : 11 Topic : 1 Topic : 4 Topic : 7 Topic : 10 BROKER 1 BROKER 2 BROKER 3 Fetch request and response Client Fetch response Partition offset 0 9026 1 1247 …. …. n 11649 Partition Broker ID 0 1 1 2 …. …. n 3
  • 16. SignalFx Main memory L1 D L1 I L3 L1 D L1 I L2L2 Core 1 Core 2 1
  • 17. Cache Lines • Data is transferred between memory and cache in blocks of fixed size, called cache lines (typically 64 bytes) • The memory subsystem makes a few bets to help us: • Temporal locality • Spatial locality • Prefetching
  • 18. SignalFx Main memory L1 D L1 I L3 L1 D L1 I L2L2 Core 1 Core 2 1 1 1 2 1 2 2 2 1 2
  • 19. SignalFx L1 D Main memory L1 D L1 I L3 L1 I L2L2 Core 1 Core 2 1 2 3 4 5 6 7 8 1 2 3 4 5 6 7 8 1 2 3 4 5 6 7 8 2 1 2 3 4 5 6 7 8 1 4 3 6 8 75
  • 20. Reference latency numbers for comparison By Jeff Dean: http://research.google.com/people/jeff/ L1 Cache 0.5ns Branch mispredict 5 ns L2 Cache 7 ns 14x L1 Cache Mutex lock/unlock 25 ns Main memory 100 ns 20x L2 Cache, 200x L1 Cache Compress 1K bytes (Zippy) 3,000 ns Send 1K bytes over 1Gbps 10,000 ns 0.01 ms Read 4K randomly from SSD 150,000 ns 0.15 ms Read 1MB sequentially from memory 250,000 ns 0.25 ms Round trip within same DC 500,000 ns 0.5 ms Read 1MB sequentially from SSD 1,000,000 ns 1 ms 4x memory Disk seek 10,000,000 ns 10 ms 20x DC roundtrip Read 1MB sequentially from disk 20,000,000 ns 20 ms 80x memory, 20x SSD Send packet CA->Netherlands->CA 150,000,000 ns 150 ms
  • 25. Optimization aims • We are NOT aiming for more data/second • Even a very inefficient implementation will be bottlenecked by the network • We are aiming to make the client get out of the way • The client is not the only thing running on the system • Leave all resources for the actual application
  • 26. Efficiency VS raw speed • We value efficiency more than raw speed for the client • Fewer cycles • Less cache usage and fewer cache misses • Less memory? • Efficiency for the client == raw speed for the application
  • 27. Efficiency from constraints • No consumer group functionality needed • A single topic • Finite number of integer partitions • Partition reassignment is rare and happens during startup and shutdown • We are in control of the code that consumes the messages
  • 28. SignalFx Use cache conscious data structures
  • 29. Use arrays and open addressing hash maps • Single topic. Less than 1024 partitions • Instead of maps we can use arrays • Or use primitive specialized open addressing hash maps
  • 30. Topic:Partition -> Offset Topic:Partition offset Foo:0 9026 Foo:1 1247 …. …. Foo:n 11649 Partition offset 0 9026 1 1247 …. …. n 11649 Foo Foo Offsets 9026 1247 11649
  • 31. offsetpartition partition* offset*partition* offset* Entry* Entry* Entry* Entry* 1 2 3 4 Hash map implemented as an array of lists of key* | value*
  • 32. offsetpartition partition* offset*partition* offset* Dependable cache miss generator List List List List
  • 33. Sparse array offset 0 offset 1 offset 2 offset 3 offset 4 offset 5 offset 6 offset 7 offset 8 offset 9 offset 10 offset 11 offset 12 offset 13 1
  • 34. SignalFx In memory 1160 partition* offset* Entry* Entry* Offsets 116 partition* offset* 0 116 116 Entry* Entry* In cache (4 * 2 + 4 + 8) + (4 + 4 + 8) + (4 + 8) + (8 + 8) = 64 bytes 1024 * 8 + 4 + 8 = 8204 bytes 4 * 64 = 256 bytes 1 * 64 = 64 bytes 1 2 3 4 1
  • 35. Low memory and cache friendly data structures • Queues built from integer arrays. Negative -> partition lost • Zero allocation hashed-wheel timer to close stuck connections • Open addressing hash maps • BitSets coded on top of long arrays whenever a set of partitions is required • Can be traversed in O(num set bits)
  • 36. Applicability and benefit to Kafka consumer 0.9 • Benefits - medium • Lots of hash map look ups • Applicability - low • Multiple topics - sparse arrays not a great match • Open addressing hash maps - preserve most of the benefits
  • 38. Eliminate redundant work • A single topic. Finite number of partitions: • Topic and client string immutable • The metadata request buffer can be created just once and kept around forever • Other requests can have their fixed part written out and only write the variable part on each request • Offset request = fixed_part + per_partition_part • Fetch request create = fixed_part + per_partition_part
  • 39. SignalFx SIZE API_KEY API_VERSION CORRELATION_ID CLIEND_ID_STRING REPLICA_ID MAX_WAIT_TIME MIN_BYTES NUM_TOPICS TOPIC_STRING NUM_PARTITIONS 0 1266 1024 1 1164 1024 2 1900 1024 Fixed Variable FETCH REQUEST BUFFER
  • 40. SignalFx FETCH REQUEST BUFFER SIZE API_KEY API_VERSION CORRELATION_ID CLIEND_ID_STRING REPLICA_ID MAX_WAIT_TIME MIN_BYTES NUM_TOPICS TOPIC_STRING NUM_PARTITIONS 0 1266 1024 1 1164 1024 2 1900 1024 Index 1200 1216 1232
  • 41. SignalFx SIZE API_KEY API_VERSION CORRELATION_ID CLIEND_ID_STRING REPLICA_ID MAX_WAIT_TIME MIN_BYTES NUM_TOPICS TOPIC_STRING NUM_PARTITIONS Offsets 1289 1172 1990 0 1266 1024 1 1164 1024 2 1900 1024 Index 1200 1216 1232 FETCH REQUEST BUFFER
  • 42. SignalFx SIZE API_KEY API_VERSION CORRELATION_ID CLIEND_ID_STRING REPLICA_ID MAX_WAIT_TIME MIN_BYTES NUM_TOPICS TOPIC_STRING NUM_PARTITIONS Offsets 1289 1172 1990 0 1289 1024 1 1172 1024 2 1990 1024 Index 1200 1216 1232 FETCH REQUEST BUFFER
  • 43. Code private void setNewOffsetsForFetchRequest() { final ByteBuffer buffer = this.fetchRequestBuffer; // Iterate through the partitions assigned to this broker // and write the offset directly on the buffer. for (int i = 0; i < partitionAssignment.length; i++) { // This loop runs in O(partitions assigned). long bitSet = partitionAssignment[i]; while (bitSet != 0) { final long t = bitSet & -bitSet; final int partitionId = i * 64 + Long.bitCount(t - 1); // The position in the buffer that points to the // beginning of the offset for this partition. final int bufferPositionForOffset = fetchRequestIndex[partitionId]; final long offset = partitionToOffset[partitionId]; // Write the offset directly. buffer.putLong(bufferPositionForOffset, offset); bitSet ^= t; } } }
  • 44. SignalFx SIZE API_KEY API_VERSION CORRELATION_ID CLIEND_ID_STRING NUM_TOPICS TOPIC_STRING METADATA REQUEST BUFFER Fixed
  • 45. SignalFx SIZE API_KEY API_VERSION CORRELATION_ID CLIEND_ID_STRING REPLICA_ID NUM_TOPICS TOPIC_STRING NUM_PARTITIONS 0 1 2 3 4 5 OFFSET REQUEST BUFFER NUM_PARTITIONS_POSITION Fixed
  • 46. Applicability and benefit to Kafka consumer 0.9 • Benefits - high • Reuse instead of allocating - temporal locality • Steaming through 3 arrays - prefetching • One fetch request per fetch response - common • Metadata or offset requests - rare • Applicability - high • Internal detail so API doesn’t change • Even for consumer groups, partition reassignment and partition migration events are rare
  • 48. Stream responses to application • Pass each message to the application when it is ready • Consume messages synchronously without a copy or allocation • No deserialization required • Benefits add up when processing 100s of thousands of messages per second
  • 49. Low level interface public interface KafkaMessageHandler { void handleMessage(ByteBuffer buffer, int position, int length); } public interface KafkaConsumer { void poll(KafkaMessageHandler handler, long timeoutMs); . . . . . . }
  • 50. SignalFx Partition Message 1 Message 2 Message .. Message n 1 … … … … 2 … … … … 3 … … … … 4 … … … … Topic string, client string etc FETCH RESPONSE PARSING
  • 51. SignalFx Partition Message 1 Message 2 Message .. Message n 1 … … … … 2 … … … … 3 … … … … 4 … … … … Topic string, client string etc FETCH RESPONSE PARSING public interface KafkaMessageHandler { void handleMessage(ByteBuffer buffer, int position, int length); }
  • 52. SignalFx Partition Message 1 Message 2 Message .. Message n 1 … … … … 2 … … … … 3 … … … … 4 … … … … Topic string, client string etc FETCH RESPONSE PARSING public interface KafkaMessageHandler { void handleMessage(ByteBuffer buffer, int position, int length); }
  • 53. SignalFx Partition Message 1 Message 2 Message .. Message n 1 … … … … 2 … … … … 3 … … … … 4 … … … … Topic string, client string etc FETCH RESPONSE PARSING public interface KafkaMessageHandler { void handleMessage(ByteBuffer buffer, int position, int length); }
  • 54. Applicability and benefit to Kafka consumer 0.9 • Benefits - very high • Reuse response buffer, no allocations - temporal locality • Data is processed right after being read from the socket - temporal locality • Streaming through a buffer - spatial locality + prefetching • Combine with DirectByteBuffers for zero copy • Applicability - low • API too low level • Integrity of internal buffers compromised by bugs in application • Maybe a low level “with great power comes great responsibility” API
  • 56. Caveats
    • These are from running a very specific workload similar to our application
    • There are many Pareto-optimal choices for a client. Ours is not better in any way - it’s just tuned for our workload
    • It can and will prove bad for other workloads
  • 57. Benchmark
    • Single topic-partition
    • Settings of fetch_max_wait, fetch_min_bytes, max_bytes_per_partition were identical
    • Only 5000 messages per second produced by a single producer
    • Each message is 23 bytes
    • Warm up -> profile for 5 mins
      • 5000/sec * 5 mins = 1.5 million messages
    • Profiler = Java Mission Control
  • 62. SignalFx With 5,000 messages/second
    Implementation     | CPU   | TLAB Allocation
    0.9 consumer       | 6%    | 422.8 MB
    SignalFx consumer  | 1.3%  | 217 KB
    Improvement        | 4.6x  | 1944x
  • 63. SignalFx With 10,000 messages/second
    Implementation     | CPU     | TLAB Allocation
    0.9 consumer       | 6.122%  | 858 MB
    SignalFx consumer  | 1.456%  | 400 KB
    Improvement        | 4.2x    | 2145x
  • 64. SignalFx Thank You! Rajiv Kurian rajiv@signalfx.com @rzidane360 WE’RE HIRING jobs@signalfx.com @SignalFx - signalfx.com/careers

Editor's Notes

  1. A Kafka cluster has multiple brokers. Each broker is a process of its own with a unique id. The unit of serializability in Kafka is a partition. Each partition has all its messages ordered. I like to think of a topic as a group of partitions. A partition has a statically assigned leader. From the POV of regular clients all read/write operations must go through the leader.
  2. So a client needs to know the mapping of topic-partitions to brokers. This mapping can change dynamically. A client begins by sending a metadata request to know this mapping. A metadata request can be sent to any broker in the cluster.
  3. The broker then replies with a metadata response.
  4. So the client can now form a map of partitions to brokers.
  5. Next the client needs to build a table of partition -> next offset to consume. It can get it from the consumer group functionality or some other external source.
  6. Once this is built it can send fetch requests for actual data.
  7. As long as there is actual data to consume and no errors it gets back a fetch response.
  8. Data is transferred between memory and cache in blocks of fixed size, called cache lines (typically 64 bytes). If you need a single byte, 63 others are coming in for the ride and paying the full tax. So you might as well use these bytes. When the processor needs to read or write a location in main memory, it first checks for a corresponding entry in the cache. In the case of: 1. a cache hit, the processor immediately reads or writes the data in the cache line 2. a cache miss, the cache allocates a new entry and copies in data from main memory, then the request (read or write) is fulfilled from the contents of the cache
  9. Data is transferred between memory and cache in blocks of fixed size, called cache lines (typically 64 bytes). If you need a single byte, 63 others are coming in for the ride and paying the full tax. So you might as well use these bytes.
  10. An application summing numbers in nodes of a linked list might take one cache miss per node.
  11. Spatial locality and prefetching help a lot when summing an array on the other hand. The compiler is also able to write better vectorized code if your layout looks like this.
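To make notes 10 and 11 concrete, here is a minimal illustrative sketch (mine, not from the talk) of the two access patterns: summing a linked list chases one pointer per node and can pay roughly one cache miss per element, while summing an array streams sequentially, so spatial locality, prefetching and (potentially) vectorization do the work.

static final class Node {
  int value;
  Node next;
}

static long sumList(Node head) {
  long sum = 0;
  // Each iteration does a dependent load from a possibly random memory location.
  for (Node node = head; node != null; node = node.next) {
    sum += node.value;
  }
  return sum;
}

static long sumArray(int[] values) {
  long sum = 0;
  // Sequential access: the prefetcher can stay ahead and the loop can vectorize.
  for (int value : values) {
    sum += value;
  }
  return sum;
}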
  12. We really really care about cache usage and cache misses. We don’t care about memory as much. So efficiency for the client means more resources for the application which means a faster application.
  13. Almost all our optimizations are based on constraints that come from our use of the consumer. So, many of them are not directly applicable to generic Kafka clients which need to work well under various scenarios. We need no consumer group functionality. We manage partitions and offsets outside of Kafka. This makes our client super simple. A single topic. Our applications mostly consume a topic. We have a finite small number of partitions. Usually <= 1024. Partition reassignment is rare. I would imagine that this is true for most applications. Control of the entire pipeline means we can make some assumptions that a generic client cannot. End to end principle.
  14. Now the interesting part.
  15. Since we have a single topic, all partitions implicitly belong to that topic. So we don’t need a concept of topic-partition. We only have partitions. Since we don’t need topic-partition objects we can store all per partition data in arrays with the array index = partition number.
  16. It is important to acknowledge that this is a tradeoff. Like we said before, we really care about cache space and cache misses. We are ready to trade off using extra memory to reduce our cache usage.

  Here is an example: let’s imagine that we have a java util hash map of partition to offset. We’ve already shown that we can have multiple cache misses to do an offset get or put. Now let’s imagine that we have a single partition 0 with an offset 116 stored in this map. How much memory does this use? We’ll be generous and assume that headers are only 8 bytes and references are only 4 bytes. So let’s assume that the entry array was preallocated for 2 entries. There is an 8 byte header, a 4 byte length and two 4 byte references. That’s 20 bytes. Similarly the actual entry is itself 16 bytes, the boxed long is 16 bytes and the boxed integer is 12 bytes. So in spite of all the references and indirection it only uses 64 bytes of memory. On the other hand let’s assume that our sparse array has been preallocated for 1024 partitions. So it has a 4 byte length, an 8 byte header and 1024 8 byte entries, a total of 8204 bytes, which is around 8 KB. This is a lot more than 64 bytes and kind of wasteful.

  Now let’s look at how much cache is used by each solution. Each cache line is 64 bytes, so even if you want a single byte, 63 unrelated bytes might come along for the ride. Now let’s look at the java hash map again. We first need to fetch the right entry - that’s one cache line. The other entry comes along for the ride and possibly the length and header. So that’s 64 bytes already. Now the actual entry is on another cache line. That is another cache line used up. Now we need to look at the contents of the boxed partition. That’s another random memory location so a new cache line. Finally we fetch the offset itself and that’s another cache line. So it’s 4 cache lines and hence 256 bytes of cache used up through a simple get request. Now let’s look at the sparse offset array. We know where to fetch it from, so with a single cache fetch we get the offset. It comes with potentially 7 other offsets, none of which might be useful, but it’s still a single cache line. So we use only 64 bytes!

  This example is a bit counter-intuitive. It goes to show that a data structure using only 64 bytes of memory can actually use many more times that memory in cache, and a data structure using 8 KB of memory might only use a single cache line. This is a bit like virtual memory vs physical memory. You can use a lot of virtual memory but use little physical memory and come out ahead. In our example physical memory is abundant (we have gigabytes of it). Cache memory is very limited - we only have around 32 KB of L1 cache, for example - so it’s much more precious than physical memory. This also shows how we are ready to make trade offs. Sparse arrays can take more memory but have a pretty well guaranteed worst-case cache usage and cache miss count. A small sketch of the two representations follows below.
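A minimal sketch (mine, not from the talk) of the two representations the note above compares; the numbers (partition 0, offset 116, 1024 partitions) are simply the ones used in the example.

import java.util.HashMap;
import java.util.Map;

public final class OffsetStoreComparison {
  public static void main(String[] args) {
    // Boxed map: a get() chases the entry array, the entry, the boxed key and
    // the boxed value - several dependent loads on separate cache lines.
    final Map<Integer, Long> offsetsByPartition = new HashMap<>();
    offsetsByPartition.put(0, 116L);
    final long fromMap = offsetsByPartition.get(0);

    // Sparse primitive array: the partition id is the index, so a single load
    // returns the offset (plus 7 neighbouring offsets on the same cache line).
    final long[] partitionToOffset = new long[1024];
    partitionToOffset[0] = 116L;
    final long fromArray = partitionToOffset[0];

    System.out.println(fromMap + " " + fromArray);
  }
}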
  17. We talked about the main data structures. Our other data structures especially for our state machine implementation are all designed to be zero allocation in the steady state and very cache friendly. Even our hashed-wheel timer is made of primitive arrays with very few indirections.
  18. Since we service a single topic per client, we can stamp out the client id and topic id bits and never change them.
  19. Since this variable sized portion rarely changes, we can afford to create an index to it. So we have an index of a partition to its position within the fetch request ByteBuffer.
  20. So let’s imagine that we sent this particular fetch request with partitions 0, 1 and 2. We have a response and the offsets have been advanced as shown by the offsets table. Now to create the next fetch request, we just read the new offsets and use the index to write them directly on the old buffer.
  21. And we are ready to send this buffer. We avoided all the work required to create a buffer, write out the fixed size fields etc. It’s just writing a few integers to locations in memory.
  22. This is how the code looks. There is a bit of noise in the code because we are iterating a bit set representing the partition assignment. But otherwise the code is simple - fetch the position within our request buffer for this partition. Get the next offset to fetch. Write this offset at the right position.
  23. The metadata request is just frozen after consumer creation.
  24. For the offset request for example we can store a pointer to the num partitions part of the request. So when we need to send a new offset request we can directly seek there and write out the partition bits.
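A minimal sketch of what note 24 describes, with hypothetical names (offsetRequestBuffer and numPartitionsPosition are my own; the talk does not show this code). The frozen header bytes are left untouched and only the variable partition section is rewritten before the request is sent.

private void rewriteOffsetRequestPartitions(final ByteBuffer offsetRequestBuffer,
                                            final int numPartitionsPosition,
                                            final int[] partitionIds) {
  // Seek straight past the frozen header to the saved NUM_PARTITIONS position.
  offsetRequestBuffer.position(numPartitionsPosition);
  offsetRequestBuffer.putInt(partitionIds.length);
  for (final int partitionId : partitionIds) {
    // Write each partition id; any fixed per-partition fields the wire format
    // requires would be stamped out in the same pass.
    offsetRequestBuffer.putInt(partitionId);
  }
}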
  25. We don’t use JSON, XML, Thrift, ProtocolBuffers etc for our messages. Our messages do not need to be deserialized before consumption. They can be consumed directly just like Kafka’s internal messages can be. There is no POJO created from a serialized message. Instead we can wrap the buffer in a flyweight and consume the fields of our messages by doing reads from the underlying buffer. So we don’t need any copies or any allocation for steady state processing.
  26. The interface however is low level. Any handler of a message is fed with a buffer and a position and length within that buffer that represents the message. We could also alternatively set a position and limit on the buffer and send it to the application. The poll call takes such a handler and feeds it with messages.
  27. So let’s imagine that this is a response from Kafka. There are a bunch of fixed size bits on the top that we can skip. The real payload is a message set per partition.
  28. We begin by going to the first message, ensuring there are no errors and then just passing the pointer and length to the handler. It synchronously consumes it making copies if necessary and then returns back to the parsing code.
  29. We then consume the second message.
  30. And the third and so on.
  31. Benefits are huge. Zero copy and zero allocation in the steady path. Since we are not creating a new ByteBuffer every time - DirectByteBuffers become viable. So we elide the copy involved in reading from the socket into HeapByteBuffers. Sadly the applicability of this optimization is low. We are in control of our buffers and their lifetime so it is easy for us to avoid a copy. It is perhaps possible to create a very low level api that is not the default. I’ve not had much luck pushing this agenda in the past :)
  32. The Kafka client allocates about 423 MB for 5000 * 300 = 1.5 million messages. That’s 86.56% of all allocations. A sizeable portion of that is in fetch response parsing. A lot of that is ByteBuffer slicing, which our client does not do at all. We talked about a possible but dangerous way to get rid of this entirely. 1.76% is in the selector. About 9.27% is in cluster init. I am not sure why that’s so much.
  33. We allocate 218 KB overall to process 5000 * 300 = 1.5 million messages. The consumer does no allocations of its own. There are allocations done by the java NIO stack but they don’t show up in the profile. Selectors allocate, and we plan to use an allocation-free Selector like the one the Netty project uses.
  34. CPU used was 6.6%. 91% of that was the 0.9 consumer, so about 6%. 12.% was spent on checksum math and 67% on handling fetch responses - we talked about a way to make this very fast. Some 6% was in metadata - not sure why.
  35. CPU was 2.63%. The client uses about 50% of that, so 1.31%. 16.67% of that is spent in the select call. So the client code accounts for 33.33% of 2.63%, which is 0.88% CPU.
  36. Similar story for 10,000 messages/second: 4x odd for CPU and a lot more for allocations.