This document discusses Kafka deployment at online advertising company Criteo. Some key points:
1. Criteo uses Kafka to process up to 10 million messages per second and 180 TB of data per day across 13 Kafka clusters spanning multiple datacenters.
2. They size partitions at roughly 1 GB/partition/hour with 72-hour retention, i.e. about 72 GB retained per partition, which has led to topics with more than 1,300 partitions.
3. Criteo developed an in-house, open-source C# Kafka client optimized for high throughput, with the ability to blacklist partitions when needed. They are upgrading it to support new Kafka features such as idempotent producers and transactions.
4. Lag is the key client-facing metric: Criteo bases SLAs on it and uses watermark messages to measure it even when no messages are sent or when partitions are blacklisted, newly added, or offline.
Some Figures
• Up to 10 million msgs/s (500 billion msgs/day)
• 180 TB/day (compressed)
• Around 200 brokers
• Kafka in production for 4 years
The Infrastructure
• Multiple datacenters
• Bare-metal servers
• Servers managed by Chef
• User applications running on Mesos or YARN
• 13 Kafka clusters
Partition size
• How do we define the number of partitions?
• 1 GB/partition/hour
• 72 h retention
• We try to keep 72 GB per partition (see the sizing sketch below)
• No key, so no problem when increasing partitions
• We have topics with more than 1,300 partitions
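A quick back-of-the-envelope version of this sizing rule (the numbers and names are illustrative, not Criteo's actual tooling):

    using System;

    class PartitionSizing
    {
        // Rule from the slide: ~1 GB/partition/hour with 72 h retention,
        // i.e. ~72 GB retained per partition.
        const double GbPerPartitionPerHour = 1.0;

        // peakGbPerHour: the topic's peak ingest rate (compressed).
        static int PartitionsFor(double peakGbPerHour) =>
            (int)Math.Ceiling(peakGbPerHour / GbPerPartitionPerHour);

        static void Main()
        {
            // A topic ingesting 1,300 GB/hour at peak needs ~1,300
            // partitions, matching the figure above.
            Console.WriteLine(PartitionsFor(1300.0)); // 1300
        }
    }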
In-house C# Kafka Client
• First C# implementation
• Open source (https://github.com/criteo/kafka-sharp/)
• Built for high throughput
• Ability to blacklist partitions (see the sketch below)
  • It discards messages if needed
• Our use:
  • No key partitioning
  • No per-partition ordering guarantee
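The real API lives in the linked repository; the sketch below only illustrates the two behaviours named above, keyless round-robin partitioning plus a partition blacklist that discards messages when nothing is writable, with all names invented for the example:

    using System.Collections.Generic;

    // Illustrative only: round-robin over non-blacklisted partitions,
    // discarding the message when every partition is blacklisted.
    class BlacklistingPartitioner
    {
        private readonly int _partitionCount;
        private readonly HashSet<int> _blacklist = new HashSet<int>();
        private int _next;

        public BlacklistingPartitioner(int partitionCount) =>
            _partitionCount = partitionCount;

        public void Blacklist(int partition) => _blacklist.Add(partition);
        public void Unblacklist(int partition) => _blacklist.Remove(partition);

        // Returns the target partition, or null to signal "discard".
        public int? NextPartition()
        {
            for (int i = 0; i < _partitionCount; i++)
            {
                int candidate = _next;
                _next = (_next + 1) % _partitionCount;
                if (!_blacklist.Contains(candidate))
                    return candidate;
            }
            return null; // every partition blacklisted: drop, don't block
        }
    }

Rerouting a message to a different partition after a blacklist event is one reason the per-partition ordering guarantee is given up.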
Trade-off
Cons
• Costly to maintain
• Difficult to keep up to date
Pros
• Highly customizable
• Optimized for our use case
• Full control during the migration
SLA based on lag
• Lag is our main metric for the clients.
• We must be able to measure the lag in all conditions (see the sketch below):
  • No messages sent
  • Blacklisted partitions
  • New partitions added
  • Offline partitions
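A minimal sketch of why plain offset arithmetic is not enough here (the inputs are hypothetical stand-ins for metadata lookups, not Criteo's tooling): per-partition lag is just the log-end offset minus the committed offset, and that number is blind in exactly the conditions listed above.

    using System;

    class OffsetLag
    {
        // null models an offline or blacklisted partition, where the
        // offsets cannot be fetched at all.
        static long? PartitionLag(long? logEndOffset, long? committedOffset)
        {
            if (logEndOffset is null || committedOffset is null)
                return null; // offset-based lag is undefined here
            return Math.Max(0, logEndOffset.Value - committedOffset.Value);
        }

        static void Main()
        {
            // On an idle topic both offsets stop moving, so lag reads 0
            // whether the consumer is healthy or dead; offsets alone cannot
            // tell the difference, which motivates the watermarks below.
            Console.WriteLine(PartitionLag(100, 100)); // 0
            Console.WriteLine(PartitionLag(null, 42)); // blank: undefined
        }
    }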
Watermarks
• Special messages sent to each partition.
• They contain a monotonic timestamp.
• They provide a clock for the stream of messages.
• If a message has a timestamp lower than the previous timestamp, it is late (see the sketch below).
[Figure: streams for partitions 1, 2 and 4 carrying interleaved watermark timestamps, ordered from new to old]
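A minimal sketch of the watermark idea, with all types invented for the example and assuming the producer injects a timestamped watermark into every partition at a fixed interval:

    using System;
    using System.Collections.Generic;

    class WatermarkTracker
    {
        // Highest watermark timestamp seen so far, per partition.
        private readonly Dictionary<int, long> _highestTs =
            new Dictionary<int, long>();

        // Returns true if the message is late: its timestamp is lower than
        // the highest timestamp already seen on that partition.
        public bool Observe(int partition, long timestampMs)
        {
            if (_highestTs.TryGetValue(partition, out long highest) &&
                timestampMs < highest)
                return true; // late message
            _highestTs[partition] = timestampMs;
            return false;
        }

        // Time-based lag: how far the partition's clock trails wall-clock
        // time. Meaningful even when no business messages flow, because
        // watermarks keep arriving on every partition.
        public TimeSpan Lag(int partition, long nowMs) =>
            TimeSpan.FromMilliseconds(
                nowMs - _highestTs.GetValueOrDefault(partition, nowMs));
    }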
Front and Application clusters
• Producers/consumers can overload the cluster.
• Overloaded brokers may lead to data loss.
• Front cluster: receives all data, which goes to HDFS.
• Application clusters: streaming use cases.
[Diagram: Online Service → Front cluster → HDFS, with Application clusters fed from the Front cluster]
Replication
• Kafka Connect application running on Mesos.
• Custom connector.
• Writes offsets on the destination (see the sketch below).
• Replication within or across datacenters.
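Storing the source offset on the destination is what makes the replicator restartable without a separate offset store; a conceptual sketch, with every name hypothetical rather than taken from the actual connector:

    using System.Collections.Generic;

    record SourceRecord(long Offset, byte[] Value);

    // Hypothetical destination API: appends a record together with its
    // source offset and reports the last offset it has stored.
    interface IDestination
    {
        long LastStoredSourceOffset { get; }
        void Append(long sourceOffset, byte[] value);
    }

    class Replicator
    {
        // On restart, resume from the destination's own view of progress;
        // records at or below that offset were already replicated, so
        // duplicates are filtered out by offset.
        public static void Replicate(IEnumerable<SourceRecord> source,
                                     IDestination dest)
        {
            long resumeFrom = dest.LastStoredSourceOffset;
            foreach (var rec in source)
            {
                if (rec.Offset <= resumeFrom) continue; // already copied
                dest.Append(rec.Offset, rec.Value);
            }
        }
    }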
More generic problems: Kappa and Kafka
[Figure: a topic's partitions 0 through 4]
Kafka new features
• We are upgrading our C# Kafka client.
• We look forward to new features (see the sketch below):
  • Idempotent producers
  • Transactions
  • Headers
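For reference, here is how these three features look in a client that already exposes them, the Confluent .NET client (broker address, topic names, and ids are placeholders); this is not the in-house client's API:

    using System;
    using System.Text;
    using Confluent.Kafka;

    class TransactionalProducerSketch
    {
        static void Main()
        {
            var config = new ProducerConfig
            {
                BootstrapServers = "localhost:9092", // placeholder
                EnableIdempotence = true,            // no duplicates on retry
                TransactionalId = "demo-producer-1"  // atomic multi-partition writes
            };

            using var producer = new ProducerBuilder<Null, string>(config).Build();
            producer.InitTransactions(TimeSpan.FromSeconds(10));

            // Both messages become visible atomically; the first also
            // carries a header.
            producer.BeginTransaction();
            producer.Produce("topic-a", new Message<Null, string>
            {
                Value = "hello",
                Headers = new Headers { { "source-dc", Encoding.UTF8.GetBytes("dc1") } }
            });
            producer.Produce("topic-b", new Message<Null, string> { Value = "world" });
            producer.CommitTransaction(TimeSpan.FromSeconds(10));
        }
    }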
Streaming
• Challenges:
  • More and more streaming use cases.
  • Multiple frameworks: Flink, Kafka Connect, Kafka Streams, plain consumers.
  • Clients running on Mesos, YARN, or bare metal.
• Pulsar and Pravega evaluation.
• We are working on a framework to help:
  • Release
  • Schedule
  • Scale
  • Monitor
  • Maintain
Because of GC pauses, buffering time, and the partition-blacklist logic, some data can arrive out of order.
When you have a Spark hammer, everything looks like a micro-batch. The micro-batch paradigm feels natural to people who come from the batch world into near-line data processing.
- The main problem with Spark Streaming was not the latency.
- No event-time processing means no accurate data processing.
It doesn't matter how fast you are if you are wrong.
Processing time cannot give you reproducibility: if your logic depends on time, you need to implement some form of event time.
Why can't we have a Kappa architecture for such a pipeline? Job state is the main bottleneck we see in our systems. It means we still need a supervisor that restarts a job from an earlier offset when it detects that the job cannot catch up (see the sketch below). If partitions are unbalanced in Kafka, we risk issues with catching up.
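The supervisor idea from these notes, as a minimal sketch; the threshold and the restart hook are invented for illustration:

    using System;

    // If a job's lag keeps growing past a bound, it will not catch up on
    // its own, so restart it from a safe earlier offset.
    class CatchUpSupervisor
    {
        private long _previousLag;
        private readonly long _maxLag;
        private readonly Action<long> _restartFromOffset; // hypothetical hook

        public CatchUpSupervisor(long maxLag, Action<long> restartFromOffset)
        {
            _maxLag = maxLag;
            _restartFromOffset = restartFromOffset;
        }

        // Called on each lag sample (e.g. derived from watermarks).
        public void OnLagSample(long lag, long safeRestartOffset)
        {
            bool growing = lag > _previousLag;
            if (lag > _maxLag && growing)
                _restartFromOffset(safeRestartOffset);
            _previousLag = lag;
        }
    }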