Multi tier, multi-tenant, multi-problem kafka

SITE RELIABILITY ENGINEERING©2016 LinkedIn Corporation. All Rights Reserved.
Multi-Tier, Multi-Tenant, Multi-Problem Kafka

Todd Palino

Who Am I?
3

What Will We Talk About?
 Multi-Tenant Pipelines
 Multi-Tier Architecture
 Why I Drink Interesting Problems
 Conclusion
4

Multi-Tenant Pipelines
5

Tracking and Data Deployment
 Tracking – Data going to HDFS
 Data Deployment – Hadoop job results going to online applications
 Many shared topics
 Schemas require a common header
 All message counts are audited
 Special Problems
– Hard to tell what application is dropping messages
– Some of these messages are copied 42 times!
6

Metrics
 Application and OS metrics
 Deployment and build system events
 Service calls – sampling of timing information for individual application calls
 Some application logs
– Every server in the datacenter produces to this cluster at least twice
– Graphing/Alerting system consumes the metrics 20 times
7

Logging
 Application logging messages destined for ELK clusters
 Lower retention than other clusters
 Loosest restrictions on message schema and encoding
– Not many – it’s still overprovisioned
– Customers starting to ask about aggregation
8

Queuing
 Everything else
 Primarily messages internal to applications
 Also emails and user messaging
 Messages are Avro encoded, but do not require headers
 Special Problems:
– Many messages which use unregistered schemas
– Clusters can have very high message rates (but not large data)
9

Special Case Clusters
 Not all use cases fit multi-tenancy
– Custom configurations that are needed
– Tighter performance guarantees
– Use of topic deletion
 Espresso (KV store) internal replication
 Brooklin – Change capture
 Replication from Hadoop to Voldemort
10

Tiered Cluster Architecture
11

One Kafka Cluster
12

Multiple Clusters – Message Aggregation
13

Why Not Direct?
 Network Concerns
– Bandwidth
– Network partitioning
– Latency
 Security Concerns
– Firewalls and ACLs
– Encrypting data in transit
 Resource Concerns
– A misbehaving application can swamp production resources
14

What Do We Lose?
 You may lose message ordering
– Mirror maker breaks apart message batches and redistributes them
 You may lose key to partition affinity
– Mirror maker will partition based on the key
– Differing partition counts in source and target will result in differing distribution
– Mirror maker does not (without work) honor custom partitioning
15

Aggregation Rules
 Aggregate clusters are only for consuming messages
– Producing to an aggregate cluster is not allowed
– This assures all aggregate clusters have the same content
 Not every topic appears in PROD aggregate-tracking clusters
– Trying to discourage aggregate cluster usage in PROD
– All topics are available in CORP
 Aggregate-queuing is whitelist only and very restricted
– Please discuss your use case with us before developing
16

Interesting Problems
17

Buy The Book!
18
Early Access available now.
Covers all aspects of Kafka,
from setup to client
development to ongoing
administration and
troubleshooting.
Also discusses stream
processing and other use
cases.

Monitoring Using Kafka
 Monitoring and alerting are self-service
– No gatekeeper on what metrics are collected and stored
 Applications use a common container
– EventBus Kafka producer
– Simple annotation of metrics to collect
– Sampled service calls
– Application logs
 Everything is produced to Kafka and consumed by the monitoring
infrastructure
19

Monitoring Kafka
 Kafka is great for monitoring your applications
20

KMon and EnlightIN
 Developed a separate monitoring and notification system
– Metrics are only retained long enough to alert on them
– One rule: we can’t use Kafka
 Alerting is simplified from our self-service system
– Nothing complex like regular expressions or RPNs
– Only used for critical Kafka and Zookeeper alerts
– Faster and more reliable
 Notifications are cleaner
– Alerts are grouped into incidents for fewer notifications when things break
– Notification system is generic and subscribable so we can use it for other things
21

Broker Monitoring
 Bytes In and Out, Messages In
– Why not messages out?
 Partitions
– Count and Leader Count
– Under Replicated and Offline
 Threads
– Network pool, Request pool
– Max Dirty Percent
 Requests
– Rates and times - total, queue, local, and send
22

Is Kafka Working?
 Knowing that the cluster is up isn’t always enough
– Network problems
– Metrics can lie
 Customers still ask us first if something breaks
– Part of the solution is educating them as to what to monitor
– Need to be absolutely sure of the answer “There’s nothing wrong with Kafka”
23

Kafka Monitoring Framework
 Producer to consumer testing of a Kafka cluster
– Assures that producers and consumers actually work
– Measures how long messages take to get through
 We have a SLO of 99.99% availability for all clusters
 Working on multi-tier support
– Answers the question of how long messages take to get to Hadoop
 LinkedIn Kafka Open Source
– https://github.com/linkedin/streaming
24

Is Mirroring Working?
 Most critical data flows through Kafka
– Most of that depends on mirror makers
– How do we make sure it all gets where it’s going?
 Mirror maker pipelines can have over a thousand topics
– Different message rates
– Some are more important than others
 Lag threshold monitoring doesn’t work
– Traffic spikes cause false alerts
– What should the threshold be?
– No easy way to monitor 1000 topics and over 10k partitions
25

Kafka Audit
 Audit tracks topic completeness across all clusters in the pipeline
– Primarily tracking messages
– Schema must have a valid header
– Alerts for DWH topics are set for 0.1% message loss
 Provided as an integrated part of the internal Kafka libraries
 Used for data completeness checks before Hadoop jobs run
26

Auditing Message Flows
27

Burrow
 Burrow is an advanced Kafka consumer monitoring system
– Provides an objective view of consumer status
– Much more powerful than threshold-based lag monitoring
 Burrow is Open Source!
– Used by many other companies, including Wikimedia and Blizzard
– Used internally to assure all Mirror Makers and Audit are running correctly
 Exports metrics for all consumers to self-service monitoring
 https://github.com/linkedin/Burrow
28

MTTF Is Not Your Friend
 We have over 1800 Kafka brokers
– All have at least 12 drives, most have 16
– Dual CPUs, at least 64 GB of memory
– Really lousy Megaraid controllers
 This means hardware fails daily
– We don’t always know when it happens, if it doesn’t take the system down
– It can’t always be fixed immediately
– We can take one broker down, but not two
29

Moving Partitions
 Prior to Kafka 0.8, moving partitions was basically impossible
– It’s still not easy – you have to be explicit about what you are moving
– There’s no good way to balance partitions in a cluster
 We developed kafka-assigner to solve the problem
– A single command to remove a broker and distribute it’s partitions
– Chainable modules for balancing partitions
– Open source! https://github.com/linkedin/kafka-tools
 Also working on “Cruise Control” for Kafka
– An add-on service that will handle redistributing partitions automatically
30

Pushing Data from Hadoop
 To help Hadoop jobs, we maintain a KafkaPushJob
– A mapper that produces messages to Kafka
– Pushes to data-deployment, which then gets mirrored to production
 Hadoop jobs tend to push a lot of data all at once
– Some jobs spin up hundreds of mappers
– Pushing many gigabytes of data in a very short period of time
 This overwhelms a Kafka cluster
– Spurious alerts for under replicated partitions
– Problems with mirroring the messages out
31

Kafka Quotas
 Quotas limit traffic based on client ID
– Specified in bytes/sec on a per-broker basis
– Not per-topic or per-partition
 Should be transparent to clients
– Accomplished by delaying the response to requests
– Newer clients have metrics specific to quotas for clarity
 We use it to protect the replication of the cluster
– Set it as high as possible while protecting against a single bad client
32

Delete Topic
 Feature has been under development for almost 3 years
– Only recently has it even worked a little bit
– We’re still not sure about it (from SRE’s point of view)
 Recently performed additional testing so we can use it
– Found that even when disabled for a cluster, something was happening
– Some brokers claimed the topic was gone, some didn’t
– Mirror makers broke for the topic
 One of the code paths in the controller was not blocked
– Metadata change went out, but it was hard to diagnose
33

Brokers are Independent
 When there’s a problem in the cluster, brokers might have bad information
– The controller should tell them what the topic metadata is
– Brokers get out of sync due to connection issues or bugs
 There’s no good tool for just sending a request to a broker and reading the
response
– We had to write a Java application just to send a metadata request
 Coming soon – kafka-protocol
– Simple CLI tool for sending individual requests to Kafka brokers
– Will be part of the https://github.com/linkedin/kafka-tools repository
34

Conclusion
35

Broker Improvement - JBOD
 We use RAID-10 on all brokers
– Trade off a lot of performance for a little resiliency
– Lose half of our disk space
 Current JBOD implementation isn’t great
– No admin tools for moving partitions
– Assignment is round-robin
– Broker shuts down if a single disk fails
 Looking at options
– Might try to fix the JBOD implementation in Kafka
– Testing running multiple brokers on a single server
36

Mirror Maker Improvements
 Mirror Maker has performance issues
– Has to decompress and recompress every message
– Loses information about partition affinity and strict ordering
 Developed an Identity message handler
– Messages in source partition 0 get produced directly to partition 0
– Requires mirror maker to maintain downstream partition counts
 Working on the next steps
– No decompression of message batches
– Looking at other options on how to run mirror makers
37

Administrative Improvements
 Multiple cluster management
– Topic management across clusters
– Visualization of mirror maker paths
 Better client monitoring
– Burrow for consumer monitoring
– No open source solution for producer monitoring (audit)
 End-to-end availability monitoring
38

Getting Involved With Kafka
 http://kafka.apache.org
 Join the mailing lists
– users@kafka.apache.org
– dev@kafka.apache.org
 irc.freenode.net - #apache-kafka
 Meetups
– Bay Area – https://www.meetup.com/Stream-Processing-Meetup-LinkedIn/
 Contribute code
39

Multi tier, multi-tenant, multi-problem kafka

Multi tier, multi-tenant, multi-problem kafka

Recommended

Recommended

More Related Content

What's hot

What's hot (20)

Similar to Multi tier, multi-tenant, multi-problem kafka

Similar to Multi tier, multi-tenant, multi-problem kafka (20)

More from Todd Palino

More from Todd Palino (10)

Recently uploaded

Recently uploaded (20)

Multi tier, multi-tenant, multi-problem kafka

Editor's Notes