101* ways to configure
Kafka - badly
Audun Fauchald Strand
Lead Developer Infrastructure
@audunstrand
bio: gof, mq, ejb, mda, wli, bpel, eda, soa, ws*, esb, ddd
Henning Spjelkavik
Architect
@spjelkavik
bio: Skiinfo (Vail Resorts),
FINN.no
enjoys reading jstacks
agenda
introduction to kafka
kafka @ finn.no
101* mistakes
questions
“From a certain point onward
there is no longer any turning
back. That is the point that
must be reached.”
― Franz Kafka, The Trial
Top 5
1. no consideration of data on the inside vs outside
2. schema not externally defined
3. same config for every client/topic
4. 128 partitions as default config
5. running on 8 overloaded nodes
FINN.no
2nd largest website in Norway
classified ads (eBay and Zillow in one)
60 million page views a day
80 microservices
130 developers
1000 deploys to production a week
6 minutes from commit to deploy (median)
#kafkasummit @spjelkavik @audunstrand
FINN.no is a part of Schibsted Media Group
6800 people in 30 countries
kafka @ finn.no
architecture
use cases
tools
in the beginning ...
Architecture governance board decided to use RabbitMQ as message queue.
Kafka was installed for a proof of concept after developers spotted it in January 2013.
2013 - POC
“High” volume
Stream of classified ads
Ad matching
Ad indexed
[Diagram: eight nodes, mod01 to mod08, across dc 1 and dc 2; each node runs both zk and kafka]
Version 0.8.1
4 partitions
common client
java library
thrift
2014 - Adoption and complaining
low volume / high reliability
Ad Insert
Product Orchestration
Payment
Build Pipeline
click streams
[Diagram: eight nodes, mod01 to mod08, across dc 1 and dc 2; each node runs both zk and kafka]
Version 0.8.1
4 partitions
experimenting with configuration
common java library
tooling
alerting
2015 - Migration and consolidation
“reliable messaging”
asynchronous communication between services
store and forward
zipkin
slack notifications
[Diagram: five nodes, broker01 to broker05, across dc 1 and dc 2; each node runs both zk and kafka]
Version 0.8.2
5-20 partitions
multiple configurations
tooling
Grafana dashboard visualizing jmx stats
kafka-manager
kafkacat
2016 - Confluent platform
[Diagram: five brokers, broker01 to broker05, running kafka only; five separate nodes, zk01 to zk05, running zk]
schema registry
data replication
kafka connect
kafka streams
101* mistakes
“God gives the
nuts, but he
does not crack
them.”
― Franz Kafka
Pattern language
why is it a mistake
what is the consequence
what is the correct solution
what has finn.no done
mistake: no consideration of data on the inside vs outside
https://flic.kr/p/6MjhUR
why is it a mistake
everything published on Kafka (0.8.2) is visible to any client that can access the cluster
what is the consequence
direct reads across services/domains are quite normal in legacy and/or enterprise systems
coupling makes it hard to make changes
unknown and unwanted coupling has a cost
Kafka had no security per topic - you must add that yourself
what is the correct solution
Consider what is data on the inside, versus data on the outside
Convention for what is private data and what is public data
If you want to change your internal representation often, map it before publishing it publicly (anti-corruption layer)
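The mapping step can be sketched roughly as below. This is a minimal illustration, not FINN.no's code: the `InternalAd` and `PublicAdEvent` types and all of their fields are hypothetical. The point is only that the internal model can change freely while the published shape stays stable.

```python
from dataclasses import dataclass, asdict


@dataclass
class InternalAd:
    """Hypothetical internal model; free to change at any time."""
    ad_id: int
    title_raw: str
    price_minor_units: int      # e.g. 1/100 of the currency unit
    moderation_flags: list      # internal detail, never published


@dataclass(frozen=True)
class PublicAdEvent:
    """Hypothetical public contract; changes only with a new schema_version."""
    schema_version: int
    ad_id: int
    title: str
    price: float


def to_public(ad: InternalAd) -> PublicAdEvent:
    """Anti-corruption layer: translate the internal model to the public one."""
    return PublicAdEvent(
        schema_version=1,
        ad_id=ad.ad_id,
        title=ad.title_raw.strip(),
        price=ad.price_minor_units / 100,
    )


event = to_public(InternalAd(42, "  Bike ", 150000, ["needs-review"]))
payload = asdict(event)  # what would be serialized and published
```

Only `to_public` needs to change when the internal model is refactored.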
what has finn.no done
Decided on a naming convention (e.g. Public.xyzzy) for public topics
Communicates the intention (contract)
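A naming convention like this is easy to check in tooling. The sketch below is an assumption about how such a check could look; the `Public.` prefix comes from the slide, while the function names, service names and the ownership rule are hypothetical.

```python
PUBLIC_PREFIX = "Public."  # prefix taken from the slide's example


def is_public_topic(topic: str) -> bool:
    """True when the topic name follows the public naming convention."""
    return topic.startswith(PUBLIC_PREFIX) and len(topic) > len(PUBLIC_PREFIX)


def may_subscribe(consumer_service: str, owner_service: str, topic: str) -> bool:
    """Hypothetical rule: private topics may only be read by their owner."""
    return is_public_topic(topic) or consumer_service == owner_service
```

Since Kafka 0.8.2 does not enforce access per topic, a check like this can only warn or fail a build; it does not secure the data.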
mistake: schema not externally defined
why is it a mistake
data and code need separate versioning strategies
version should be part of the data
defining schema in a java library makes it more difficult to access data from non-JVM languages
very little discoverability of data, people chose other means to get their data
difficult to create tools
what is the consequence
development speed outside jvm has been slow
change of data needs coordinated deployment
no process for data versioning, like backwards compatibility checks
difficult to create tooling that needs to know data format, like data lake and database sinks
what is the correct solution
confluent.io platform has a separate schema registry
apache avro
multiple compatibility settings and evolution strategies
connect
Take complexity out of the applications
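To make "backwards compatibility check" concrete, here is a heavily simplified sketch of the idea: a new schema is backward compatible if it can read data written with the old one. The dict-based schema format and the function name are assumptions for illustration; real Avro resolution and a schema registry apply more rules (type promotion, aliases, unions) than this.

```python
def is_backward_compatible(old_fields: dict, new_fields: dict) -> bool:
    """Simplified check: can a reader using new_fields read old data?

    Each argument maps field name -> {"type": str, optional "default": value}.
    """
    for name, spec in new_fields.items():
        if name in old_fields:
            # Existing field: the type must not change (no promotions here).
            if old_fields[name]["type"] != spec["type"]:
                return False
        elif "default" not in spec:
            # New field without a default cannot be filled from old data.
            return False
    # Fields dropped from the new schema are simply ignored by the reader.
    return True


v1 = {"id": {"type": "long"}, "title": {"type": "string"}}
v2_ok = {**v1, "price": {"type": "double", "default": 0.0}}
v2_bad = {**v1, "price": {"type": "double"}}
```

Running such a check when a schema is registered is what removes the need for coordinated deployments.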
what has finn.no done
still using java library, with schemas in builders
confluent platform 2.0 is planned for the next step, not (just) kafka 0.9
mistake: running mixed load with a single, default configuration
https://flic.kr/p/qbarDR
why is it a mistake
Historically - One Big Database with Expensive License
Database world - OLTP and OLAP
Changed with Open Source software and Cloud
Tried to simplify the developer's day with a single config
Kafka supports both very high throughput and high reliability
what is the consequence
Trade off between throughput and degree of reliability
With a single configuration - the last commit wins
Either high throughput, and risk of loss - or potentially too slow
what is the correct solution
Understand your use cases and their needs!
Use proper per-topic configuration
Consider splitting / isolation
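One way to picture per-topic configuration is a reliable default profile with explicit overrides for selected topics. This is a sketch under assumptions: the topic names and the helper are hypothetical, and the values are examples rather than recommendations from the talk. The keys are producer property names from the Java client.

```python
# Reliable by default: wait for all in-sync replicas and retry on failure.
RELIABLE = {"acks": "all", "retries": 2**31 - 1}

# Example throughput-oriented overrides; accepts some risk of loss.
HIGH_THROUGHPUT = {
    "acks": "1",
    "retries": 0,
    "linger.ms": 50,
    "compression.type": "snappy",
}


def producer_config(topic: str, overrides_by_topic: dict, base: dict = RELIABLE) -> dict:
    """Return the base profile with any topic-specific overrides applied."""
    config = dict(base)  # copy, so the shared default is never mutated
    config.update(overrides_by_topic.get(topic, {}))
    return config


overrides = {"clickstream": HIGH_THROUGHPUT}  # hypothetical topic name
payment_cfg = producer_config("payment", overrides)
click_cfg = producer_config("clickstream", overrides)
```

With this shape there is no "last commit wins": a topic gets weaker guarantees only if someone lists it explicitly.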
what has finn.no done
Defaults that are quite reliable
Exposing configuration variables in the client
Ask the questions:
● at least once delivery
● ordering - if you partition, what must have strict ordering
● 99% delivery - is that good enough?
● what level of throughput is needed
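On the ordering question: Kafka only orders messages within a partition, so everything that must stay in order has to share a message key. The sketch below illustrates key-based partitioning. It uses `zlib.crc32` as a stand-in hash (the Java producer's default partitioner uses murmur2), and the key format is hypothetical.

```python
import zlib


def partition_for(key: str, num_partitions: int) -> int:
    """Map a key to a partition with a stable hash (crc32 as a stand-in)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


# All events for one ad share a key, so they land in one partition
# and are consumed in the order they were produced.
events = [("ad-1001", "created"), ("ad-1001", "paid"), ("ad-1001", "published")]
partitions = {partition_for(key, 5) for key, _ in events}
```

Note that changing the number of partitions changes the mapping, so ordering per key is only guaranteed while the partition count stays fixed.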
Configuration
Configuration for production
● Partitions
● Replicas (default.replication.factor)
● Minimum ISR (min.insync.replicas)
● Wait for acknowledgement when producing messages (request.required.acks, block.on.buffer.full)
● Retries
● Leader election
Configuration for consumer
● Number of threads
● When to commit (auto.commit.enable vs consumer.commitOffsets)
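Why the commit point matters can be shown with a small simulation that needs no Kafka client. This is a sketch under assumptions: the "log" is a plain list, and `run_consumer` is a hypothetical stand-in for a consumer loop that commits manually after processing.

```python
def run_consumer(log, committed, handler, crash_before_commit_at=None):
    """Process messages from the committed offset; commit only after success.

    Returns the last committed offset. A simulated crash between processing
    and committing leaves the offset unchanged, so the message is redelivered.
    """
    for offset in range(committed, len(log)):
        handler(log[offset])                  # 1. process the message
        if offset == crash_before_commit_at:
            return committed                  # crash: commit never happened
        committed = offset + 1                # 2. commit after processing
    return committed


log = ["a", "b", "c"]
seen = []
offset = run_consumer(log, 0, seen.append, crash_before_commit_at=1)
offset = run_consumer(log, offset, seen.append)  # restart after the crash
```

After the restart `seen` contains "b" twice: nothing is lost, but handlers must tolerate duplicates (at least once delivery). Committing before processing, as auto commit can do, flips this to possible loss.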
Gwen Shapira recommends...
● acks = all
● block.on.buffer.full = true
● retries = MAX_INT
● max.in.flight.requests.per.connection = 1
● Producer.close()
● replication-factor >= 3
● min.insync.replicas = 2
● unclean.leader.election.enable = false
● auto.commit.enable = false (enable.auto.commit in the 0.9+ consumer)
● commit after processing
● monitor!
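A list like this can be turned into an automated check. The sketch below compares a configuration against a subset of the recommendations above; the `deviations` helper is hypothetical, and the property names are written in their full Kafka spelling, which differs slightly from the shorthand on the slide.

```python
MAX_INT = 2**31 - 1

# Subset of the recommendations, using full producer property names.
RECOMMENDED_PRODUCER = {
    "acks": "all",
    "retries": MAX_INT,
    "max.in.flight.requests.per.connection": 1,
}

# Topic or broker level settings.
RECOMMENDED_TOPIC = {
    "min.insync.replicas": 2,
    "unclean.leader.election.enable": False,
}


def deviations(actual: dict, recommended: dict) -> dict:
    """Return {key: (actual_value, recommended_value)} for every mismatch."""
    return {
        key: (actual.get(key), wanted)
        for key, wanted in recommended.items()
        if actual.get(key) != wanted
    }


report = deviations({"acks": "1", "retries": MAX_INT}, RECOMMENDED_PRODUCER)
```

A report like this fits the last bullet: it can run as part of monitoring, so that drift from the reliable defaults is visible.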
mistake: default configuration of 128 partitions for each topic
https://flic.kr/p/6KxPgZ
why is it a mistake
partitions are Kafka's way of scaling consumers; 128 partitions can handle 128 consumer processes
in 0.8, the number of partitions could not be reduced without deleting data
highest number of consumers today is 20
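The relation between partitions and consumers can be illustrated with a simple round-robin assignment. This is only a sketch: Kafka's own assignment strategies differ in detail, and the `assign` function is hypothetical. What it shows holds in general: within a consumer group each partition goes to exactly one consumer, so consumers beyond the partition count sit idle.

```python
def assign(num_partitions: int, num_consumers: int) -> list:
    """Round-robin partitions over consumers; one list of partition ids each."""
    assignment = [[] for _ in range(num_consumers)]
    for partition in range(num_partitions):
        assignment[partition % num_consumers].append(partition)
    return assignment


# 128 partitions shared by 20 consumers: each gets 6 or 7 partitions.
sizes = sorted({len(p) for p in assign(128, 20)})

# 5 partitions but 8 consumers: three consumers get nothing to do.
idle = sum(1 for p in assign(5, 8) if not p)
```

With 20 consumers as the largest group, 128 partitions buy no extra parallelism; they only add coordination work.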
what is the consequence
our 0.8 cluster was configured with 128 partitions as default, for all topics.
many partitions and many topics create many datapoints that must be coordinated
zookeeper must coordinate all this
rebalance must balance all clients on all partitions
zookeeper and kafka went down (May 2015)
Users could not create ads for two days
what is the correct solution
small number of partitions as default
increase number of partitions for selected topics
understand your use case (throughput target)
reduce length of transactions on consumer side
Max partitions on a broker => 1500 advised in our case - we had 38k
http://www.confluent.io/blog/how-to-choose-the-number-of-topicspartitions-in-a-kafka-cluster/
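How a per-topic default multiplies up can be estimated with simple arithmetic. The sketch below is an assumption-laden illustration: the topic counts are hypothetical (the slides do not state them), and it counts every replica as a partition on a broker, which may or may not be how the advised limit was measured.

```python
import math


def partition_replicas_per_broker(topics, partitions_per_topic,
                                  replication_factor, brokers):
    """Average number of partition replicas each broker must host."""
    total = topics * partitions_per_topic * replication_factor
    return math.ceil(total / brokers)


# Hypothetical cluster: 200 topics, replication factor 3.
with_128 = partition_replicas_per_broker(200, 128, 3, 8)   # old default, 8 nodes
with_5 = partition_replicas_per_broker(200, 5, 3, 5)       # new default, 5 brokers
```

Under these assumed numbers the old default lands far above an advised limit of 1500, while the new default stays well below it.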
what has finn.no done
5 partitions as default
2 heavy-traffic topics have more than 5 partitions
mistake: deploy a proof of concept hack in production, i.e. why we had 8 zk nodes
https://flic.kr/p/6eoSgT
why is it a mistake
Kafka was set up by Ops for a proof of concept - not for hardened production use
By coincidence we had 8 nodes for kafka, the same 8 nodes for zookeeper
Zookeeper depends on a majority quorum and low latency between nodes
The 8 nodes were NOT dedicated - in fact - they were overloaded already
what is the consequence
Zookeeper recommends 3 nodes for normal usage, 5 for high, and any more is questionable
More nodes lead to a longer time for finding consensus and more communication
If we get a split between data centers, there will be 4 in each, so neither side has a majority
You should not run Zk between data centers, due to latency and outage possibilities
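The quorum arithmetic behind these points is short enough to write down. This is a minimal sketch of majority-quorum math in general; the function names are illustrative and not part of any ZooKeeper API.

```python
def quorum_size(ensemble: int) -> int:
    """Smallest strict majority of the ensemble."""
    return ensemble // 2 + 1


def tolerated_failures(ensemble: int) -> int:
    """How many nodes can be lost while a majority remains."""
    return ensemble - quorum_size(ensemble)


def has_quorum(reachable: int, ensemble: int) -> bool:
    """True when the reachable nodes form a majority."""
    return reachable >= quorum_size(ensemble)


# 8 nodes need 5 for a quorum, so an even split of 4 + 4 stops both sides.
split_survives = has_quorum(4, 8)
# The 8th node adds nothing: 7 and 8 nodes both tolerate 3 failures.
same_tolerance = tolerated_failures(7) == tolerated_failures(8)
```

This is why an odd ensemble size is advised: an even count raises the quorum size without raising the number of failures that can be survived.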
what is the correct solution
Have an odd number of Zookeeper nodes - preferably 3, at most 5
Don’t cross data centers
Check the documentation before deploying serious production load
Don’t run a sensitive service (Zookeeper) on a server with 50 jvm-based services, 300% over committed on RAM
Watch GC times
what has finn.no done
[Diagram: five nodes, broker01 to broker05, across dc 1 and dc 2; each node runs both zk and kafka]
Version 0.8.2
5-20 partitions
multiple configurations
“They say ignorance is
bliss.... they're wrong ”
― Franz Kafka
References / Further reading
Designing Data-Intensive Applications, Martin Kleppmann
Data on the inside - data on the outside, Pat Helland
I Heart Logs, Jay Kreps
The Confluent Blog, http://confluent.io/
Kafka - The definitive guide
https://cwiki.apache.org/confluence/display/KAFKA/Kafka+papers+and+presentations
http://www.finn.no/apply-here
http://www.schibsted.com/en/Career/
“It's only because of
their stupidity that
they're able to be so
sure of themselves.”
― Franz Kafka, The Trial
Audun Fauchald Strand
@audunstrand
Henning Spjelkavik
@spjelkavik
http://www.finn.no/apply-here
http://www.schibsted.com/en/Career/
Q?
Runner up
Using pre-1.0 software
Have control of topic creation
Kafka is storage - treat it like one also ops-wise
Client side rebalancing, misunderstood
Committing on all consumer threads, believing that you only committed on one

Multi-Tenancy Kafka cluster for LINE services with 250 billion daily messages
 
Polyglot, fault-tolerant event-driven programming with kafka, kubernetes and ...
Polyglot, fault-tolerant event-driven programming with kafka, kubernetes and ...Polyglot, fault-tolerant event-driven programming with kafka, kubernetes and ...
Polyglot, fault-tolerant event-driven programming with kafka, kubernetes and ...
 
Arun
ArunArun
Arun
 
14th Athens Big Data Meetup - Landoop Workshop - Apache Kafka Entering The St...
14th Athens Big Data Meetup - Landoop Workshop - Apache Kafka Entering The St...14th Athens Big Data Meetup - Landoop Workshop - Apache Kafka Entering The St...
14th Athens Big Data Meetup - Landoop Workshop - Apache Kafka Entering The St...
 
Jack Gudenkauf sparkug_20151207_7
Jack Gudenkauf sparkug_20151207_7Jack Gudenkauf sparkug_20151207_7
Jack Gudenkauf sparkug_20151207_7
 
A noETL Parallel Streaming Transformation Loader using Spark, Kafka­ & Ver­tica
A noETL Parallel Streaming Transformation Loader using Spark, Kafka­ & Ver­ticaA noETL Parallel Streaming Transformation Loader using Spark, Kafka­ & Ver­tica
A noETL Parallel Streaming Transformation Loader using Spark, Kafka­ & Ver­tica
 
Resilience: the key requirement of a [big] [data] architecture - StampedeCon...
Resilience: the key requirement of a [big] [data] architecture  - StampedeCon...Resilience: the key requirement of a [big] [data] architecture  - StampedeCon...
Resilience: the key requirement of a [big] [data] architecture - StampedeCon...
 
Apache Kafka
Apache KafkaApache Kafka
Apache Kafka
 
The Impact of Hardware and Software Version Changes on Apache Kafka Performan...
The Impact of Hardware and Software Version Changes on Apache Kafka Performan...The Impact of Hardware and Software Version Changes on Apache Kafka Performan...
The Impact of Hardware and Software Version Changes on Apache Kafka Performan...
 
Apache Kafka – (Pattern and) Anti-Pattern
Apache Kafka – (Pattern and) Anti-PatternApache Kafka – (Pattern and) Anti-Pattern
Apache Kafka – (Pattern and) Anti-Pattern
 
An Introduction to Apache Kafka
An Introduction to Apache KafkaAn Introduction to Apache Kafka
An Introduction to Apache Kafka
 
jLove - A Change-Data-Capture use-case: designing an evergreen cache
jLove - A Change-Data-Capture use-case: designing an evergreen cachejLove - A Change-Data-Capture use-case: designing an evergreen cache
jLove - A Change-Data-Capture use-case: designing an evergreen cache
 
VM Forking and Hypervisor-based fuzzing
VM Forking and Hypervisor-based fuzzingVM Forking and Hypervisor-based fuzzing
VM Forking and Hypervisor-based fuzzing
 
A Detailed Look At cassandra.yaml (Edward Capriolo, The Last Pickle) | Cassan...
A Detailed Look At cassandra.yaml (Edward Capriolo, The Last Pickle) | Cassan...A Detailed Look At cassandra.yaml (Edward Capriolo, The Last Pickle) | Cassan...
A Detailed Look At cassandra.yaml (Edward Capriolo, The Last Pickle) | Cassan...
 
Anomaly Detection at Scale
Anomaly Detection at ScaleAnomaly Detection at Scale
Anomaly Detection at Scale
 

More from Henning Spjelkavik

Hles 2021 Digital transformation - How to use digital tools to improve our ev...
Hles 2021 Digital transformation - How to use digital tools to improve our ev...Hles 2021 Digital transformation - How to use digital tools to improve our ev...
Hles 2021 Digital transformation - How to use digital tools to improve our ev...
Henning Spjelkavik
 
Digital techlunsj hos FINN.no 2020-06-10
Digital techlunsj hos FINN.no 2020-06-10Digital techlunsj hos FINN.no 2020-06-10
Digital techlunsj hos FINN.no 2020-06-10
Henning Spjelkavik
 
10 years of microservices at finn.no - why is that dragon still here (ndc o...
10 years of microservices at finn.no  - why is that dragon still here  (ndc o...10 years of microservices at finn.no  - why is that dragon still here  (ndc o...
10 years of microservices at finn.no - why is that dragon still here (ndc o...
Henning Spjelkavik
 
How FINN became somewhat search engine friendly @ Oslo SEO meetup 2018
How FINN became somewhat search engine friendly @ Oslo SEO meetup 2018How FINN became somewhat search engine friendly @ Oslo SEO meetup 2018
How FINN became somewhat search engine friendly @ Oslo SEO meetup 2018
Henning Spjelkavik
 
An approach to it in a high level event - IOF HLES 2017
An  approach to it in a high level event - IOF HLES 2017An  approach to it in a high level event - IOF HLES 2017
An approach to it in a high level event - IOF HLES 2017
Henning Spjelkavik
 
Smidig 2016 - Er ledelse verdifullt likevel?
Smidig 2016 - Er ledelse verdifullt likevel?Smidig 2016 - Er ledelse verdifullt likevel?
Smidig 2016 - Er ledelse verdifullt likevel?
Henning Spjelkavik
 
Geomatikkdagene 2016 - Kart på FINN.no
Geomatikkdagene 2016 - Kart på FINN.noGeomatikkdagene 2016 - Kart på FINN.no
Geomatikkdagene 2016 - Kart på FINN.no
Henning Spjelkavik
 
IT for Event Directors
IT for Event DirectorsIT for Event Directors
IT for Event Directors
Henning Spjelkavik
 
Hvorfor vi bør brenne gammel management litteratur
Hvorfor vi bør brenne gammel management litteraturHvorfor vi bør brenne gammel management litteratur
Hvorfor vi bør brenne gammel management litteratur
Henning Spjelkavik
 
How we sleep well at night using Hystrix at Finn.no
How we sleep well at night using Hystrix at Finn.noHow we sleep well at night using Hystrix at Finn.no
How we sleep well at night using Hystrix at Finn.no
Henning Spjelkavik
 
HLES 2015 It in a high level event
HLES 2015 It in a high level eventHLES 2015 It in a high level event
HLES 2015 It in a high level event
Henning Spjelkavik
 
Strategisk design med "Impact Mapping"
Strategisk design med "Impact Mapping"Strategisk design med "Impact Mapping"
Strategisk design med "Impact Mapping"
Henning Spjelkavik
 
Smidig 2014 - Impact Mapping - Levér det som teller
Smidig 2014 - Impact Mapping - Levér det som tellerSmidig 2014 - Impact Mapping - Levér det som teller
Smidig 2014 - Impact Mapping - Levér det som teller
Henning Spjelkavik
 
Kart på FINN.no - Fra CGI til slippy map
Kart på FINN.no - Fra CGI til slippy mapKart på FINN.no - Fra CGI til slippy map
Kart på FINN.no - Fra CGI til slippy map
Henning Spjelkavik
 
Arena and TV-production - at IOF Open Technical Meeting in Lavarone 2014
Arena and TV-production - at IOF Open Technical Meeting in Lavarone 2014Arena and TV-production - at IOF Open Technical Meeting in Lavarone 2014
Arena and TV-production - at IOF Open Technical Meeting in Lavarone 2014
Henning Spjelkavik
 
Misbruk av målstyring
Misbruk av målstyringMisbruk av målstyring
Misbruk av målstyring
Henning Spjelkavik
 
Jz2010 Hvordan enkel analyse kan øke stabiliteten og hastigheten
Jz2010 Hvordan enkel analyse kan øke stabiliteten og hastighetenJz2010 Hvordan enkel analyse kan øke stabiliteten og hastigheten
Jz2010 Hvordan enkel analyse kan øke stabiliteten og hastigheten
Henning Spjelkavik
 
Fornebuløpet - Treningsprogram
Fornebuløpet - TreningsprogramFornebuløpet - Treningsprogram
Fornebuløpet - Treningsprogram
Henning Spjelkavik
 
Verdistrømanalyse Smidig 2009
Verdistrømanalyse   Smidig 2009Verdistrømanalyse   Smidig 2009
Verdistrømanalyse Smidig 2009
Henning Spjelkavik
 

More from Henning Spjelkavik (20)

Hles 2021 Digital transformation - How to use digital tools to improve our ev...
Hles 2021 Digital transformation - How to use digital tools to improve our ev...Hles 2021 Digital transformation - How to use digital tools to improve our ev...
Hles 2021 Digital transformation - How to use digital tools to improve our ev...
 
Digital techlunsj hos FINN.no 2020-06-10
Digital techlunsj hos FINN.no 2020-06-10Digital techlunsj hos FINN.no 2020-06-10
Digital techlunsj hos FINN.no 2020-06-10
 
10 years of microservices at finn.no - why is that dragon still here (ndc o...
10 years of microservices at finn.no  - why is that dragon still here  (ndc o...10 years of microservices at finn.no  - why is that dragon still here  (ndc o...
10 years of microservices at finn.no - why is that dragon still here (ndc o...
 
How FINN became somewhat search engine friendly @ Oslo SEO meetup 2018
How FINN became somewhat search engine friendly @ Oslo SEO meetup 2018How FINN became somewhat search engine friendly @ Oslo SEO meetup 2018
How FINN became somewhat search engine friendly @ Oslo SEO meetup 2018
 
An approach to it in a high level event - IOF HLES 2017
An  approach to it in a high level event - IOF HLES 2017An  approach to it in a high level event - IOF HLES 2017
An approach to it in a high level event - IOF HLES 2017
 
Smidig 2016 - Er ledelse verdifullt likevel?
Smidig 2016 - Er ledelse verdifullt likevel?Smidig 2016 - Er ledelse verdifullt likevel?
Smidig 2016 - Er ledelse verdifullt likevel?
 
Geomatikkdagene 2016 - Kart på FINN.no
Geomatikkdagene 2016 - Kart på FINN.noGeomatikkdagene 2016 - Kart på FINN.no
Geomatikkdagene 2016 - Kart på FINN.no
 
IT for Event Directors
IT for Event DirectorsIT for Event Directors
IT for Event Directors
 
Hvorfor vi bør brenne gammel management litteratur
Hvorfor vi bør brenne gammel management litteraturHvorfor vi bør brenne gammel management litteratur
Hvorfor vi bør brenne gammel management litteratur
 
How we sleep well at night using Hystrix at Finn.no
How we sleep well at night using Hystrix at Finn.noHow we sleep well at night using Hystrix at Finn.no
How we sleep well at night using Hystrix at Finn.no
 
HLES 2015 It in a high level event
HLES 2015 It in a high level eventHLES 2015 It in a high level event
HLES 2015 It in a high level event
 
Strategisk design med "Impact Mapping"
Strategisk design med "Impact Mapping"Strategisk design med "Impact Mapping"
Strategisk design med "Impact Mapping"
 
Smidig 2014 - Impact Mapping - Levér det som teller
Smidig 2014 - Impact Mapping - Levér det som tellerSmidig 2014 - Impact Mapping - Levér det som teller
Smidig 2014 - Impact Mapping - Levér det som teller
 
Kart på FINN.no - Fra CGI til slippy map
Kart på FINN.no - Fra CGI til slippy mapKart på FINN.no - Fra CGI til slippy map
Kart på FINN.no - Fra CGI til slippy map
 
Arena and TV-production - at IOF Open Technical Meeting in Lavarone 2014
Arena and TV-production - at IOF Open Technical Meeting in Lavarone 2014Arena and TV-production - at IOF Open Technical Meeting in Lavarone 2014
Arena and TV-production - at IOF Open Technical Meeting in Lavarone 2014
 
Misbruk av målstyring
Misbruk av målstyringMisbruk av målstyring
Misbruk av målstyring
 
Jz2010 Hvordan enkel analyse kan øke stabiliteten og hastigheten
Jz2010 Hvordan enkel analyse kan øke stabiliteten og hastighetenJz2010 Hvordan enkel analyse kan øke stabiliteten og hastigheten
Jz2010 Hvordan enkel analyse kan øke stabiliteten og hastigheten
 
Fornebuløpet - Brosjyre
Fornebuløpet - BrosjyreFornebuløpet - Brosjyre
Fornebuløpet - Brosjyre
 
Fornebuløpet - Treningsprogram
Fornebuløpet - TreningsprogramFornebuløpet - Treningsprogram
Fornebuløpet - Treningsprogram
 
Verdistrømanalyse Smidig 2009
Verdistrømanalyse   Smidig 2009Verdistrømanalyse   Smidig 2009
Verdistrømanalyse Smidig 2009
 

Recently uploaded

FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdfFIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
BookNet Canada
 
IOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptx
IOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptxIOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptx
IOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptx
Abida Shariff
 
"Impact of front-end architecture on development cost", Viktor Turskyi
"Impact of front-end architecture on development cost", Viktor Turskyi"Impact of front-end architecture on development cost", Viktor Turskyi
"Impact of front-end architecture on development cost", Viktor Turskyi
Fwdays
 
When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...
Elena Simperl
 
How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...
Product School
 
Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*
Frank van Harmelen
 
Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
Alan Dix
 
UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3
DianaGray10
 
Leading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdfLeading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdf
OnBoard
 
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Ramesh Iyer
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
UiPathCommunity
 
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Jeffrey Haguewood
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
DianaGray10
 
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdfSmart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
91mobiles
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
Laura Byrne
 
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Thierry Lestable
 
Knowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and backKnowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and back
Elena Simperl
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
Sri Ambati
 
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
Product School
 

Recently uploaded (20)

FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdfFIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdf
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
 
IOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptx
IOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptxIOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptx
IOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptx
 
"Impact of front-end architecture on development cost", Viktor Turskyi
"Impact of front-end architecture on development cost", Viktor Turskyi"Impact of front-end architecture on development cost", Viktor Turskyi
"Impact of front-end architecture on development cost", Viktor Turskyi
 
When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...
 
How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...
 
Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*
 
Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
 
UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3
 
Leading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdfLeading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdf
 
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
 
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
 
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdfSmart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
 
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
 
Knowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and backKnowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and back
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
 
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
 

101 ways to configure kafka - badly (Kafka Summit)

  • 1. 101* ways to configure Kafka - badly Audun Fauchald Strand Lead Developer Infrastructure @audunstrand bio: gof, mq, ejb, mda, wli, bpel eda, soa, ws*,esb, ddd Henning Spjelkavik Architect @spjelkavik bio: Skiinfo (Vail Resorts), FINN.no enjoys reading jstacks
  • 2. agenda introduction to kafka kafka @ finn.no 101* mistakes questions “From a certain point onward there is no longer any turning back. That is the point that must be reached.” ― Franz Kafka, The Trial
  • 3. Top 5 1. no consideration of data on the inside vs outside 2. schema not externally defined 3. same config for every client/topic 4. 128 partitions as default config 5. running on 8 overloaded nodes
  • 4. FINN.no: 2nd largest website in Norway; classified ads (eBay and Zillow in one); 60 million page views a day; 80 microservices; 130 developers; 1000 deploys to production a week; 6 minutes from commit to deploy (median)
  • 5. #kafkasummit @spjelkavik @audunstrand FINN.no is a part of Schibsted Media Group: 6800 people in 30 countries
  • 8. in the beginning ... The architecture governance board decided to use RabbitMQ as the message queue. Kafka was installed for a proof of concept after developers spotted it in January 2013.
  • 9. 2013 - POC: “High” volume; stream of classified ads; ad matching; ad indexed. Cluster: 8 nodes (mod01 to mod08), each running zk and kafka, spread over dc 1 and dc 2; Version 0.8.1; 4 partitions; common client Java library; Thrift
  • 10. 2014 - Adoption and complaining: low volume / high reliability; Ad Insert; Product Orchestration; Payment; Build Pipeline; click streams. Cluster: 8 nodes (mod01 to mod08), each running zk and kafka, spread over dc 1 and dc 2; Version 0.8.1; 4 partitions; experimenting with configuration; common Java library
  • 12. 2015 - Migration and consolidation: “reliable messaging”; asynchronous communication between services; store and forward; zipkin; slack notifications. Cluster: 5 brokers (broker01 to broker05), each running zk and kafka, spread over dc 1 and dc 2; Version 0.8.2; 5-20 partitions; multiple configurations
  • 13. tooling: Grafana dashboard visualizing JMX stats; kafka-manager; kafka-cat
  • 14. 2016 - Confluent platform: 5 dedicated ZooKeeper nodes (zk01 to zk05) and 5 Kafka brokers (broker01 to broker05); schema registry; data replication; kafka connect; kafka streams
  • 15. 101* mistakes “God gives the nuts, but he does not crack them.” ― Franz Kafka
  • 16. Pattern Language: why is it a mistake; what is the consequence; what is the correct solution; what has finn.no done
  • 17. Top 5 1. no consideration of data on the inside vs outside 2. schema not externally defined 3. same config for every client/topic 4. 128 partitions as default config 5. running on 8 overloaded nodes
  • 18. mistake: no consideration of data on the inside vs outside https://flic.kr/p/6MjhUR
  • 19. why is it a mistake: everything published on Kafka (0.8.2) is visible to any client that can access the cluster
  • 20. what is the consequence: direct reads across services/domains are quite normal in legacy and/or enterprise systems; coupling makes it hard to make changes; unknown and unwanted coupling has a cost; Kafka had no security per topic, so you must add that yourself
  • 21. what is the correct solution: consider what is data on the inside versus data on the outside; agree on a convention for what is private data and what is public data; if you want to change your internal representation often, map it before publishing it publicly (anti-corruption layer)
  • 22. what has finn.no done: decided on a naming convention (e.g. Public.xyzzy) for public topics; the name communicates the intention (contract)
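
The naming convention on slide 22 can be checked mechanically. The following is a minimal sketch, assuming the `Public.` prefix from the slide; the helper names, the example topic `ads.internal`, and the idea of an owning domain per topic are hypothetical and only illustrate how the convention could act as a contract.

```python
# Prefix taken from the slide's example (Public.xyzzy); everything else is hypothetical.
PUBLIC_PREFIX = "Public."


def is_public_topic(topic: str) -> bool:
    """Return True if the topic name follows the public naming convention."""
    return topic.startswith(PUBLIC_PREFIX) and len(topic) > len(PUBLIC_PREFIX)


def may_subscribe(consumer_domain: str, topic: str, owner_domain: str) -> bool:
    """Public topics are open to every domain; private topics only to their owner."""
    return is_public_topic(topic) or consumer_domain == owner_domain


if __name__ == "__main__":
    print(may_subscribe("search", "Public.xyzzy", "ads"))   # public topic
    print(may_subscribe("search", "ads.internal", "ads"))   # private topic, other domain
```

Kafka 0.8.2 would not enforce such a rule itself, so a check like this would have to live in a shared client library or a review step.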
  • 24. why is it a mistake: data and code need separate versioning strategies; version should be part of the data; defining schema in a Java library makes it more difficult to access data from non-JVM languages; very little discoverability of data, so people chose other means to get their data; difficult to create tools
  • 25. what is the consequence: development speed outside the JVM has been slow; change of data needs coordinated deployment; no process for data versioning, like backwards compatibility checks; difficult to create tooling that needs to know the data format, like data lake and database sinks
  • 26. what is the correct solution: confluent.io platform has a separate schema registry; Apache Avro; multiple compatibility settings and evolution strategies; connect; take complexity out of the applications
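
To make "backwards compatibility checks" concrete, here is a deliberately simplified sketch of one rule from Avro schema evolution: a field added in a new schema needs a default so that readers using the new schema can still read old data. It is not the schema registry's implementation and ignores type promotion, aliases and nested records. The `AdCreated` schemas and function names are made up for illustration.

```python
def added_fields_without_default(old_schema: dict, new_schema: dict) -> list:
    """List fields that exist only in the new schema and carry no default value."""
    old_names = {field["name"] for field in old_schema["fields"]}
    return [
        field["name"]
        for field in new_schema["fields"]
        if field["name"] not in old_names and "default" not in field
    ]


def is_backward_compatible(old_schema: dict, new_schema: dict) -> bool:
    """Simplified check: every newly added field must have a default."""
    return not added_fields_without_default(old_schema, new_schema)


# Hypothetical record schemas written as Avro-style dictionaries.
V1 = {"type": "record", "name": "AdCreated", "fields": [
    {"name": "id", "type": "long"},
    {"name": "title", "type": "string"},
]}
V2 = {"type": "record", "name": "AdCreated", "fields": V1["fields"] + [
    {"name": "price", "type": ["null", "long"], "default": None},
]}
V3 = {"type": "record", "name": "AdCreated", "fields": V1["fields"] + [
    {"name": "category", "type": "string"},  # no default: old data cannot be read
]}

if __name__ == "__main__":
    print(is_backward_compatible(V1, V2))
    print(added_fields_without_default(V1, V3))
```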
  • 27. what has finn.no done: still using the Java library, with schemas in builders; confluent platform 2.0 is planned for the next step, not (just) kafka 0.9
  • 28. mistake: running mixed load with a single, default configuration https://flic.kr/p/qbarDR
  • 29. why is it a mistake: historically - One Big Database with Expensive License; database world - OLTP and OLAP; changed with Open Source software and Cloud; tried to simplify the developer's day with a single config; Kafka supports both very high throughput and high reliability
  • 30. what is the consequence: trade-off between throughput and degree of reliability; with a single configuration, the last commit wins; either high throughput and risk of loss, or potentially too slow
  • 31. what is the correct solution: understand your use cases and their needs! Use proper per-topic configuration; consider splitting / isolation
  • 32. what has finn.no done: defaults that are quite reliable; exposing configuration variables in the client; ask the questions: ● at least once delivery ● ordering - if you partition, what must have strict ordering ● 99% delivery - is that good enough? ● what level of throughput is needed
  • 33. Configuration. Configuration for production: ● Partitions ● Replicas (default.replication.factor) ● Minimum ISR (min.insync.replicas) ● Wait for acknowledge when producing messages (request.required.acks, block.on.buffer.full) ● Retries ● Leader election. Configuration for consumer: ● Number of threads ● When to commit (auto.commit.enable vs consumer.commitOffsets)
  • 34. Gwen Shapira recommends... ● acks = all ● block.on.buffer.full = true ● retries = MAX_INT ● max.in.flight.requests.per.connection = 1 ● Producer.close() ● replication-factor >= 3 ● min.insync.replicas = 2 ● unclean.leader.election = false ● auto.offset.commit = false ● commit after processing ● monitor!
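
The recommendations on slide 34 can be written down as a reliability-first profile. This is a sketch under assumptions: the keys use the property names of the Java clients from Kafka 0.9 onward (the slides abbreviate some of them), `block.on.buffer.full` is left out because it was deprecated in later client releases, and the dictionary and helper names are hypothetical.

```python
MAX_INT = 2147483647  # Java's Integer.MAX_VALUE, the slide's "MAX_INT"

# Producer side: wait for all in-sync replicas, retry, and keep ordering on retry.
RELIABLE_PRODUCER = {
    "acks": "all",
    "retries": MAX_INT,
    "max.in.flight.requests.per.connection": 1,
}

# Topic/broker side: require two in-sync replicas and forbid unclean leader election.
RELIABLE_TOPIC = {
    "min.insync.replicas": 2,
    "unclean.leader.election.enable": "false",
}

# Consumer side: commit offsets manually, after processing.
RELIABLE_CONSUMER = {"enable.auto.commit": "false"}

REPLICATION_FACTOR = 3


def to_properties(config: dict) -> str:
    """Render a config dictionary in .properties format, sorted by key."""
    return "\n".join(f"{key}={value}" for key, value in sorted(config.items()))


def brokers_that_may_fail(replication_factor: int, min_insync_replicas: int) -> int:
    """Replicas that can be out of sync while acks=all writes still succeed."""
    return replication_factor - min_insync_replicas


if __name__ == "__main__":
    print(to_properties(RELIABLE_PRODUCER))
    print(brokers_that_may_fail(REPLICATION_FACTOR, RELIABLE_TOPIC["min.insync.replicas"]))
```

A throughput-oriented topic would use a different profile (for example fewer acknowledgements), which is the point of slide 31: one profile per use case instead of one configuration for everything.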
  • 35. #kafkasummit @spjelkavik @audunstrand mistake: default configuration of 128 partitions for each topic https://flic.kr/p/6KxPgZ
  • 36. why is it a mistake: Partitions are Kafka's way of scaling consumers; 128 partitions can handle 128 consumer processes. In 0.8, clusters could not reduce the number of partitions without deleting data. The highest number of consumers today is 20.
  • 37. what is the consequence: Our 0.8 cluster was configured with 128 partitions as default, for all topics. Many partitions and many topics create many data points that must be coordinated. ZooKeeper must coordinate all this. A rebalance must balance all clients on all partitions. ZooKeeper and Kafka went down (May 2015). Users could not create ads for two days.
  • 38. what is the correct solution: A small number of partitions as default. Increase the number of partitions for selected topics. Understand your use case (throughput target). Reduce the length of transactions on the consumer side. Max partitions on a broker => 1500 advised; in our case we had 38k. http://www.confluent.io/blog/how-to-choose-the-number-of-topicspartitions-in-a-kafka-cluster/
  • 39. what has finn.no done: 5 partitions as default. 2 heavy-traffic topics have more than 5 partitions.
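Sizing from a throughput target can be worked through with a little arithmetic. The first function follows the rule of thumb in the Confluent post linked above: with target throughput t, per-partition producer throughput p and per-partition consumer throughput c, use at least max(t/p, t/c) partitions. The second function, and every number in the example, is a made-up illustration and not a FINN.no measurement.

```python
import math


def partitions_needed(target_mb_s, producer_mb_s, consumer_mb_s):
    """At least max(t/p, t/c) partitions, rounded up to whole partitions."""
    return max(
        math.ceil(target_mb_s / producer_mb_s),
        math.ceil(target_mb_s / consumer_mb_s),
    )


def replicas_per_broker(topics, partitions, replication_factor, brokers):
    """Average number of partition replicas each broker has to host."""
    return topics * partitions * replication_factor / brokers


# Hypothetical topic: 100 MB/s target, 20 MB/s per partition when producing,
# 10 MB/s per partition when consuming.
needed = partitions_needed(100, 20, 10)

# Hypothetical cluster: 50 topics, replication factor 2, 8 brokers.
with_128_default = replicas_per_broker(50, 128, 2, 8)
with_5_default = replicas_per_broker(50, 5, 2, 8)
```

With these assumed figures the topic needs 10 partitions, and a blanket default of 128 puts 1600 replicas on each broker where a default of 5 would put 62.5.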
  • 40. mistake: deploying a proof-of-concept hack in production, i.e. why we had 8 ZooKeeper nodes https://flic.kr/p/6eoSgT
  • 41. why is it a mistake: Kafka was set up by Ops for a proof of concept, not for hardened production use. By coincidence we had 8 nodes for Kafka, and the same 8 nodes for ZooKeeper. ZooKeeper depends on a majority quorum and low latency between nodes. The 8 nodes were NOT dedicated; in fact, they were already overloaded.
  • 42. what is the consequence: ZooKeeper recommends 3 nodes for normal usage, 5 for high, and any more is questionable. More nodes lead to more communication and a longer time to reach consensus. If we get a split between data centers, there will be 4 in each, so neither side has a majority. You should not run ZooKeeper across data centers, due to latency and outage possibilities.
  • 43. what is the correct solution: Have an odd number of ZooKeeper nodes - preferably 3, at most 5. Don't cross data centers. Check the documentation before deploying serious production load. Don't run a sensitive service (ZooKeeper) on a server with 50 JVM-based services, 300% overcommitted on RAM. Watch GC times.
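The advice on odd ensemble sizes follows from majority-quorum arithmetic, which is easy to check by hand. The helper names below are hypothetical; the only assumption is that a quorum is a strict majority of the configured nodes.

```python
def quorum_size(nodes):
    """Smallest strict majority of an ensemble with `nodes` members."""
    return nodes // 2 + 1


def tolerated_failures(nodes):
    """How many nodes can be lost while a majority is still reachable."""
    return nodes - quorum_size(nodes)


def side_with_quorum(side_a, side_b):
    """After a network split, return 'a', 'b' or None for no quorum anywhere."""
    needed = quorum_size(side_a + side_b)
    if side_a >= needed:
        return "a"
    if side_b >= needed:
        return "b"
    return None


# Failures tolerated for 3, 5, 7 and 8 nodes.
tolerance = {n: tolerated_failures(n) for n in (3, 5, 7, 8)}

even_split = side_with_quorum(4, 4)    # 8 nodes, 4 per data center
odd_split = side_with_quorum(3, 2)     # 5 nodes, split 3 and 2
```

An 8-node ensemble tolerates 3 failures, exactly as many as a 7-node one, so the eighth node adds coordination cost without adding resilience. A 4/4 split leaves neither side with the 5 nodes needed, whereas a 5-node ensemble split 3/2 keeps one working side.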
  • 44. what has finn.no done: [Diagram: dc 1 and dc 2; broker01 to broker05, each running zk and kafka] Version 0.8.2. 5-20 partitions. Multiple configurations.
  • 46. “They say ignorance is bliss.... they're wrong” ― Franz Kafka
  • 47. References / Further reading: Designing Data-Intensive Applications, Martin Kleppmann; Data on the Inside vs. Data on the Outside, Pat Helland; I Heart Logs, Jay Kreps; The Confluent Blog, http://confluent.io/; Kafka: The Definitive Guide; https://cwiki.apache.org/confluence/display/KAFKA/Kafka+papers+and+presentations; http://www.finn.no/apply-here; http://www.schibsted.com/en/Career/
  • 48. “It's only because of their stupidity that they're able to be so sure of themselves.” ― Franz Kafka, The Trial Audun Fauchald Strand @audunstrand Henning Spjelkavik @spjelkavik http://www.finn.no/apply-here http://www.schibsted.com/en/Career/ Q?
  • 49. Runner up: Using pre-1.0 software. Have control of topic creation. Kafka is storage - treat it like one, also ops-wise. Client-side rebalancing, misunderstood. Committing on all consumer threads, believing that you only committed on one.