WAR STORIES: DIY KAFKA
NINA HANZLIKOVA
16-10-2018
2
● Zalando is Europe’s largest
online fashion retailer.
● "Reimagine fashion for the
good of all."
● Radical Agility:
○ Autonomy
○ Mastery
○ Purpose
WHO ARE WE?
3
● Zalando Dublin is the
company’s Fashion Insights
Centre.
● We build data science
applications to help us
understand fashion.
● We use these findings to drive
our business.
WHAT DO WE DO?
● My team works to obtain and
analyze Fashion Data from the
web.
● We aim to provide this data in
near real time.
● There is quite a lot of it.
As developers we loved Apache Kafka® (and Kafka Streams) pretty much
straight away!
4
● Zalando teams have a high level of technological autonomy.
○ Teams can choose their own technology but they have to run it,
too.
● Zalando teams are usually small (in our case 3 developers) and are
highly focused on delivering customer value.
○ We really have to minimise the amount of time spent on Ops.
THERE’S A CATCH THOUGH
5
6
A BRIEF (LIES!) INTRO TO APACHE KAFKA®
Image courtesy of http://cloudurable.com/blog/kafka-architecture/index.html
7
AN EVEN BRIEFER (LIES!) INTRO TO KAFKA STREAMS
Image courtesy of https://docs.confluent.io/current/streams/architecture.html
8
KAFKA DATA STORAGE
9
We run our services on AWS. So:
● Our producer and consumer
apps can shut down uncleanly.
● So can our brokers.
● Network partitions are a thing.
● Network saturation is also a
thing.
AVOIDING DATA LOSS THE MINIMALIST WAY
We started with a basic setup:
● 3 ZooKeeper Nodes (t2.large)
● 6 Kafka Brokers (m3.xlarge) with
EBS volumes
● Kafka server.properties:
unclean.leader.election.enable=false
min.insync.replicas=2
default.replication.factor=3
● Producer config: acks=all
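As a rough sketch of what this configuration buys (the function and key names are illustrative, not part of Kafka), the arithmetic behind a replication factor of 3 with min.insync.replicas=2 and acks=all looks like this:

```python
def tolerated_failures(replication_factor: int, min_insync_replicas: int) -> dict:
    """Worst-case number of replicas of one partition that can be lost."""
    if not 1 <= min_insync_replicas <= replication_factor:
        raise ValueError("need 1 <= min.insync.replicas <= replication factor")
    return {
        # Producers using acks=all are rejected once the in-sync replica set
        # shrinks below min.insync.replicas.
        "writes_with_acks_all": replication_factor - min_insync_replicas,
        # An acknowledged write is on at least min.insync.replicas replicas,
        # so it is guaranteed to survive the loss of one fewer than that.
        "acked_data_guaranteed": min_insync_replicas - 1,
    }


if __name__ == "__main__":
    print(tolerated_failures(replication_factor=3, min_insync_replicas=2))
```

With the values from the slide, one replica of a partition can be down while acks=all writes keep succeeding, and acknowledged data is guaranteed to survive the loss of one replica.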
10
AND BASIC MONITORING
11
12
What happens if Kafka brokers can’t connect to ZooKeeper?
WHAT WENT WRONG?
● If they don’t need to access ZooKeeper they keep on working (even
for days).
● Eventually, though, the controller will need to move. Or a partition will
fall behind, becoming under-replicated or going offline. Or a topic will
be created or deleted…
13
But connectivity will fix itself, right?
WHAT WENT WRONG?
Not exactly…
The ZooKeeper client cached resolved host addresses and did not re-resolve
them (PR150 and PR451).
On AWS, with ZooKeeper behind a load balancer, the client can even end up
caching a load balancer instance rather than ZooKeeper itself.
This issue should be resolved in Kafka 2.0.0 [KAFKA-4041].
14
What about if a Kafka broker cannot properly communicate with
ZooKeeper?
WHAT WENT WRONG?
This sometimes happened when one of our brokers was temporarily partitioned
from ZooKeeper and then reconnected.
The broker would end up caching an old zkVersion. This would then prevent
ISR-modifying operations from completing.
Good news though: this bug should be resolved from 1.1.0 on!
[KAFKA-2729]
15
● In most of these problems the
simplest and fastest solution
was to restart the broker.
● In our setup, this would
terminate the old broker and its
EBS volume and bring up a new one.
● If the rest of the cluster is
healthy, the data will just
replicate onto the new broker.
FIXES, FIXES, FIXES
16
● Unfortunately, with a lot of data
in the cluster, the initial
replication can saturate the
network.
● Producers and Consumers
start timing out talking to Kafka
- effectively causing downtime.
● What happens if multiple
brokers lose connectivity? In
this case a rolling restart is not
always an option.
NOT SO FAST
Graph courtesy of Michal Michalski
17
● We stopped relying on simple broker replication for persistence and
started persisting our EBS volumes as well!
● Most of the data is now persisted during termination. When a broker
restarts, only the messages written during its downtime need to be
replicated.
● If multiple brokers report problems and a rolling restart is not
feasible, we can restart multiple brokers without losing all their data.
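To get a feel for why persisting the volumes matters, here is a back-of-the-envelope sketch. The data sizes and the replication bandwidth are hypothetical figures chosen for illustration; they are not measurements from our cluster.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def replication_seconds(bytes_to_copy: float, bandwidth_bytes_per_s: float) -> float:
    """Time to copy the given amount of data at a steady replication rate."""
    return bytes_to_copy / bandwidth_bytes_per_s


if __name__ == "__main__":
    bandwidth = 50 * MIB  # hypothetical sustained replication throughput
    # Fresh broker with an empty disk: everything has to be replicated.
    full = replication_seconds(500 * GIB, bandwidth)
    # Broker restarted with its old volume: only the downtime gap is copied.
    catch_up = replication_seconds(2 * GIB, bandwidth)
    print(f"full re-replication: {full / 3600:.1f} h, catch-up: {catch_up:.0f} s")
```

Under these assumed numbers a full re-replication keeps the network busy for hours, while catching up after a restart takes well under a minute.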
KAFKA CONFIGURATION MARK 2
18
KAFKA STREAM STORAGE
19
In a nutshell:
● By default Kafka Streams uses
RocksDB for local storage.
● This storage can be quite large,
~200 MB per partition.
● RocksDB uses up a lot of
on-heap and off-heap memory.
KAFKA STREAM STORAGE
Basic setup:
● We used EC2 instances from the
memory optimised (m4) family,
keeping about half the memory
for off-heap usage.
● Instances had an EBS volume
attached for partition
information storage.
20
● If there was a single must-monitor metric for a Kafka Streams app, it
was the consumer lag.
● We experimented with a number of lag monitors (Burrow, Kafka Lag
Monitor and Kafka Manager) but in the end started using a small utility,
built by our colleague Mark Kelly, called Remora.
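For readers new to the metric, consumer lag per partition is the gap between the latest offset in the log and the offset the consumer group has committed. A minimal sketch follows; the function name and the plain-dict inputs are illustrative, and this is not how Remora or the other tools are implemented.

```python
def consumer_lag(end_offsets: dict, committed_offsets: dict) -> dict:
    """Per-partition lag: log end offset minus committed offset."""
    lag = {}
    for partition, end in end_offsets.items():
        # Simplifying assumption: a partition with no committed offset yet
        # is treated as if the group had committed offset 0.
        committed = committed_offsets.get(partition, 0)
        lag[partition] = max(end - committed, 0)
    return lag


if __name__ == "__main__":
    end = {0: 1500, 1: 1200, 2: 900}
    committed = {0: 1500, 1: 1100}
    print(consumer_lag(end, committed))
```

A lag that keeps growing over time means the app is consuming more slowly than producers are writing.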
THINGS TO KEEP AN EYE ON
21
As the load on our system
increased we started noticing
something odd.
Our stream app would run happily
for a few hours.
Then CPU and memory would spike
up, the system would grind to a halt,
and the instance would crash.
RUNTIME PERFORMANCE MYSTERY
22
● We used EBS volumes to provide storage space for RocksDB.
● EBS volumes operate using I/O credits.
○ I/O credits are allocated based on the size of the disk.
○ As they get used up, I/O on the disk gets throttled.
○ These I/O credits eventually replenish over time.
● Under the hood our RocksDB was using up I/O credits faster than they
were replenishing.
● Increasing the size of the EBS volume also increased the number of
I/O credits.
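The effect of volume size can be sketched with a little arithmetic. The slides do not name the volume type; this sketch assumes gp2 volumes with the figures AWS documented for them at the time (baseline of 3 IOPS per GiB with a minimum of 100, bursts up to 3,000 IOPS, a bucket of 5.4 million I/O credits) and a full bucket at the start. The function names are illustrative.

```python
BUCKET_CREDITS = 5_400_000  # assumed gp2 credit bucket size
BURST_IOPS = 3_000          # assumed gp2 burst ceiling


def baseline_iops(size_gib: int) -> int:
    """Assumed gp2 baseline: 3 IOPS per GiB, never below 100."""
    return max(3 * size_gib, 100)


def seconds_until_throttled(size_gib: int, sustained_iops: int):
    """Seconds a full credit bucket lasts, or None if it never drains."""
    base = baseline_iops(size_gib)
    # The volume cannot serve more than its burst ceiling (or its baseline,
    # if that is higher), whatever the application asks for.
    demand = min(sustained_iops, max(BURST_IOPS, base))
    drain_per_second = demand - base
    if drain_per_second <= 0:
        return None
    return BUCKET_CREDITS / drain_per_second


if __name__ == "__main__":
    for size in (100, 500, 1000):
        print(size, "GiB:", seconds_until_throttled(size, sustained_iops=3000))
```

Under these assumptions a small volume driven hard runs out of credits within the hour, which is consistent with an app that runs happily for a while and then grinds to a halt, while a volume whose baseline covers the demand never gets throttled.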
REMEMBER EBS VOLUMES?
23
24
WHEN CATASTROPHE STRIKES
25
● Kafka uses ZooKeeper to store some coordination information.
● This includes storing information about other brokers and the cluster
controller.
● Perhaps most importantly, it uses ZooKeeper to store topic partition
assignment mappings.
● These mappings tell Kafka brokers what data they actually store.
● ZooKeeper is a stateful service, and needs to be managed as such.
● If brokers need to be restarted, this needs to be a rolling restart.
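The idea of a rolling restart can be sketched as a simple loop: restart one node, wait until it is healthy again, and only then move on. The `restart` and `is_healthy` callables below are hypothetical hooks you would supply yourself; they do not correspond to any Kafka or ZooKeeper API.

```python
import time
from typing import Callable, Iterable


def rolling_restart(
    nodes: Iterable[str],
    restart: Callable[[str], None],
    is_healthy: Callable[[str], bool],
    poll_seconds: float = 1.0,
    max_polls: int = 60,
) -> None:
    """Restart nodes one at a time, waiting for each to recover first."""
    for node in nodes:
        restart(node)
        for _ in range(max_polls):
            if is_healthy(node):
                break
            time.sleep(poll_seconds)
        else:
            # Stop here so that at most one node is ever down at a time.
            raise RuntimeError(f"{node} did not become healthy; aborting")
```

Aborting on the first node that fails to recover is the important design choice: the rest of the ensemble is left untouched and can keep serving.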
LET’S TALK A LITTLE ABOUT ZOOKEEPER
26
Most of the services run by developers in Zalando are stateless, with a
backing store. The vast majority of docs on upgrades reflect this.
During one such upgrade a ZooKeeper cluster holding Kafka information
had all its instances restarted at once. This caused corruption of the
partition assignment mappings. As a result brokers no longer knew what
data they contained.
WHEN ZOOKEEPER STOPPED PLAYING NICE
27
Good News:
● The ZooKeeper appliance in
question ran under Exhibitor.
This is a popular supervisor
system for ZooKeeper, which
provides some backup and
restore capabilities.
● The Kafka cluster was also
being persisted by Secor.
ABOUT THOSE BACKUPS...
28
ABOUT THOSE BACKUPS...
Bad News:
● The Exhibitor backups are
intended for rolling back bad
transactions only. For this a user
has to index transaction logs. They
are also not intended to persist
after teardown.
● Secor is really a last-resort
recovery solution, not a full
backup system. While the Secor
files were persisted, there was no
replay mechanism or procedure
for restoring from them.
29
LESSONS LEARNED
● Backups are only backups if you know how to restore them.
● Ensure that you understand what a service means when it talks about
backup and restore.
● Test that service-provided backups work correctly.
● Regularly check restoring from your stored backups.
30
BACKUP REQUIREMENTS SUMMARY
● We needed to be able to persist data on broker disks for when the
cluster has lost connectivity.
● We don’t have to worry about Kafka Stream apps, since they use
Kafka topics to persist their data and use that to build up their
RocksDB on startup.
● We needed Kafka cluster data snapshots for when bad data is written,
topics are deleted, or other user errors occur.
● We needed ZooKeeper backups for when partition mapping
information is corrupted.
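The reason Kafka Streams apps need no separate backup can be illustrated with a toy replay. This is a conceptual sketch only: a plain dict stands in for RocksDB, and a list of key/value pairs stands in for the changelog topic, where a record with a null value (a tombstone) marks a deleted key.

```python
from typing import Any, Iterable, Optional, Tuple


def rebuild_store(changelog: Iterable[Tuple[str, Optional[Any]]]) -> dict:
    """Replay changelog records in order to reconstruct the local state."""
    store = {}
    for key, value in changelog:
        if value is None:
            # Tombstone: the key was deleted.
            store.pop(key, None)
        else:
            # Later records for the same key overwrite earlier ones.
            store[key] = value
    return store


if __name__ == "__main__":
    records = [("shoes", 1), ("hats", 2), ("shoes", 3), ("hats", None)]
    print(rebuild_store(records))
```

As long as the changelog topic survives in Kafka, the local store on a fresh instance can be rebuilt from it on startup.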
31
KAFKA BACKUPS
● Much like with Secor, we wanted a convenient way to store a Kafka
data snapshot in S3.
● However we also wanted a simple way to replay this data back into
Kafka.
● This is when we came across Kafka Connect. Kafka Connect is a
convenient framework which enables transporting data between Kafka
and many other stores.
● Using the Spredfast S3 connector, data can be easily backed up to a
bucket in S3 and later replayed out onto a new topic.
● We set up a daily cron job for this backup.
32
ZOOKEEPER BACKUPS
● Ideally we also wanted a daily snapshot of our ZooKeeper.
● After searching around a little we found Burry. Burry is a small backup
and recovery tool for ZooKeeper, etcd and Consul.
● It simply copies all ZooKeeper znode data to a file and stores it in a
specified location. This can be the local filesystem, S3, Google Cloud
Storage or others.
● Similarly it can be used to replay all this data to a new ZooKeeper
cluster. It will not overwrite existing data in the cluster.
● Likewise we set up a daily backup cron job for our ZooKeeper.
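The dump-and-restore behaviour described above, including the rule that a restore does not overwrite existing data, can be illustrated with a toy model. This is not Burry's implementation: znodes are modelled as a dict of path to data, and the snapshot is a JSON string rather than a file in S3.

```python
import json


def dump_znodes(tree: dict) -> str:
    """Serialise every znode path and its data into one snapshot."""
    return json.dumps(tree, sort_keys=True)


def restore_znodes(snapshot: str, target: dict) -> list:
    """Copy snapshot entries into target, skipping paths that already exist."""
    restored = []
    for path, data in json.loads(snapshot).items():
        if path not in target:
            target[path] = data
            restored.append(path)
    return sorted(restored)


if __name__ == "__main__":
    old_cluster = {"/brokers/ids/1": "host-a", "/controller": "1"}
    snapshot = dump_znodes(old_cluster)
    new_cluster = {"/controller": "2"}  # already present, must be kept
    print(restore_znodes(snapshot, new_cluster), new_cluster)
```

Running a round trip like this against a scratch cluster on a schedule is one way to check that a stored backup can actually be restored.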
33
SOME FINAL THOUGHTS
34
SOME FINAL THOUGHTS
● There are lots of things you can monitor on a Kafka cluster, but you don’t
need to be a Kafka wizard (just yet) to effectively understand your
cluster.
● It’s not quite enough to understand how Kafka and Kafka Streams
work. To be able to diagnose and remedy many issues a deeper
understanding of underlying components (such as EBS I/O credits) is
needed.
● In many cases backups don’t have to be overly sophisticated or hard
to implement, but they always need to be replayable.
35
36
Nina Hanzlikova
@geekity2
https://github.com/geekity

More Related Content

What's hot

Infrastructure at Scale: Apache Kafka, Twitter Storm & Elastic Search (ARC303...
Infrastructure at Scale: Apache Kafka, Twitter Storm & Elastic Search (ARC303...Infrastructure at Scale: Apache Kafka, Twitter Storm & Elastic Search (ARC303...
Infrastructure at Scale: Apache Kafka, Twitter Storm & Elastic Search (ARC303...Amazon Web Services
 
Better Kafka Performance Without Changing Any Code | Simon Ritter, Azul
Better Kafka Performance Without Changing Any Code | Simon Ritter, AzulBetter Kafka Performance Without Changing Any Code | Simon Ritter, Azul
Better Kafka Performance Without Changing Any Code | Simon Ritter, AzulHostedbyConfluent
 
Welcome to Kafka; We’re Glad You’re Here (Dave Klein, Centene) Kafka Summit 2020
Welcome to Kafka; We’re Glad You’re Here (Dave Klein, Centene) Kafka Summit 2020Welcome to Kafka; We’re Glad You’re Here (Dave Klein, Centene) Kafka Summit 2020
Welcome to Kafka; We’re Glad You’re Here (Dave Klein, Centene) Kafka Summit 2020confluent
 
Cross the streams thanks to Kafka and Flink (Christophe Philemotte, Digazu) K...
Cross the streams thanks to Kafka and Flink (Christophe Philemotte, Digazu) K...Cross the streams thanks to Kafka and Flink (Christophe Philemotte, Digazu) K...
Cross the streams thanks to Kafka and Flink (Christophe Philemotte, Digazu) K...confluent
 
What's the time? ...and why? (Mattias Sax, Confluent) Kafka Summit SF 2019
What's the time? ...and why? (Mattias Sax, Confluent) Kafka Summit SF 2019What's the time? ...and why? (Mattias Sax, Confluent) Kafka Summit SF 2019
What's the time? ...and why? (Mattias Sax, Confluent) Kafka Summit SF 2019confluent
 
Kafka Summit SF 2017 - Exactly-once Stream Processing with Kafka Streams
Kafka Summit SF 2017 - Exactly-once Stream Processing with Kafka StreamsKafka Summit SF 2017 - Exactly-once Stream Processing with Kafka Streams
Kafka Summit SF 2017 - Exactly-once Stream Processing with Kafka Streamsconfluent
 
Introduction to Kafka Streams
Introduction to Kafka StreamsIntroduction to Kafka Streams
Introduction to Kafka StreamsGuozhang Wang
 
Securing Kafka At Zendesk (Joy Nag, Zendesk) Kafka Summit 2020
Securing Kafka At Zendesk (Joy Nag, Zendesk) Kafka Summit 2020Securing Kafka At Zendesk (Joy Nag, Zendesk) Kafka Summit 2020
Securing Kafka At Zendesk (Joy Nag, Zendesk) Kafka Summit 2020confluent
 
Streaming ETL - from RDBMS to Dashboard with KSQL
Streaming ETL - from RDBMS to Dashboard with KSQLStreaming ETL - from RDBMS to Dashboard with KSQL
Streaming ETL - from RDBMS to Dashboard with KSQLBjoern Rost
 
Kafka Summit SF 2017 - MultiCluster, MultiTenant and Hierarchical Kafka Messa...
Kafka Summit SF 2017 - MultiCluster, MultiTenant and Hierarchical Kafka Messa...Kafka Summit SF 2017 - MultiCluster, MultiTenant and Hierarchical Kafka Messa...
Kafka Summit SF 2017 - MultiCluster, MultiTenant and Hierarchical Kafka Messa...confluent
 
Exactly-once Data Processing with Kafka Streams - July 27, 2017
Exactly-once Data Processing with Kafka Streams - July 27, 2017Exactly-once Data Processing with Kafka Streams - July 27, 2017
Exactly-once Data Processing with Kafka Streams - July 27, 2017confluent
 
Espresso Database Replication with Kafka, Tom Quiggle
Espresso Database Replication with Kafka, Tom QuiggleEspresso Database Replication with Kafka, Tom Quiggle
Espresso Database Replication with Kafka, Tom Quiggleconfluent
 
A Tour of Apache Kafka
A Tour of Apache KafkaA Tour of Apache Kafka
A Tour of Apache Kafkaconfluent
 
Apache Kafka, and the Rise of Stream Processing
Apache Kafka, and the Rise of Stream ProcessingApache Kafka, and the Rise of Stream Processing
Apache Kafka, and the Rise of Stream ProcessingGuozhang Wang
 
KSQL: Streaming SQL for Kafka
KSQL: Streaming SQL for KafkaKSQL: Streaming SQL for Kafka
KSQL: Streaming SQL for Kafkaconfluent
 
Consistency and Completeness: Rethinking Distributed Stream Processing in Apa...
Consistency and Completeness: Rethinking Distributed Stream Processing in Apa...Consistency and Completeness: Rethinking Distributed Stream Processing in Apa...
Consistency and Completeness: Rethinking Distributed Stream Processing in Apa...Guozhang Wang
 
Ingesting Healthcare Data, Micah Whitacre
Ingesting Healthcare Data, Micah WhitacreIngesting Healthcare Data, Micah Whitacre
Ingesting Healthcare Data, Micah Whitacreconfluent
 
Top Ten Kafka® Configs
Top Ten Kafka® ConfigsTop Ten Kafka® Configs
Top Ten Kafka® Configsconfluent
 
Real-time Data Ingestion from Kafka to ClickHouse with Deterministic Re-tries...
Real-time Data Ingestion from Kafka to ClickHouse with Deterministic Re-tries...Real-time Data Ingestion from Kafka to ClickHouse with Deterministic Re-tries...
Real-time Data Ingestion from Kafka to ClickHouse with Deterministic Re-tries...HostedbyConfluent
 
Kafka on Kubernetes: Keeping It Simple (Nikki Thean, Etsy) Kafka Summit SF 2019
Kafka on Kubernetes: Keeping It Simple (Nikki Thean, Etsy) Kafka Summit SF 2019Kafka on Kubernetes: Keeping It Simple (Nikki Thean, Etsy) Kafka Summit SF 2019
Kafka on Kubernetes: Keeping It Simple (Nikki Thean, Etsy) Kafka Summit SF 2019confluent
 

What's hot (20)

Infrastructure at Scale: Apache Kafka, Twitter Storm & Elastic Search (ARC303...
Infrastructure at Scale: Apache Kafka, Twitter Storm & Elastic Search (ARC303...Infrastructure at Scale: Apache Kafka, Twitter Storm & Elastic Search (ARC303...
Infrastructure at Scale: Apache Kafka, Twitter Storm & Elastic Search (ARC303...
 
Better Kafka Performance Without Changing Any Code | Simon Ritter, Azul
Better Kafka Performance Without Changing Any Code | Simon Ritter, AzulBetter Kafka Performance Without Changing Any Code | Simon Ritter, Azul
Better Kafka Performance Without Changing Any Code | Simon Ritter, Azul
 
Welcome to Kafka; We’re Glad You’re Here (Dave Klein, Centene) Kafka Summit 2020
Welcome to Kafka; We’re Glad You’re Here (Dave Klein, Centene) Kafka Summit 2020Welcome to Kafka; We’re Glad You’re Here (Dave Klein, Centene) Kafka Summit 2020
Welcome to Kafka; We’re Glad You’re Here (Dave Klein, Centene) Kafka Summit 2020
 
Cross the streams thanks to Kafka and Flink (Christophe Philemotte, Digazu) K...
Cross the streams thanks to Kafka and Flink (Christophe Philemotte, Digazu) K...Cross the streams thanks to Kafka and Flink (Christophe Philemotte, Digazu) K...
Cross the streams thanks to Kafka and Flink (Christophe Philemotte, Digazu) K...
 
What's the time? ...and why? (Mattias Sax, Confluent) Kafka Summit SF 2019
What's the time? ...and why? (Mattias Sax, Confluent) Kafka Summit SF 2019What's the time? ...and why? (Mattias Sax, Confluent) Kafka Summit SF 2019
What's the time? ...and why? (Mattias Sax, Confluent) Kafka Summit SF 2019
 
Kafka Summit SF 2017 - Exactly-once Stream Processing with Kafka Streams
Kafka Summit SF 2017 - Exactly-once Stream Processing with Kafka StreamsKafka Summit SF 2017 - Exactly-once Stream Processing with Kafka Streams
Kafka Summit SF 2017 - Exactly-once Stream Processing with Kafka Streams
 
Introduction to Kafka Streams
Introduction to Kafka StreamsIntroduction to Kafka Streams
Introduction to Kafka Streams
 
Securing Kafka At Zendesk (Joy Nag, Zendesk) Kafka Summit 2020
Securing Kafka At Zendesk (Joy Nag, Zendesk) Kafka Summit 2020Securing Kafka At Zendesk (Joy Nag, Zendesk) Kafka Summit 2020
Securing Kafka At Zendesk (Joy Nag, Zendesk) Kafka Summit 2020
 
Streaming ETL - from RDBMS to Dashboard with KSQL
Streaming ETL - from RDBMS to Dashboard with KSQLStreaming ETL - from RDBMS to Dashboard with KSQL
Streaming ETL - from RDBMS to Dashboard with KSQL
 
Kafka Summit SF 2017 - MultiCluster, MultiTenant and Hierarchical Kafka Messa...
Kafka Summit SF 2017 - MultiCluster, MultiTenant and Hierarchical Kafka Messa...Kafka Summit SF 2017 - MultiCluster, MultiTenant and Hierarchical Kafka Messa...
Kafka Summit SF 2017 - MultiCluster, MultiTenant and Hierarchical Kafka Messa...
 
Exactly-once Data Processing with Kafka Streams - July 27, 2017
Exactly-once Data Processing with Kafka Streams - July 27, 2017Exactly-once Data Processing with Kafka Streams - July 27, 2017
Exactly-once Data Processing with Kafka Streams - July 27, 2017
 
Espresso Database Replication with Kafka, Tom Quiggle
Espresso Database Replication with Kafka, Tom QuiggleEspresso Database Replication with Kafka, Tom Quiggle
Espresso Database Replication with Kafka, Tom Quiggle
 
A Tour of Apache Kafka
A Tour of Apache KafkaA Tour of Apache Kafka
A Tour of Apache Kafka
 
Apache Kafka, and the Rise of Stream Processing
Apache Kafka, and the Rise of Stream ProcessingApache Kafka, and the Rise of Stream Processing
Apache Kafka, and the Rise of Stream Processing
 
KSQL: Streaming SQL for Kafka
KSQL: Streaming SQL for KafkaKSQL: Streaming SQL for Kafka
KSQL: Streaming SQL for Kafka
 
Consistency and Completeness: Rethinking Distributed Stream Processing in Apa...
Consistency and Completeness: Rethinking Distributed Stream Processing in Apa...Consistency and Completeness: Rethinking Distributed Stream Processing in Apa...
Consistency and Completeness: Rethinking Distributed Stream Processing in Apa...
 
Ingesting Healthcare Data, Micah Whitacre
Ingesting Healthcare Data, Micah WhitacreIngesting Healthcare Data, Micah Whitacre
Ingesting Healthcare Data, Micah Whitacre
 
Top Ten Kafka® Configs
Top Ten Kafka® ConfigsTop Ten Kafka® Configs
Top Ten Kafka® Configs
 
Real-time Data Ingestion from Kafka to ClickHouse with Deterministic Re-tries...
Real-time Data Ingestion from Kafka to ClickHouse with Deterministic Re-tries...Real-time Data Ingestion from Kafka to ClickHouse with Deterministic Re-tries...
Real-time Data Ingestion from Kafka to ClickHouse with Deterministic Re-tries...
 
Kafka on Kubernetes: Keeping It Simple (Nikki Thean, Etsy) Kafka Summit SF 2019
Kafka on Kubernetes: Keeping It Simple (Nikki Thean, Etsy) Kafka Summit SF 2019Kafka on Kubernetes: Keeping It Simple (Nikki Thean, Etsy) Kafka Summit SF 2019
Kafka on Kubernetes: Keeping It Simple (Nikki Thean, Etsy) Kafka Summit SF 2019
 

Similar to War Stories: DIY Kafka

War Stories: DIY Kafka
War Stories: DIY KafkaWar Stories: DIY Kafka
War Stories: DIY Kafkaconfluent
 
Leveraging Databricks for Spark Pipelines
Leveraging Databricks for Spark PipelinesLeveraging Databricks for Spark Pipelines
Leveraging Databricks for Spark PipelinesRose Toomey
 
Leveraging Databricks for Spark pipelines
Leveraging Databricks for Spark pipelinesLeveraging Databricks for Spark pipelines
Leveraging Databricks for Spark pipelinesRose Toomey
 
Real time processing of trade data with kafka, spark streaming and aerospike ...
Real time processing of trade data with kafka, spark streaming and aerospike ...Real time processing of trade data with kafka, spark streaming and aerospike ...
Real time processing of trade data with kafka, spark streaming and aerospike ...Mich Talebzadeh (Ph.D.)
 
Real time processing of trade data with kafka, spark streaming and aerospike ...
Real time processing of trade data with kafka, spark streaming and aerospike ...Real time processing of trade data with kafka, spark streaming and aerospike ...
Real time processing of trade data with kafka, spark streaming and aerospike ...Mich Talebzadeh (Ph.D.)
 
Backing up Wikipedia Databases
Backing up Wikipedia DatabasesBacking up Wikipedia Databases
Backing up Wikipedia DatabasesJaime Crespo
 
14th Athens Big Data Meetup - Landoop Workshop - Apache Kafka Entering The St...
14th Athens Big Data Meetup - Landoop Workshop - Apache Kafka Entering The St...14th Athens Big Data Meetup - Landoop Workshop - Apache Kafka Entering The St...
14th Athens Big Data Meetup - Landoop Workshop - Apache Kafka Entering The St...Athens Big Data
 
Idi2017 - Cloud DB: strengths and weaknesses
Idi2017 - Cloud DB: strengths and weaknessesIdi2017 - Cloud DB: strengths and weaknesses
Idi2017 - Cloud DB: strengths and weaknessesLinuxaria.com
 
SVC / Storwize analysis cost effective storage planning (use case)
SVC / Storwize analysis cost effective storage planning (use case)SVC / Storwize analysis cost effective storage planning (use case)
SVC / Storwize analysis cost effective storage planning (use case)Michael Pirker
 
An Introduction to Apache Kafka
An Introduction to Apache KafkaAn Introduction to Apache Kafka
An Introduction to Apache KafkaAmir Sedighi
 
NetflixOSS Open House Lightning talks
NetflixOSS Open House Lightning talksNetflixOSS Open House Lightning talks
NetflixOSS Open House Lightning talksRuslan Meshenberg
 
Stream, Stream, Stream: Different Streaming Methods with Spark and Kafka
Stream, Stream, Stream: Different Streaming Methods with Spark and KafkaStream, Stream, Stream: Different Streaming Methods with Spark and Kafka
Stream, Stream, Stream: Different Streaming Methods with Spark and KafkaDataWorks Summit
 
Ippevent : openshift Introduction
Ippevent : openshift IntroductionIppevent : openshift Introduction
Ippevent : openshift Introductionkanedafromparis
 
Dr. Ike Nassi, Founder, TidalScale at MLconf NYC - 4/15/16
Dr. Ike Nassi, Founder, TidalScale at MLconf NYC - 4/15/16Dr. Ike Nassi, Founder, TidalScale at MLconf NYC - 4/15/16
Dr. Ike Nassi, Founder, TidalScale at MLconf NYC - 4/15/16MLconf
 
Stream, stream, stream: Different streaming methods with Spark and Kafka
Stream, stream, stream: Different streaming methods with Spark and KafkaStream, stream, stream: Different streaming methods with Spark and Kafka
Stream, stream, stream: Different streaming methods with Spark and KafkaItai Yaffe
 
EDB Postgres with Containers
EDB Postgres with ContainersEDB Postgres with Containers
EDB Postgres with ContainersEDB
 
PGConf.ASIA 2019 Bali - Building PostgreSQL as a Service with Kubernetes - Ta...
PGConf.ASIA 2019 Bali - Building PostgreSQL as a Service with Kubernetes - Ta...PGConf.ASIA 2019 Bali - Building PostgreSQL as a Service with Kubernetes - Ta...
PGConf.ASIA 2019 Bali - Building PostgreSQL as a Service with Kubernetes - Ta...Equnix Business Solutions
 
OpenNebulaConf2018 - How Inoreader Migrated from Bare-Metal Containers to Ope...
OpenNebulaConf2018 - How Inoreader Migrated from Bare-Metal Containers to Ope...OpenNebulaConf2018 - How Inoreader Migrated from Bare-Metal Containers to Ope...
OpenNebulaConf2018 - How Inoreader Migrated from Bare-Metal Containers to Ope...OpenNebula Project
 
Stream, Stream, Stream: Different Streaming Methods with Apache Spark and Kafka
Stream, Stream, Stream: Different Streaming Methods with Apache Spark and KafkaStream, Stream, Stream: Different Streaming Methods with Apache Spark and Kafka
Stream, Stream, Stream: Different Streaming Methods with Apache Spark and KafkaDatabricks
 

Similar to War Stories: DIY Kafka (20)

War Stories: DIY Kafka
War Stories: DIY KafkaWar Stories: DIY Kafka
War Stories: DIY Kafka
 
Leveraging Databricks for Spark Pipelines
Leveraging Databricks for Spark PipelinesLeveraging Databricks for Spark Pipelines
Leveraging Databricks for Spark Pipelines
 
Leveraging Databricks for Spark pipelines
Leveraging Databricks for Spark pipelinesLeveraging Databricks for Spark pipelines
Leveraging Databricks for Spark pipelines
 
Real time processing of trade data with kafka, spark streaming and aerospike ...
Real time processing of trade data with kafka, spark streaming and aerospike ...Real time processing of trade data with kafka, spark streaming and aerospike ...
Real time processing of trade data with kafka, spark streaming and aerospike ...
 
Real time processing of trade data with kafka, spark streaming and aerospike ...
Real time processing of trade data with kafka, spark streaming and aerospike ...Real time processing of trade data with kafka, spark streaming and aerospike ...
Real time processing of trade data with kafka, spark streaming and aerospike ...
 
Backing up Wikipedia Databases
Backing up Wikipedia DatabasesBacking up Wikipedia Databases
Backing up Wikipedia Databases
 
14th Athens Big Data Meetup - Landoop Workshop - Apache Kafka Entering The St...
14th Athens Big Data Meetup - Landoop Workshop - Apache Kafka Entering The St...14th Athens Big Data Meetup - Landoop Workshop - Apache Kafka Entering The St...
14th Athens Big Data Meetup - Landoop Workshop - Apache Kafka Entering The St...
 
Idi2017 - Cloud DB: strengths and weaknesses
Idi2017 - Cloud DB: strengths and weaknessesIdi2017 - Cloud DB: strengths and weaknesses
Idi2017 - Cloud DB: strengths and weaknesses
 
SVC / Storwize analysis cost effective storage planning (use case)
SVC / Storwize analysis cost effective storage planning (use case)SVC / Storwize analysis cost effective storage planning (use case)
SVC / Storwize analysis cost effective storage planning (use case)
 
An Introduction to Apache Kafka
An Introduction to Apache KafkaAn Introduction to Apache Kafka
An Introduction to Apache Kafka
 
NetflixOSS Open House Lightning talks
NetflixOSS Open House Lightning talksNetflixOSS Open House Lightning talks
NetflixOSS Open House Lightning talks
 
Stream, Stream, Stream: Different Streaming Methods with Spark and Kafka
Stream, Stream, Stream: Different Streaming Methods with Spark and KafkaStream, Stream, Stream: Different Streaming Methods with Spark and Kafka
Stream, Stream, Stream: Different Streaming Methods with Spark and Kafka
 
Ippevent : openshift Introduction
Ippevent : openshift IntroductionIppevent : openshift Introduction
Ippevent : openshift Introduction
 
Dr. Ike Nassi, Founder, TidalScale at MLconf NYC - 4/15/16
Dr. Ike Nassi, Founder, TidalScale at MLconf NYC - 4/15/16Dr. Ike Nassi, Founder, TidalScale at MLconf NYC - 4/15/16
Dr. Ike Nassi, Founder, TidalScale at MLconf NYC - 4/15/16
 
Introduction to Kubernetes
Introduction to KubernetesIntroduction to Kubernetes
Introduction to Kubernetes
 
Stream, stream, stream: Different streaming methods with Spark and Kafka
Stream, stream, stream: Different streaming methods with Spark and KafkaStream, stream, stream: Different streaming methods with Spark and Kafka
Stream, stream, stream: Different streaming methods with Spark and Kafka
 
EDB Postgres with Containers
EDB Postgres with ContainersEDB Postgres with Containers
EDB Postgres with Containers
 
PGConf.ASIA 2019 Bali - Building PostgreSQL as a Service with Kubernetes - Ta...
PGConf.ASIA 2019 Bali - Building PostgreSQL as a Service with Kubernetes - Ta...PGConf.ASIA 2019 Bali - Building PostgreSQL as a Service with Kubernetes - Ta...
PGConf.ASIA 2019 Bali - Building PostgreSQL as a Service with Kubernetes - Ta...
 
OpenNebulaConf2018 - How Inoreader Migrated from Bare-Metal Containers to Ope...
OpenNebulaConf2018 - How Inoreader Migrated from Bare-Metal Containers to Ope...OpenNebulaConf2018 - How Inoreader Migrated from Bare-Metal Containers to Ope...
OpenNebulaConf2018 - How Inoreader Migrated from Bare-Metal Containers to Ope...
 
Stream, Stream, Stream: Different Streaming Methods with Apache Spark and Kafka
Stream, Stream, Stream: Different Streaming Methods with Apache Spark and KafkaStream, Stream, Stream: Different Streaming Methods with Apache Spark and Kafka
Stream, Stream, Stream: Different Streaming Methods with Apache Spark and Kafka
 

More from confluent

Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...confluent
 
Santander Stream Processing with Apache Flink
Santander Stream Processing with Apache FlinkSantander Stream Processing with Apache Flink
Santander Stream Processing with Apache Flinkconfluent
 
Unlocking the Power of IoT: A comprehensive approach to real-time insights
Unlocking the Power of IoT: A comprehensive approach to real-time insightsUnlocking the Power of IoT: A comprehensive approach to real-time insights
Unlocking the Power of IoT: A comprehensive approach to real-time insightsconfluent
 
Workshop híbrido: Stream Processing con Flink
Workshop híbrido: Stream Processing con FlinkWorkshop híbrido: Stream Processing con Flink
Workshop híbrido: Stream Processing con Flinkconfluent
 
Industry 4.0: Building the Unified Namespace with Confluent, HiveMQ and Spark...
Industry 4.0: Building the Unified Namespace with Confluent, HiveMQ and Spark...Industry 4.0: Building the Unified Namespace with Confluent, HiveMQ and Spark...
Industry 4.0: Building the Unified Namespace with Confluent, HiveMQ and Spark...confluent
 
AWS Immersion Day Mapfre - Confluent
AWS Immersion Day Mapfre   -   ConfluentAWS Immersion Day Mapfre   -   Confluent
AWS Immersion Day Mapfre - Confluentconfluent
 
War Stories: DIY Kafka

  • 1. WAR STORIES: DIY KAFKA NINA HANZLIKOVA 16-10-2018
  • 2. WHO ARE WE?
    ● Zalando is Europe’s largest online fashion retailer.
    ● "Reimagine fashion for the good of all."
    ● Radical Agility:
      ○ Autonomy
      ○ Mastery
      ○ Purpose
  • 3. WHAT DO WE DO?
    ● Zalando Dublin is the company’s Fashion Insights Centre.
    ● We build data science applications to help us understand fashion.
    ● We use these findings to drive our business.
    ● My team works to obtain and analyze Fashion Data from the web.
    ● We aim to provide this data in near real time.
    ● There is quite a lot of it.
    As developers we loved Apache Kafka® (and Kafka Streams) pretty much straight away!
  • 4. THERE’S A CATCH THOUGH
    ● Zalando teams have a high level of technological autonomy.
      ○ Teams can choose their own technology but they have to run it, too.
    ● Zalando teams are usually small (in our case 3 developers) and are highly focused on delivering customer value.
      ○ We really have to minimise the amount of time spent on Ops.
  • 6. A BRIEF (LIES!) INTRO TO APACHE KAFKA®
    Image courtesy of http://cloudurable.com/blog/kafka-architecture/index.html
  • 7. AN EVEN BRIEFER (LIES!) INTRO TO KAFKA STREAMS
    Image courtesy of https://docs.confluent.io/current/streams/architecture.html
  • 9. AVOIDING DATA LOSS THE MINIMALIST WAY
    We run our services on AWS. So:
    ● Our producer and consumer apps can shut down uncleanly.
    ● So can our brokers.
    ● Network partitions are a thing.
    ● Network saturation is also a thing.
    We started with a basic setup:
    ● 3 ZooKeeper nodes (t2.large)
    ● 6 Kafka brokers (m3.xlarge) with EBS volumes
    ● Kafka server.properties:
        unclean.leader.election.enable=false
        min.insync.replicas=2
        default.replication.factor=3
    ● Producer config: acks=all
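To see how these settings interact, here is a minimal sketch that works out what a single partition can absorb when producers use acks=all. It is only back-of-the-envelope arithmetic under the usual reading of these settings. The function name and the keys of the returned dictionary are made up for this illustration and are not part of Kafka.

```python
def tolerated_broker_failures(replication_factor: int, min_insync_replicas: int) -> dict:
    """Summarise what one partition survives with acks=all producers.

    Hypothetical helper for illustration only; it does not talk to Kafka.
    """
    if not 1 <= min_insync_replicas <= replication_factor:
        raise ValueError("min.insync.replicas must be between 1 and the replication factor")
    return {
        # Replicas that may drop out of the ISR while acks=all writes are still accepted.
        "writes_survive": replication_factor - min_insync_replicas,
        # An acknowledged write sits on at least min.insync.replicas brokers,
        # so it survives the loss of all but one of those copies.
        "acked_data_survives": min_insync_replicas - 1,
    }


# The setup from the slide: default.replication.factor=3, min.insync.replicas=2.
summary = tolerated_broker_failures(3, 2)
```

With the values above, one replica can be lost before producers start seeing errors, and an acknowledged message is still on at least one broker after a single broker failure.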
  • 12. WHAT WENT WRONG?
    What happens if Kafka brokers can’t connect to ZooKeeper?
    ● If they don’t need to access ZooKeeper they keep on working (even for days).
    ● Eventually though they will need to move the controller. Or a partition will fall behind, becoming under-replicated or going offline. Or a topic will be created or deleted…
  • 13. WHAT WENT WRONG?
    But connectivity will fix itself, right? Not exactly…
    The ZooKeeper client cached resolved hosts and did not re-resolve them (PR150 and PR451). On AWS, with ZooKeeper behind a load balancer, the client can even cache the address of a load balancer instance rather than ZooKeeper itself.
    This issue should be resolved in Kafka 2.0.0 [KAFKA-4041].
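The difference between caching a resolved address and re-resolving it can be sketched in a few lines. This is not the ZooKeeper client's code. The class name, the injected resolver, and the example hostname are all hypothetical and exist only to show the idea of looking the host up again on every reconnect.

```python
import socket
from typing import Callable, List


def resolve_all(host: str, port: int) -> List[str]:
    """Return every IP address currently published for host (no caching)."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})


class ReResolvingEndpoint:
    """Keeps the hostname, not the IP, so each reconnect sees fresh DNS answers."""

    def __init__(self, host: str, port: int,
                 resolver: Callable[[str, int], List[str]] = resolve_all) -> None:
        self.host = host
        self.port = port
        self._resolver = resolver

    def addresses_for_reconnect(self) -> List[str]:
        # Resolve again on every call instead of reusing an earlier answer.
        return self._resolver(self.host, self.port)
```

A client built this way picks up a replaced load balancer instance on its next reconnect, whereas one that stored the first answer would keep dialling an address that no longer exists.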
  • 14. WHAT WENT WRONG?
    What if a Kafka broker cannot properly communicate with ZooKeeper?
    This happened sometimes when our broker was temporarily partitioned from ZooKeeper and then reconnected. The broker would end up caching an old zkVersion, which would then prevent ISR-modifying operations from completing.
    Good news though, this bug should be resolved from 1.1.0 on! [KAFKA-2729]
  • 15. FIXES, FIXES, FIXES
    ● For most of these problems the simplest and fastest solution was to restart the broker.
    ● In our setup, this would terminate the old broker and its EBS volume and bring up a new one.
    ● If the rest of the cluster is healthy, the data will just replicate onto the new broker.
  • 16. NOT SO FAST
    ● Unfortunately, with a lot of data in the cluster, the initial replication can saturate the network.
    ● Producers and consumers start timing out talking to Kafka, effectively causing downtime.
    ● What happens if multiple brokers lose connectivity? In this case a rolling restart is not always an option.
    Graph courtesy of Michal Michalski
  • 17. KAFKA CONFIGURATION MARK 2
    ● We stopped relying on simple broker replication for persistence and started persisting our EBS volumes as well!
    ● Most of the data is now persisted during termination. When a broker restarts, only the messages written during its downtime need to be replicated.
    ● If multiple brokers report problems and a rolling restart is not feasible, we can restart multiple brokers without losing all their data.
  • 19. KAFKA STREAM STORAGE
    In a nutshell:
    ● By default Kafka Streams uses RocksDB for local storage.
    ● This storage can be quite large, ~200 MB per partition.
    ● RocksDB uses up a lot of on-heap and off-heap memory.
    Basic setup:
    ● We used the memory optimised EC2 instance family (m4), keeping about half the memory for off-heap usage.
    ● Instances had an EBS volume attached for partition information storage.
  • 20. THINGS TO KEEP AN EYE ON
    ● If there was a single must-monitor metric for a Kafka Streams app, it was the consumer lag.
    ● We experimented with a number of lag monitors (Burrow, Kafka Lag Monitor and Kafka Manager) but in the end started using a small utility, built by our colleague Mark Kelly, called Remora.
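For readers new to the metric, consumer lag per partition is the gap between the latest offset in the log and the offset the consumer group has committed. The sketch below shows only that arithmetic. It is not how Remora or any of the monitors above are implemented, and the function name and the plain dictionaries standing in for broker responses are assumptions made for the example.

```python
from typing import Dict


def consumer_lag(end_offsets: Dict[int, int], committed: Dict[int, int]) -> Dict[int, int]:
    """Per-partition lag: log end offset minus committed offset.

    Both arguments map partition number to offset. A partition with no
    committed offset is treated as unread from offset 0 (an assumption
    made for this sketch).
    """
    lag = {}
    for partition, end in end_offsets.items():
        # Clamp at zero in case the two readings were taken at different moments.
        lag[partition] = max(end - committed.get(partition, 0), 0)
    return lag


# Hypothetical readings for a two-partition topic.
lag_by_partition = consumer_lag({0: 100, 1: 50}, {0: 90})
total_lag = sum(lag_by_partition.values())
```

A lag that keeps growing means the app is consuming more slowly than producers are writing, which is usually the first visible symptom of trouble downstream.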
  • 21. RUNTIME PERFORMANCE MYSTERY
    As the load on our system increased we started noticing something odd. Our stream app would run happily for a few hours. Then CPU and memory would spike, the system would grind to a halt, and the instance would crash.
  • 22. REMEMBER EBS VOLUMES?
    ● We used EBS volumes to provide storage space for RocksDB.
    ● EBS volumes operate using I/O credits.
      ○ I/O credits are allocated based on the size of the disk.
      ○ As they get used up, I/O on the disk gets throttled.
      ○ These I/O credits replenish over time.
    ● Under the hood our RocksDB was using up I/O credits faster than they were replenishing.
    ● Increasing the size of the EBS volume also increased the number of I/O credits.
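A rough model makes the "runs happily for a few hours, then falls over" pattern easier to picture. The slide does not name the volume type, so the sketch below assumes General Purpose SSD (gp2) volumes and uses the figures AWS publishes for them: a baseline of 3 IOPS per GiB with a floor of 100 IOPS, bursts of up to 3,000 IOPS, and a bucket of 5.4 million I/O credits. The function names are hypothetical.

```python
from typing import Optional

# Assumed gp2 figures; other volume types behave differently.
GP2_BUCKET_CREDITS = 5_400_000
GP2_BURST_IOPS = 3_000


def gp2_baseline_iops(size_gib: int) -> int:
    """Baseline rate, which is also the rate at which credits refill."""
    return max(100, 3 * size_gib)


def seconds_until_throttled(size_gib: int, demand_iops: int) -> Optional[float]:
    """Time for a full credit bucket to drain under a steady demand.

    Returns None when the demand never drains the bucket.
    """
    baseline = gp2_baseline_iops(size_gib)
    if baseline >= GP2_BURST_IOPS or demand_iops <= baseline:
        return None
    # The volume cannot serve more than the burst rate, so cap the demand.
    drain_rate = min(demand_iops, GP2_BURST_IOPS) - baseline
    return GP2_BUCKET_CREDITS / drain_rate


small = seconds_until_throttled(100, 3_000)   # 2000.0 seconds
larger = seconds_until_throttled(500, 3_000)  # 3600.0 seconds
```

Under these assumptions a larger volume both refills faster and takes longer to drain, which is consistent with the fix described on the slide.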
  • 25. LET’S TALK A LITTLE ABOUT ZOOKEEPER
    ● Kafka uses ZooKeeper to store some coordination information.
    ● This includes storing information about other brokers and the cluster controller.
    ● Perhaps most importantly, it uses ZooKeeper to store topic partition assignment mappings.
    ● These mappings tell Kafka brokers what data they actually store.
    ● ZooKeeper is a stateful service, and needs to be managed as such.
    ● If brokers need to be restarted, this needs to be a rolling restart.
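The shape of a rolling restart for a stateful service can be sketched as a loop that touches one node at a time and refuses to move on until that node is healthy again. This is a generic illustration, not the tooling used at Zalando. The `restart` and `is_healthy` callables are placeholders for whatever a deployment actually uses, and the timeout values are arbitrary.

```python
import time
from typing import Callable, Iterable, List


def rolling_restart(nodes: Iterable[str],
                    restart: Callable[[str], None],
                    is_healthy: Callable[[str], bool],
                    timeout_s: float = 300.0,
                    poll_s: float = 5.0,
                    sleep: Callable[[float], None] = time.sleep,
                    clock: Callable[[], float] = time.monotonic) -> List[str]:
    """Restart nodes one by one, waiting for each to recover before the next."""
    restarted = []
    for node in nodes:
        restart(node)
        deadline = clock() + timeout_s
        while not is_healthy(node):
            if clock() >= deadline:
                # Stop the rollout so the remaining nodes keep serving.
                raise TimeoutError(f"{node} did not become healthy; stopping the rollout")
            sleep(poll_s)
        restarted.append(node)
    return restarted
```

The important property is the early exit: if one node fails to come back, the remaining nodes are left untouched and still hold the state.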
  • 26. WHEN ZOOKEEPER STOPPED PLAYING NICE
    Most of the services run by developers in Zalando are stateless, with a backing store. The vast majority of docs on upgrades reflect this. During one such upgrade a ZooKeeper cluster holding Kafka information had all its instances restarted at once. This corrupted the partition assignment mappings. As a result brokers no longer knew what data they contained.
  • 27. ABOUT THOSE BACKUPS...
    Good News:
    ● The ZooKeeper appliance in question ran under Exhibitor. This is a popular supervisor system for ZooKeeper, which provides some backup and restore capabilities.
    ● The Kafka cluster was also being persisted by Secor.
  • 28. ABOUT THOSE BACKUPS...
    Bad News:
    ● The Exhibitor backups are intended for rolling back bad transactions only. For this a user has to index transaction logs. They are also not intended for persisting data after teardown.
    ● Secor is really a last-resort recovery solution, not a full backup system. While the Secor files were persisted, there was no replay mechanism or procedure for restoring from them.
  • 29. LESSONS LEARNED
    ● Backups are only backups if you know how to restore them.
    ● Ensure that you understand what a service means when it talks about backup and restore.
    ● Test that service-provided backups work correctly.
    ● Regularly check restoring from your stored backups.
  • 30. BACKUP REQUIREMENTS SUMMARY
    ● We needed to be able to persist data on broker disks for when the cluster has lost connectivity.
    ● We don’t have to worry about Kafka Stream apps, since they use Kafka topics to persist their data and use that to build up their RocksDB on startup.
    ● We needed Kafka cluster data snapshots for when bad data is written, topics are deleted, and other user errors.
    ● We needed ZooKeeper backups for when partition mapping information is corrupted.
  • 31. KAFKA BACKUPS
    ● Much like with Secor, we wanted a convenient way to store a Kafka data snapshot in S3.
    ● However, we also wanted a simple way to replay this data back into Kafka.
    ● This is when we came across Kafka Connect. Kafka Connect is a convenient framework which enables transporting data between Kafka and many other stores.
    ● Using the Spredfast S3 connector, data can be easily backed up to a bucket in S3 and later replayed out onto a new topic.
    ● We set up a daily cron job for this backup.
  • 32. ZOOKEEPER BACKUPS
    ● Ideally we also wanted a daily snapshot of our ZooKeeper.
    ● After searching around a little we found Burry. Burry is a small backup and recovery tool for ZooKeeper, etcd and Consul.
    ● It simply copies all ZooKeeper znode data to a file and stores it in a specified location. This can be the local filesystem, S3, Google Cloud Storage or others.
    ● Similarly it can be used to replay all this data to a new ZooKeeper cluster. It will not overwrite existing data in the cluster.
    ● Likewise we set up a daily backup cron job for our ZooKeeper.
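The snapshot-and-replay behaviour described above, including the "do not overwrite existing data" rule, can be illustrated with a toy model. This is not Burry's code and does not use a ZooKeeper client. A plain dictionary of path to bytes stands in for the znode tree, and both function names are invented for the example.

```python
import base64
import json
from typing import Dict, List


def dump_znodes(znodes: Dict[str, bytes]) -> str:
    """Serialise a path -> data mapping into a JSON snapshot string."""
    encoded = {path: base64.b64encode(data).decode("ascii")
               for path, data in sorted(znodes.items())}
    return json.dumps(encoded)


def restore_znodes(snapshot: str, target: Dict[str, bytes]) -> List[str]:
    """Replay a snapshot into target, skipping paths that already exist."""
    restored = []
    for path, encoded in json.loads(snapshot).items():
        if path in target:
            # Existing data wins; the snapshot never overwrites it.
            continue
        target[path] = base64.b64decode(encoded)
        restored.append(path)
    return restored


# Hypothetical znodes and a partly populated new cluster.
snapshot = dump_znodes({"/brokers/ids/1": b"broker-1", "/config": b"old"})
new_cluster = {"/config": b"new"}
restored_paths = restore_znodes(snapshot, new_cluster)
```

Running a round trip like this against a scratch cluster on a schedule is one cheap way to check that a stored backup can actually be restored.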
  • 34. SOME FINAL THOUGHTS
    ● There are lots of things you can monitor on your Kafka cluster, but you don’t need to be a Kafka wizard (just yet) to understand it effectively.
    ● It’s not quite enough to understand how Kafka and Kafka Streams work. To be able to diagnose and remedy many issues, a deeper understanding of underlying components (such as EBS I/O credits) is needed.
    ● In many cases backups don’t have to be overly sophisticated or hard to implement, but they always need to be replayable.