The data team at Cloudflare uses Kafka to process tens of petabytes a day. All this data is moved using the two foundational Kafka API calls: Produce (API key 0) and Fetch (API key 1). Understanding the structure of these calls (and of the underlying RecordSet structure) is key to building high-throughput clients.
The talk describes the basics of the Kafka wire protocol (API keys, correlation IDs) and the structure of the Produce and Fetch calls. It shows how the asynchronous nature of the wire protocol can combine with the structure of the Produce and Fetch calls to increase latency and reduce client throughput; a solution is offered through the use of synchronous single-partition calls.
The RecordSet structure, which is used to encode and store sets (batches) of records, is described, and its implications for Fetch requests are discussed. The relationship between Fetch API calls and "consume" operations is discussed, as is the impact of offset alignment to RecordSet boundaries.
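As a rough illustration of the wire-protocol basics mentioned above, the sketch below encodes a Kafka request header in Python. It assumes the non-flexible (v1) request header layout from the public Kafka protocol guide: a 4-byte big-endian size prefix, then a 16-bit API key, a 16-bit API version, a 32-bit correlation ID, and a length-prefixed client ID. The function names, the API version, and the client ID are illustrative choices and are not taken from the talk.

```python
import struct

# API keys named in the abstract above.
API_KEY_PRODUCE = 0
API_KEY_FETCH = 1


def encode_request_header(api_key, api_version, correlation_id, client_id):
    """Encode a request header, assuming the non-flexible (v1) layout:
    int16 api_key, int16 api_version, int32 correlation_id,
    then client_id as a nullable string (int16 length, -1 for null)."""
    if client_id is None:
        client_bytes = struct.pack(">h", -1)
    else:
        raw = client_id.encode("utf-8")
        client_bytes = struct.pack(">h", len(raw)) + raw
    # ">" selects big-endian (network byte order) with no padding.
    return struct.pack(">hhi", api_key, api_version, correlation_id) + client_bytes


def frame(payload):
    """Prefix a request with its int32 size, as sent on the TCP stream."""
    return struct.pack(">i", len(payload)) + payload


# Hypothetical example values: a Fetch header with correlation ID 7.
header = encode_request_header(API_KEY_FETCH, 4, 7, "demo")
message = frame(header)
```

The broker echoes the correlation ID in its response, which is what allows a client to keep several requests in flight on one connection and still match each response to its request.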
Using Kafka as a Database For Real-Time Transaction Processing | Chad Preisle... | HostedbyConfluent
You have learned about Kafka event sourcing with streams and using Kafka as a database, but you may be having a tough time wrapping your head around what that means and what challenges you will face. Kafka’s exactly once semantics, data retention rules, and stream DSL make it a great database for real-time transaction processing. This talk will focus on how to use Kafka events as a database. We will talk about using KTables vs GlobalKTables, and how to apply them to patterns we use with traditional databases. We will go over a real-world example of joining events against existing data and some issues to be aware of. We will finish covering some important things to remember about state stores, partitions, and streams to help you avoid problems when your data sets become large.
A Look into the Mirror: Patterns and Best Practices for MirrorMaker2 | Cliff ... | HostedbyConfluent
From migrations between Apache Kafka clusters to multi-region deployments across datacenters, the introduction of MirrorMaker2 has expanded the possibilities for Apache Kafka deployments and use cases. In this session you will learn about patterns, best practices, and learnings compiled from running MirrorMaker2 in production at every scale.
Delivering: from Kafka to WebSockets | Adam Warski, SoftwareMill | HostedbyConfluent
Here's the challenge: we've got a Kafka topic, where services publish messages to be delivered to browser-based clients through web sockets.
Sounds simple? It might, but we're faced with an increasing number of messages, as well as a growing count of web socket clients. How do we scale our solution? As our system contains a larger number of servers, failures become more frequent. How to ensure fault tolerance?
There are a couple of possible architectures. Each web socket node might consume all messages. Otherwise, we need an intermediary, which redistributes the messages to the proper web socket nodes.
Here, we might either use a Kafka topic, or a streaming forwarding service. However, we still need a feedback loop so that the intermediary knows where to distribute messages.
We’ll take a look at the strengths and weaknesses of each solution, as well as limitations created by the chosen technologies (Kafka and web sockets).
Guaranteed Event Delivery with Kafka and NodeJS | Amitesh Madhur, Nutanix | HostedbyConfluent
The business systems of an organization are a continuous source of events. Each system also needs to know about events happening in the other systems. Exchanging these events through direct API calls creates a web of inter-dependencies, is fragile and fails to scale. We examine how this problem can be solved through the use of the right integration patterns, implemented as a lightweight event hub that leverages the power of Kafka and Confluent to operate at enterprise scale. We demonstrate how JavaScript, with its event-driven programming model, can be a good fit for implementing an event hub that ensures guaranteed message delivery in the face of failures within the individual subscriber systems.
Many organizations have large engineering teams skilled in NodeJS and a multitude of NodeJS applications. We show how these teams can easily leverage the power of Kafka and scale their applications with the right architectural building blocks. We also offer insights from our own experience of building NodeJS-based Kafka applications.
Kafka error handling patterns and best practices | Hemant Desale and Aruna Ka... | HostedbyConfluent
Transaction Banking from Goldman Sachs is a high-volume, latency-sensitive digital banking platform offering. We have chosen an event-driven architecture to build highly decoupled and independent microservices in a cloud-native manner, designed to meet the objectives of Security, Availability, Latency and Scalability. Kafka was a natural choice – to decouple producers and consumers and to scale easily for high-volume processing. However, there are certain aspects that require careful consideration – handling errors and partial failures, managing downtime of consumers, and secure communication between brokers and producers / consumers. In this session, we will present the patterns and best practices that helped us build robust event-driven applications. We will also present our solution approach that has been reused across multiple application domains. We hope that by sharing our experience, we can establish a reference implementation that application developers can benefit from.
Everything you ever needed to know about Kafka on Kubernetes but were afraid ... | HostedbyConfluent
Kubernetes has become the de facto standard for running cloud-native applications, and many users also turn to it to run stateful applications such as Apache Kafka. You can use different tools to deploy Kafka on Kubernetes: write your own YAML files, use Helm Charts, or go for one of the available operators. But there is one thing all of these have in common: you still need very good knowledge of Kubernetes to make sure your Kafka cluster works properly in all situations. This talk will cover different Kubernetes features such as resources, affinity, tolerations, pod disruption budgets, topology spread constraints and more, and it will explain why they are important for Apache Kafka and how to use them. If you are interested in running Kafka on Kubernetes and do not know all of these, this is a talk for you.
Introducing Confluent labs Parallel Consumer client | Anthony Stubbes, Confluent | HostedbyConfluent
Consuming messages in parallel is what Apache Kafka® is all about, so you may well wonder, why would we want anything else? It turns out that, in practice, there are a number of situations where Kafka’s partition-level parallelism gets in the way of optimal design.
This session will go over some of these types of situations that can benefit from parallel message processing within a single application instance (aka slow consumers or competing consumers), and then introduce the new Parallel Consumer labs project from Confluent, which can improve functionality and massively improve performance in such situations.
It will cover:
- Different ordering modes of the client
- Relative performance improvements
- Usage with other components like Kafka Streams
- An introduction to the internal architecture of the project
- How it can achieve all this in a reassignment friendly manner
Watch this talk here: https://www.confluent.io/online-talks/how-apache-kafka-works-on-demand
Pick up best practices for developing applications that use Apache Kafka, beginning with a high level code overview for a basic producer and consumer. From there we’ll cover strategies for building powerful stream processing applications, including high availability through replication, data retention policies, producer design and producer guarantees.
We’ll delve into the details of delivery guarantees, including exactly-once semantics, partition strategies and consumer group rebalances. The talk will finish with a discussion of compacted topics, troubleshooting strategies and a security overview.
This session is part 3 of 4 in our Fundamentals for Apache Kafka series.
How Kafka and MemSQL Became the Dynamic Duo (Sarung Tripathi, MemSQL) Kafka S... | HostedbyConfluent
Kafka and MemSQL are the perfect combination of speed, scale, and power to take on the world’s most complex operational analytics challenges. In this session, you will learn how Kafka and MemSQL have become the dynamic duo, and how you can use them together to achieve ingest of tens of millions of records per second and enable highly concurrent, real-time analytics. In the last few months, Kafka and MemSQL have been hard at work, devising a plan to take on the world’s next set of streaming data challenges. So stay tuned: there may just be an announcement!
Keeping Analytics Data Fresh in a Streaming Architecture | John Neal, Qlik | HostedbyConfluent
Qlik is an industry leader across its solution stack, both on the Data Integration side of things with Qlik Replicate (real-time CDC) and Qlik Compose (data warehouse and data lake automation), and on the Analytics side with Qlik Sense. These two “sides” of Qlik are coming together more frequently these days as the need for “always fresh” data increases across organizations.
When real-time streaming applications are the topic du jour, those companies are looking to Apache Kafka to provide the architectural backbone those applications require. Those same companies turn to Qlik Replicate to put the data from their enterprise database systems into motion at scale, whether that data resides in “legacy” mainframe databases; traditional relational databases such as Oracle, MySQL, or SQL Server; or applications such as SAP and SalesForce.
In this session we will look in depth at how Qlik Replicate can be used to continuously stream changes from a source database into Apache Kafka. From there, we will explore how a purpose-built consumer can be used to provide the bridge between Apache Kafka and an analytics application such as Qlik Sense.
Taming a massive fleet of Python-based Kafka apps at Robinhood | Chandra Kuch... | HostedbyConfluent
Robinhood uses Kafka in every line of its business, from stock and crypto trading to clearing and data analytics. One interesting aspect of our architecture is that many of our microservices leveraging Kafka are written in Python. When you combine Python's relatively slow performance, its reliance on process-based parallelism and Robinhood’s scale, the result is a massive fleet of application processes producing to and consuming from our Kafka clusters. This fleet generates an atypical workload on Kafka that warrants a deeper investment in scalability and reliability.
This talk discusses our investments in Kafka infrastructure for a large-scale Python-based environment:
kafkahood: our librdkafka-based client library wrapper that codifies best practices, sane defaults and deep client-side observability.
kafkaproxy: a Rust-based sidecar proxy that reduces connection fan-in from Python gunicorn worker pools to our Kafka clusters.
We'll also present challenges we encountered along the way and share our learnings with the audience.
Utilizing Kafka Connect to Integrate Classic Monoliths into Modern Microservi... | HostedbyConfluent
Having started with classic monolith applications in the late 90s and adopting a new microservice architecture in 2015, our organization needed a convenient, reliable, and low-cost way to push changes back and forth between them. One that preferably utilized technology already on hand and could exchange information between multiple data stores.
In this session we will explore how Kafka Connect and its various connectors satisfied this need. We will review the two disparate tech stacks we needed to integrate, and the strategies and connectors we used to exchange information. Finally, we will cover some enhancements we made to our own processes including integrating Kafka Connect and its connectors into our CI/CD pipeline and writing tools to monitor connectors in our production environment.
You have built an event-driven system leveraging Apache Kafka. Now you face the challenge of integrating traditional synchronous request-response capabilities, such as user interaction, through an HTTP web service.
There are various techniques, each with advantages and disadvantages. This talk discusses multiple options on how to do a request-response over Kafka — showcasing producers and consumers using single and multiple topics, and more advanced considerations using the interactive queries of ksqlDB and Kafka Streams.
Advanced considerations discussed:
What a consumer rebalance means to your active request-responses.
Discuss options for blocking for the async response in the web-service.
How can the CQRS (Command Query Responsibility Segregation) be leveraged with the interactive state stores of Kafka Streams and ksqlDB?
Interactive queries of the ksqlDB and Kafka Streams state stores are not available during a rebalance. What is the active Kafka development happening that will make interactive queries a more feasible option?
Would a custom state store help with rebalancing limitations?
Can custom partitioning be used for proper routing, and what impacts could that have to the other services in your ecosystem?
We will explore the above considerations with an interactive quiz application built using Apache Kafka, Kafka Streams, and ksqlDB. With a proper implementation in place, your request-response application can scale and be performant along with handling all of the requests.
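One of the options listed above, blocking in the web service until the asynchronous response arrives, can be sketched without any Kafka client at all. The Python sketch below assumes each request is tagged with a correlation ID that the responding service copies into its reply; the class name `PendingRequests` and its methods are hypothetical, and the producer and consumer calls are left as comments rather than invented.

```python
import threading
import uuid
from concurrent.futures import Future


class PendingRequests:
    """Hypothetical registry that matches replies to waiting HTTP handlers."""

    def __init__(self):
        self._lock = threading.Lock()
        self._pending = {}

    def register(self):
        """Create a correlation ID and a future the HTTP handler can block on."""
        correlation_id = str(uuid.uuid4())
        future = Future()
        with self._lock:
            self._pending[correlation_id] = future
        return correlation_id, future

    def complete(self, correlation_id, response):
        """Called from the consumer loop when a reply record arrives.
        Returns False when no handler is waiting, for example because the
        reply landed on a different instance or the request timed out."""
        with self._lock:
            future = self._pending.pop(correlation_id, None)
        if future is None:
            return False
        future.set_result(response)
        return True


registry = PendingRequests()
correlation_id, future = registry.register()
# ... here the handler would produce the request record carrying correlation_id ...
# ... and the consumer loop would later see the reply record and call:
delivered = registry.complete(correlation_id, {"answer": 42})
result = future.result(timeout=1)  # the HTTP handler blocks here
```

The `False` return path is where the rebalance and routing questions above become concrete: a reply is only useful if it reaches the instance that holds the matching future.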
How did we move the mountain? - Migrating 1 trillion+ messages per day across... | HostedbyConfluent
Have you ever migrated Kafka clusters from one data center to another being completely transparent to client applications?
At PayPal, as part of a massive datacenter migration initiative, the Kafka team successfully moved all PayPal Kafka traffic across data centers. This initiative involved migrating 20+ Kafka clusters (1000+ broker and zookeeper nodes), as well as 60+ mirrormaker groups which seamlessly handle Kafka traffic volumes as high as 1 trillion messages per day. Throughout the course of this migration, applications required no modification and encountered no service outage, no message loss and no duplicated messages. The whole migration process was fully transparent to Kafka applications.
In this session, you will learn the strategies, techniques and tools the PayPal Kafka team has utilized for managing the migration process. You will also learn the lessons and pitfalls they experienced during this exercise, as well as the secret sauce of making the migration successful.
Improving Logging Ingestion Quality At Pinterest: Fighting Data Corruption An... | HostedbyConfluent
Logging ingestion infrastructure at Pinterest is built around Apache Kafka to support thousands of pipelines, with over 1 trillion (1PB) new messages generated by hundreds of services (written in 5 different languages) and transported to a data lake (AWS S3) every day. In the past, we have focused on scalability and auto operation of the infrastructure to help internal teams quickly onboard new pipelines (Kafka Summit 2018, 2020). However, we constantly observed data loss and data corruption due to the design decisions we made to favor scalability and availability over durability and consistency.
To tackle these problems, we designed and implemented a logging auditing framework which consists of (1) an audit client library integrated into every component of the infrastructure to detect data corruption for every message and send out audit events for randomly picked messages, (2) Kafka clusters receiving audit events, and (3) real-time and batch applications processing audit events to generate insights for alerting and reporting.
A focus on zero negative impact to existing ingestion pipelines, scalability and cost efficiency led us to make various design decisions. These eventually let us roll out auditing to every pipeline with zero downtime and fundamentally improve data ingestion quality at Pinterest by tracking data loss and removing data corruption, which in the past could block downstream applications for hours and often led to severe incidents.
Supercharge Your Real-time Event Processing with Neo4j's Streams Kafka Connec... | HostedbyConfluent
Do your event streams use connected-data domains such as fraud detection, live logistics routing, or predicting network outages? How can you maintain the analysis and leverage those connections in real time?
Graph databases differ from traditional, tabular ones in that they treat connections between data as first class citizens. This means they are optimized for detecting and understanding these relationships – providing insight at speed and at scale.
By combining event streams from Kafka along with the power of the Neo4j graph database for interrogating and investigating connections, you make real-time, event-driven intelligent insight a reality.
Neo4j Streams integrates Neo4j with Apache Kafka event streams, serving either as a source of data (for instance, Change Data Capture) or as a sink that ingests any kind of Kafka event into your graph. In this session we’ll show you how to get up and running with Neo4j Streams and how to sink and source between graphs and streams.
Building Retry Architectures in Kafka with Compacted Topics | Matthew Zhou, V... | HostedbyConfluent
In this talk, we'll discuss how VillageMD is able to use Kafka topic compaction for rapidly scaling our reprocessing pipelines to encompass hundreds of feeds. Within healthcare data ecosystems, privacy and data minimalism are key design priorities. Being able to handle data deletion in a reliable, timely manner within event-driven architectures is becoming more and more necessary with key governance frameworks like the GDPR and HIPAA.
We'll be giving an overview of the building and governance of dead-letter queues for streaming data processing.
We'll discuss:
1. How to architect a data sink for failed records.
2. How topic compaction can reduce duplicate data and enable idempotency.
3. Building a tombstoning system for removing successfully reprocessed records from the queues.
4. Considerations for monitoring a reprocessing system in production -- what metrics, dataops, and SLAs are useful?
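To make points 2 and 3 above concrete, here is a small in-memory model of how a compacted topic behaves. It is a simplification written for illustration, not VillageMD's implementation and not a Kafka API: the `compact` function, the record keys, and the failure messages are all hypothetical. It keeps only the latest value per key and treats a `None` value as a tombstone.

```python
def compact(log):
    """Model log compaction: keep the latest value for each key and
    drop keys whose latest value is a tombstone (None)."""
    latest = {}
    for key, value in log:
        latest[key] = value  # later records overwrite earlier ones
    return {key: value for key, value in latest.items() if value is not None}


# Hypothetical dead-letter records keyed by the ID of the failed record.
retry_log = [
    ("order-1", "failed: timeout"),
    ("order-2", "failed: bad schema"),
    ("order-1", "failed: timeout"),  # duplicate failure for the same key
    ("order-1", None),               # tombstone written after a successful retry
]

pending = compact(retry_log)
```

After compaction only `order-2` is still pending: the duplicate for `order-1` collapsed into one entry and the tombstone then removed it. On a real broker, compaction runs in the background and tombstones are kept for a configurable period (the `delete.retention.ms` topic setting) before they are removed, so consumers must still tolerate seeing older records.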
A stream processing platform is not an island unto itself; it must be connected to all of your existing data systems, applications, and sources. In this talk we will provide different options for integrating systems and applications with Apache Kafka, with a focus on the Kafka Connect framework and the ecosystem of Kafka connectors. We will discuss the intended use cases for Kafka Connect and share our experience and best practices for building large-scale data pipelines using Apache Kafka.
Integrating Apache Kafka and Elastic Using the Connect Framework | confluent
As a streaming platform, Apache Kafka provides low-latency, high-throughput, fault-tolerant publish and subscribe pipelines and excels at processing streams of real-time events. Kafka provides reliable, millisecond delivery for connecting downstream systems with real-time data.
In this talk, we will show how easy it is to leverage Kafka and the Elasticsearch connector to keep your indices populated with the latest data from the rest of your enterprise, as it changes.
Event-driven Applications with Kafka, Micronaut, and AWS Lambda | Dave Klein,... | HostedbyConfluent
One of the great things about running applications in the cloud is that you only pay for the resources that you use. But that also makes it more important than ever for our applications to be resource-efficient. This becomes even more critical when we use serverless functions.
Micronaut is an application framework that provides dependency injection, developer productivity features, and excellent support for Apache Kafka. By performing dependency injection, AOP, and other productivity-enhancing magic at compile time, Micronaut allows us to build smaller, more efficient microservices and serverless functions.
In this session, we'll explore the ways that Apache Kafka and Micronaut work together to enable us to build fast, efficient, event-driven applications. Then we'll see it in action, using the AWS Lambda Sink Connector for Confluent Cloud.
Building an Event-oriented Data Platform with Kafka, Eric Sammer | confluent
While we frequently talk about how to build interesting products on top of machine and event data, the reality is that collecting, organizing, providing access to, and managing this data is where most people get stuck. Many organizations understand the use cases around their data – fraud detection, quality of service and technical operations, user behavior analysis, for example – but are not necessarily data infrastructure experts. In this session, we’ll follow the flow of data through an end to end system built to handle tens of terabytes an hour of event-oriented data, providing real time streaming, in-memory, SQL, and batch access to this data. We’ll go into detail on how open source systems such as Hadoop, Kafka, Solr, and Impala/Hive are actually stitched together; describe how and where to perform data transformation and aggregation; provide a simple and pragmatic way of managing event metadata; and talk about how applications built on top of this platform get access to data and extend its functionality.
Attendees will leave this session knowing not just which open source projects go into a system such as this, but how they work together, what tradeoffs and decisions need to be addressed, and how to present a single general purpose data platform to multiple applications. This session should be attended by data infrastructure engineers and architects planning, building, or maintaining similar systems.
Tales from the four-comma club: Managing Kafka as a service at Salesforce | L... | HostedbyConfluent
Apache Kafka is a key part of the Big Data infrastructure at Salesforce, enabling publish/subscribe and data transport in near real-time at enterprise scale handling trillions of messages per day. In this session, hear from the teams at Salesforce that manage Kafka as a service, running over a hundred clusters across on-premise and public cloud environments with over 99.9% availability. Hear about best practices and innovations, including:
* How to manage multi-tenant clusters in a hybrid environment
* High volume data pipelines with Mirus replicating data to Kafka and blob storage
* Kafka Fault Injection Framework built on Trogdor and Kibosh
* Automated recovery without data loss
* Using Envoy as an SNI-routing Kafka gateway
We hope the audience will have practical takeaways for building, deploying, operating, and managing Kafka at scale in the enterprise.
What happened when our biggest and most important Kafka cluster went rogue all of a sudden, and while trying to recover it, a single, crucial misconfiguration made things even worse?
At a company like Taboola, where service availability and latency are our top priority, this was a disaster.
With 300K messages/sec and 250TB of messages produced each day to our on-premise Kafka clusters, and mirrored to our central Kafka cluster, we always try to ensure Kafka behaves well under high loads of traffic and unexpected cluster failures. So when our main Kafka cluster went crazy we had a serious issue on our hands.
This session is the story of how we learned the hard way about mitigating cluster failures with the proper configurations in place.
Confluent On Azure: Why you should add Confluent to your Azure toolkit | Alic... | HostedbyConfluent
As a data professional, you are the glue that makes cross-platform integrations possible. With the increase in adoption of hybrid cloud architectures, Kafka is an increasingly relevant tool for building data pipelines between platforms and accelerating delivery on cloud projects. Early exposure to Kafka on Azure capabilities gives you an edge to build better mousetraps at the design phase.
Customers who already run Kafka on premises and are looking to extend their Kafka systems to Azure can get started quickly with Confluent Cloud. Additionally, DevOps for self-managed options can be scaled easily with Ansible for Virtual Machines or containers via Azure Kubernetes Services or Azure Container Instances.
This session is presented from the Microsoft Solution Architect perspective by Israel Ekpo, Microsoft Cloud Solution Architect and Alicia Moniz, Microsoft MVP. They will cover use cases and scenarios, along with key Azure integration points and architecture patterns.
Monitoring and Resiliency Testing our Apache Kafka Clusters at Goldman Sachs ... | HostedbyConfluent
In our payments platform at Goldman Sachs Transaction Banking, Apache Kafka plays a critical role as the messaging bus in our micro-services architecture. Being a part of the financial service industry we need to ensure high-availability of our platform and quick response time during failures.
In this talk we will explore how we monitor and alert on the health of our Kafka clusters using our heartbeat application and clients using DataDog dashboards. We will see how we consolidate JMX metrics such as error-rates, connection-rates, latencies and consumer lag from all producers and consumers using JMX agent sidecar to provide a live view of the health of our entire infrastructure. We will also discuss our culture of game days where we regularly test the resiliency of all the clients in our infrastructure by simulating various failure scenarios to improve the overall availability of our infrastructure.
Apache Kafka - Scalable Message-Processing and more! Guido Schmutz
Presentation @ Oracle Code Berlin.
Independent of the source of data, the integration of event streams into an Enterprise Architecture gets more and more important in the world of sensors, social media streams and Internet of Things. Events have to be accepted quickly and reliably, they have to be distributed and analysed, often with many consumers or systems interested in all or part of the events. How can we make sure that all these events are accepted and forwarded in an efficient and reliable way? This is where Apache Kafka comes into play, a distributed, highly-scalable messaging broker, built for exchanging huge amounts of messages between a source and a target. This session will start with an introduction to Apache Kafka and present the role of Apache Kafka in a modern data / information architecture and the advantages it brings to the table.
Indeed Flex: The Story of a Revolutionary Recruitment Platform HostedbyConfluent
This is a tale of two streams when the pandemic hit and how we changed with the times and built a revolutionary recruitment platform for “going into work”. We engaged employers, recruiters and job seekers from industrial, healthcare, retail, hospital and facilities management sectors by building a unique platform where the job seeker has full control to pick their schedule, pay rate and what meets their preferences. Our goals are to give job seekers and employers a platform that thrives on simplicity, transparency and low costs. The Flexer stands today with full control of their time at the edge of opportunities to thrive on.
This presentation will go into the details of how we are tearing down a monolithic platform piece by piece and building a robust architecture:
- Routing events between two platforms
- Many sources, and
- Consumption by several downstream applications
We will discuss the caveats and bugs we learned about when we worked with schema registry and the evolution of schemas. We will highlight improvements we gained from automation and observability with Datadog integration for Confluent Cloud.
If you’re in discussions surrounding event-driven systems at your organization then this talk is for you. Join Ronak and me for this talk and let’s have a discussion.
How Kafka and MemSQL Became the Dynamic Duo (Sarung Tripathi, MemSQL) Kafka S... HostedbyConfluent
Kafka and MemSQL are the perfect combination of speed, scale, and power to take on the world’s most complex operational analytics challenges. In this session, you will learn how Kafka and MemSQL have become the dynamic duo, and how you can use them together to achieve ingest of tens of millions of records per second and enable highly concurrent, real-time analytics. In the last few months, Kafka and MemSQL have been hard at work, devising a plan to take on the world’s next set of streaming data challenges. So stay tuned: there may just be an announcement!
Keeping Analytics Data Fresh in a Streaming Architecture | John Neal, Qlik HostedbyConfluent
Qlik is an industry leader across its solution stack, both on the Data Integration side of things with Qlik Replicate (real-time CDC) and Qlik Compose (data warehouse and data lake automation), and on the Analytics side with Qlik Sense. These two “sides” of Qlik are coming together more frequently these days as the need for “always fresh” data increases across organizations.
When real-time streaming applications are the topic du jour, those companies are looking to Apache Kafka to provide the architectural backbone those applications require. Those same companies turn to Qlik Replicate to put the data from their enterprise database systems into motion at scale, whether that data resides in “legacy” mainframe databases; traditional relational databases such as Oracle, MySQL, or SQL Server; or applications such as SAP and SalesForce.
In this session we will look in depth at how Qlik Replicate can be used to continuously stream changes from a source database into Apache Kafka. From there, we will explore how a purpose-built consumer can be used to provide the bridge between Apache Kafka and an analytics application such as Qlik Sense.
Taming a massive fleet of Python-based Kafka apps at Robinhood | Chandra Kuch... HostedbyConfluent
Robinhood uses Kafka in every line of its business, from stock and crypto trading to clearing and data analytics. One interesting aspect of our architecture is that many of our microservices leveraging Kafka are written in Python. When you combine Python's relatively slow performance, its reliance on process-based parallelism, and Robinhood’s scale, the result is a massive fleet of application processes producing to and consuming from our Kafka clusters. This fleet generates an atypical workload on Kafka that warrants a deeper investment in scalability and reliability.
This talk discusses our investments in Kafka infrastructure for a large-scale Python-based environment:
kafkahood: our librdkafka-based client library wrapper that codifies best practices, sane defaults and deep client-side observability.
kafkaproxy: a Rust-based sidecar proxy that reduces connection fan-in from Python gunicorn worker pools to our Kafka clusters.
We'll also present challenges we encountered along the way and share our learnings with the audience.
Utilizing Kafka Connect to Integrate Classic Monoliths into Modern Microservi... HostedbyConfluent
Having started with classic monolith applications in the late 90s and adopting a new microservice architecture in 2015, our organization needed a convenient, reliable, and low-cost way to push changes back and forth between them. One that preferably utilized technology already on hand and could exchange information between multiple data stores.
In this session we will explore how Kafka Connect and its various connectors satisfied this need. We will review the two disparate tech stacks we needed to integrate, and the strategies and connectors we used to exchange information. Finally, we will cover some enhancements we made to our own processes including integrating Kafka Connect and its connectors into our CI/CD pipeline and writing tools to monitor connectors in our production environment.
You have built an event-driven system leveraging Apache Kafka. Now you face the challenge of integrating traditional synchronous request-response capabilities, such as user interaction, through an HTTP web service.
There are various techniques, each with advantages and disadvantages. This talk discusses multiple options on how to do a request-response over Kafka — showcasing producers and consumers using single and multiple topics, and more advanced considerations using the interactive queries of ksqlDB and Kafka Streams.
Advanced considerations discussed:
What a consumer rebalance means to your active request-responses.
Options for blocking for the async response in the web service.
How can CQRS (Command Query Responsibility Segregation) be leveraged with the interactive state stores of Kafka Streams and ksqlDB?
Interactive queries of the ksqlDB and Kafka Streams state stores are not available during a rebalance. What is the active Kafka development happening that will make interactive queries a more feasible option?
Would a custom state store help with rebalancing limitations?
Can custom partitioning be used for proper routing, and what impacts could that have to the other services in your ecosystem?
We will explore the above considerations with an interactive quiz application built using Apache Kafka, Kafka Streams, and ksqlDB. With a proper implementation in place, your request-response application can scale and be performant along with handling all of the requests.
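One common way to map an asynchronous response back to a blocked web request is a correlation id. The following is a minimal in-memory sketch of that idea, not the implementation from the talk: the `RequestResponseBridge` class, its method names, and the list standing in for a request topic are all hypothetical, and no Kafka client is involved.

```python
import uuid
from concurrent.futures import Future
from typing import Dict, List, Tuple


class RequestResponseBridge:
    """Pairs responses with waiting requests by correlation id (illustrative only)."""

    def __init__(self) -> None:
        # Stand-in for a Kafka request topic: (correlation_id, payload) pairs.
        self.request_topic: List[Tuple[str, str]] = []
        # Requests issued by this instance that are still waiting for a response.
        self.pending: Dict[str, Future] = {}

    def send(self, payload: str) -> Future:
        """Publish a request and return a future the web handler can block on."""
        correlation_id = str(uuid.uuid4())
        future: Future = Future()
        self.pending[correlation_id] = future
        self.request_topic.append((correlation_id, payload))
        return future

    def on_response(self, correlation_id: str, payload: str) -> bool:
        """Complete the matching request; return False if it is not ours."""
        future = self.pending.pop(correlation_id, None)
        if future is None:
            return False  # e.g. the response reached a different instance
        future.set_result(payload)
        return True


bridge = RequestResponseBridge()
waiting = bridge.send("answer=B")
cid, body = bridge.request_topic[0]    # a downstream service reads the request
bridge.on_response(cid, body.upper())  # and its reply is routed back by id
reply = waiting.result(timeout=1)
```

The `False` branch in `on_response` is where the rebalance and routing questions above show up: if the response lands on an instance that never issued the request, nothing is waiting for it there.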
How did we move the mountain? - Migrating 1 trillion+ messages per day across... HostedbyConfluent
Have you ever migrated Kafka clusters from one data center to another while remaining completely transparent to client applications?
At PayPal, as part of a massive datacenter migration initiative, the Kafka team successfully moved all PayPal Kafka traffic across data centers. This initiative involved migrating 20+ Kafka clusters (1000+ broker and zookeeper nodes), as well as 60+ mirrormaker groups which seamlessly handle Kafka traffic volumes as high as 1 trillion messages per day. Throughout the course of this migration, applications required no modification and encountered 0% service outage and 0% message loss or duplication. The whole migration process was fully transparent to Kafka applications.
In this session, you will learn the strategies, techniques and tools the PayPal Kafka team has utilized for managing the migration process. You will also learn the lessons and pitfalls they experienced during this exercise, as well as the secret sauce of making the migration successful.
Improving Logging Ingestion Quality At Pinterest: Fighting Data Corruption An... HostedbyConfluent
Logging ingestion infrastructure at Pinterest is built around Apache Kafka to support thousands of pipelines with over 1 trillion (1PB) new messages generated by hundreds of services (written in 5 different languages) and transported to the data lake (AWS S3) every day. In the past, we have focused on scalability and auto operation of the infrastructure to help internal teams quickly onboard new pipelines (Kafka Summit 2018, 2020). However, we constantly observed data loss and data corruption due to the design decisions we made to favor scalability and availability over durability and consistency.
To tackle these problems, we designed and implemented a logging audit framework which consists of (1) an audit client library integrated into every component of the infrastructure to detect data corruption for every message and send out audit events for randomly picked messages, (2) Kafka clusters receiving audit events, and (3) realtime and batch applications processing audit events to generate insights for alerting and reporting.
Our focus on zero negative impact to existing ingestion pipelines, scalability, and cost efficiency drove a number of design decisions. These eventually let us roll out auditing to every pipeline with zero downtime and fundamentally improve data ingestion quality at Pinterest by tracking data loss and removing data corruption, which in the past could block downstream applications for hours and often led to severe incidents.
Supercharge Your Real-time Event Processing with Neo4j's Streams Kafka Connec... HostedbyConfluent
Do your event streams use connected-data domains such as fraud detection, live logistics routing, or predicting network outages? How can you maintain the analysis and leverage those connections in real time?
Graph databases differ from traditional, tabular ones in that they treat connections between data as first class citizens. This means they are optimized for detecting and understanding these relationships – providing insight at speed and at scale.
By combining event streams from Kafka along with the power of the Neo4j graph database for interrogating and investigating connections, you make real-time, event-driven intelligent insight a reality.
Neo4j Streams integrates Neo4j with Apache Kafka event streams, serving either as a source of data (for instance, Change Data Capture) or as a sink that ingests any kind of Kafka event into your graph. In this session we’ll show you how to get up and running with Neo4j Streams and how to sink and source between graphs and streams.
Building Retry Architectures in Kafka with Compacted Topics | Matthew Zhou, V... HostedbyConfluent
In this talk, we'll discuss how VillageMD is able to use Kafka topic compaction for rapidly scaling our reprocessing pipelines to encompass hundreds of feeds. Within healthcare data ecosystems, privacy and data minimalism are key design priorities. Being able to handle data deletion in a reliable, timely manner within event-driven architectures is becoming more and more necessary with key governance frameworks like the GDPR and HIPAA.
We'll be giving an overview of the building and governance of dead-letter queues for streaming data processing.
We'll discuss:
1. How to architect a data sink for failed records.
2. How topic compaction can reduce duplicate data and enable idempotency.
3. Building a tombstoning system for removing successfully reprocessed records from the queues.
4. Considerations for monitoring a reprocessing system in production -- what metrics, dataops, and SLAs are useful?
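Points 2 and 3 rest on the semantics of a compacted topic: only the latest record per key is kept, and a record with a null value (a tombstone) removes its key. The sketch below simulates those semantics on a plain Python list. It is an illustration under stated assumptions, not VillageMD's pipeline: the `compact` function and the record ids are hypothetical, and a real broker compacts in the background and keeps tombstones for a configurable period (`delete.retention.ms`) before dropping them.

```python
from typing import Dict, List, Optional, Tuple

# (key, value) pairs in log order; a value of None marks a tombstone.
Record = Tuple[str, Optional[str]]


def compact(log: List[Record]) -> List[Record]:
    """Keep the latest record per key; drop keys whose latest record is a tombstone."""
    latest: Dict[str, Tuple[int, Optional[str]]] = {}
    for offset, (key, value) in enumerate(log):
        latest[key] = (offset, value)  # later offsets overwrite earlier ones
    survivors = [
        (offset, key, value)
        for key, (offset, value) in latest.items()
        if value is not None
    ]
    survivors.sort(key=lambda item: item[0])  # preserve original log order
    return [(key, value) for _, key, value in survivors]


# Hypothetical dead-letter topic keyed by record id.
dead_letters: List[Record] = [
    ("rec-1", "failed: timeout"),
    ("rec-2", "failed: bad schema"),
    ("rec-1", "failed: timeout (retry 2)"),  # same key: only this one survives
    ("rec-2", None),  # tombstone written after rec-2 was reprocessed successfully
]
remaining = compact(dead_letters)
```

Keying the failed records by a stable record id is what makes repeated failures collapse into one entry and lets a single tombstone clear a record once it has been reprocessed.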
A stream processing platform is not an island unto itself; it must be connected to all of your existing data systems, applications, and sources. In this talk we will provide different options for integrating systems and applications with Apache Kafka, with a focus on the Kafka Connect framework and the ecosystem of Kafka connectors. We will discuss the intended use cases for Kafka Connect and share our experience and best practices for building large-scale data pipelines using Apache Kafka.
Integrating Apache Kafka and Elastic Using the Connect Framework confluent
As a streaming platform, Apache Kafka provides low-latency, high-throughput, fault-tolerant publish and subscribe pipelines and excels at processing streams of real-time events. Kafka provides reliable, millisecond delivery for connecting downstream systems with real-time data.
In this talk, we will show how easy it is to leverage Kafka and the Elasticsearch connector to keep your indices populated with the latest data from the rest of your enterprise, as it changes.
Event-driven Applications with Kafka, Micronaut, and AWS Lambda | Dave Klein,... HostedbyConfluent
One of the great things about running applications in the cloud is that you only pay for the resources that you use. But that also makes it more important than ever for our applications to be resource-efficient. This becomes even more critical when we use serverless functions.
Micronaut is an application framework that provides dependency injection, developer productivity features, and excellent support for Apache Kafka. By performing dependency injection, AOP, and other productivity-enhancing magic at compile time, Micronaut allows us to build smaller, more efficient microservices and serverless functions.
In this session, we'll explore the ways that Apache Kafka and Micronaut work together to enable us to build fast, efficient, event-driven applications. Then we'll see it in action, using the AWS Lambda Sink Connector for Confluent Cloud.
Building an Event-oriented Data Platform with Kafka, Eric Sammer confluent
While we frequently talk about how to build interesting products on top of machine and event data, the reality is that collecting, organizing, providing access to, and managing this data is where most people get stuck. Many organizations understand the use cases around their data – fraud detection, quality of service and technical operations, user behavior analysis, for example – but are not necessarily data infrastructure experts. In this session, we’ll follow the flow of data through an end to end system built to handle tens of terabytes an hour of event-oriented data, providing real time streaming, in-memory, SQL, and batch access to this data. We’ll go into detail on how open source systems such as Hadoop, Kafka, Solr, and Impala/Hive are actually stitched together; describe how and where to perform data transformation and aggregation; provide a simple and pragmatic way of managing event metadata; and talk about how applications built on top of this platform get access to data and extend its functionality.
Attendees will leave this session knowing not just which open source projects go into a system such as this, but how they work together, what tradeoffs and decisions need to be addressed, and how to present a single general purpose data platform to multiple applications. This session should be attended by data infrastructure engineers and architects planning, building, or maintaining similar systems.
Tales from the four-comma club: Managing Kafka as a service at Salesforce | L... HostedbyConfluent
Apache Kafka is a key part of the Big Data infrastructure at Salesforce, enabling publish/subscribe and data transport in near real-time at enterprise scale handling trillions of messages per day. In this session, hear from the teams at Salesforce that manage Kafka as a service, running over a hundred clusters across on-premise and public cloud environments with over 99.9% availability. Hear about best practices and innovations, including:
* How to manage multi-tenant clusters in a hybrid environment
* High volume data pipelines with Mirus replicating data to Kafka and blob storage
* Kafka Fault Injection Framework built on Trogdor and Kibosh
* Automated recovery without data loss
* Using Envoy as an SNI-routing Kafka gateway
We hope the audience will have practical takeaways for building, deploying, operating, and managing Kafka at scale in the enterprise.
Apache Kafka - Scalable Message-Processing and more! Guido Schmutz
Independent of the source of data, the integration of event streams into an Enterprise Architecture gets more and more important in the world of sensors, social media streams and Internet of Things. Events have to be accepted quickly and reliably, they have to be distributed and analysed, often with many consumers or systems interested in all or part of the events. How can we make sure that all these events are accepted and forwarded in an efficient and reliable way? This is where Apache Kafka comes into play, a distributed, highly-scalable messaging broker, built for exchanging huge amounts of messages between a source and a target.
This session will start with an introduction to Apache Kafka and present the role of Apache Kafka in a modern data / information architecture and the advantages it brings to the table. Additionally the Kafka ecosystem will be covered as well as the integration of Kafka in the Oracle Stack, with products such as Golden Gate, Service Bus and Oracle Stream Analytics all being able to act as a Kafka consumer or producer.
Real time Analytics with Apache Kafka and Apache Spark Rahul Jain
A presentation cum workshop on real-time analytics with Apache Kafka and Apache Spark. Apache Kafka is a distributed publish-subscribe messaging system, while Spark Streaming brings Spark's language-integrated API to stream processing, making it possible to write streaming applications quickly and easily. It supports both Java and Scala. In this workshop we are going to explore Apache Kafka, Zookeeper and Spark with a web clickstream example using Spark Streaming. A clickstream is the recording of the parts of the screen a computer user clicks on while web browsing.
Developing Realtime Data Pipelines With Apache Kafka Joe Stein
Developing Realtime Data Pipelines With Apache Kafka. Apache Kafka is publish-subscribe messaging rethought as a distributed commit log. A single Kafka broker can handle hundreds of megabytes of reads and writes per second from thousands of clients. Kafka is designed to allow a single cluster to serve as the central data backbone for a large organization. It can be elastically and transparently expanded without downtime. Data streams are partitioned and spread over a cluster of machines to allow data streams larger than the capability of any single machine and to allow clusters of co-ordinated consumers. Messages are persisted on disk and replicated within the cluster to prevent data loss. Each broker can handle terabytes of messages without performance impact. Kafka has a modern cluster-centric design that offers strong durability and fault-tolerance guarantees.
Common issues with Apache Kafka® Producer confluent
Badai Aqrandista, Confluent, Senior Technical Support Engineer
This session will be about a common issue in the Kafka Producer: producer batch expiry. We will discuss the Kafka Producer internals, the common causes of batch expiry, such as a slow network or small batches, and how to overcome them. We will also be sharing some examples along the way!
https://www.meetup.com/apache-kafka-sydney/events/279651982/
Kafka Multi-Tenancy—160 Billion Daily Messages on One Shared Cluster at LINE confluent
(Yuto Kawamura, LINE Corporation) Kafka Summit SF 2018
LINE is a messaging service with 160+ million active users. Last year I talked about how we operate our Kafka cluster that receives more than 160 billion messages daily, dealing with performance problems to meet our tight requirements. Since last year we have deployed three more clusters, each for a different purpose, such as one in a different datacenter and one for security-sensitive usage, while still keeping the fundamental concept: one cluster for everyone to use. While letting many projects use a few multi-tenant clusters greatly reduces our operational cost and lets us concentrate our engineering resources on maximizing their reliability, hosting multiple topics with different kinds of workloads has also led us through a lot of challenges.
In this talk I will introduce how we operate Kafka clusters shared among different services, and how we solved the problems we met to maximize their reliability. In particular, one of the most critical issues we solved, delayed consumer Fetch requests causing a broker’s network threads to be blocked, should be very interesting because it can degrade the overall performance of brokers in a very common situation, and we managed to solve it using advanced techniques such as dynamic tracing and a tricky patch to control in-kernel behavior from Java code.
Apache Kafka - Scalable Message-Processing and more! Guido Schmutz
In this presentation Guido Schmutz talks about Apache Kafka, Kafka Core, Kafka Connect, Kafka Streams, Kafka and "Big Data"/"Fast Data" ecosystems, Confluent Data Platform and Kafka in Architecture.
Full recorded presentation at https://www.youtube.com/watch?v=2UfAgCSKPZo for Tetrate Tech Talks on 2022/05/13.
Envoy's support for the Kafka protocol, in the form of broker-filter and mesh-filter.
Contents:
- overview of Kafka (use cases, partitioning, producer/consumer, protocol);
- proxying Kafka (non-Envoy specific);
- proxying Kafka with Envoy;
- handling Kafka protocol in Envoy;
- Kafka-broker-filter for per-connection proxying;
- Kafka-mesh-filter to provide front proxy for multiple Kafka clusters.
References:
- https://adam-kotwasinski.medium.com/deploying-envoy-and-kafka-8aa7513ec0a0
- https://adam-kotwasinski.medium.com/kafka-mesh-filter-in-envoy-a70b3aefcdef
Presentation from the Kafka meetup on 13-SEP-2013, including some notes to clarify some slides. Enjoy!
Avi Levi
123avi@gmail.com
https://www.linkedin.com/in/leviavi/
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa... HostedbyConfluent
In this talk, attendees will be provided with an introduction to Kafka Connect and the basics of Single Message Transforms (SMTs) and how they can be used to transform data streams in a simple and efficient way. SMTs are a powerful feature of Kafka Connect that allow custom logic to be applied to individual messages as they pass through the data pipeline. The session will explain how SMTs work, the types of transformations they can be used for, and how they can be applied in a modular and composable way.
Further, the session will discuss where SMTs fit in with Kafka Connect and when they should be used. Examples will be provided of how SMTs can be used to solve common data integration challenges, such as data enrichment, filtering, and restructuring. Attendees will also learn about the limitations of SMTs and when it might be more appropriate to use other tools or frameworks.
Additionally, an overview of the alternatives to SMTs, such as Kafka Streams and KSQL, will be provided. This will help attendees make an informed decision about which approach is best for their specific use case.
Whether attendees are developers, data engineers, or data scientists, this talk will provide valuable insights into how Kafka Connect and SMTs can help streamline data processing workflows. Attendees will come away with a better understanding of how these tools work and how they can be used to solve common data integration challenges."
"While Apache Kafka lacks native support for topic renaming, there are scenarios where renaming topics becomes necessary. This presentation will delve into the utilization of MirrorMaker 2.0 as a solution for renaming Kafka topics. It will illustrate how MirrorMaker 2.0 can efficiently facilitate the migration of messages from the old topic to the new one and how Kafka Connect Metrics can be employed to monitor the mirroring progress. The discussion will encompass the complexity of renaming Kafka topics, addressing certain limitations, and exploring potential workarounds when using MirrorMaker 2.0 for this purpose. Despite not being originally designed for topic renaming, MirrorMaker 2.0 has a suitable solution for renaming Kafka topics.
Blog Post : https://engineering.hellofresh.com/renaming-a-kafka-topic-d6ff3aaf3f03"
Evolution of NRT Data Ingestion Pipeline at Trendyol HostedbyConfluent
"Trendyol, Turkey's leading e-commerce company, is committed to positively impacting the lives of millions of customers. Our decision-making processes are entirely driven by data. As a data warehouse team, our primary goal is to provide accurate and up-to-date data, enabling the extraction of valuable business insights.
We utilize the benefits provided by Kafka and Kafka Connect to facilitate the transfer of data from the source to our analytical environment. We recently transitioned our Kafka Connect clusters from on-premise VMs to Kubernetes. This shift was driven by our desire to effectively manage rapid growth (marked by a growing number of producers, consumers, and daily messages), ensuring proper monitoring and consistency. Consistency is crucial, especially in instances where we employ Single Message Transforms to manipulate records, for example filtering based on their keys or converting a JSON object into a JSON string.
Monitoring our cluster's health is key and we achieve this through Grafana dashboards and alerts generated through kube-state-metrics. Additionally, Kafka Connect's JMX metrics, coupled with NewRelic, are employed for comprehensive monitoring.
The session will aim to explain our approach to NRT data ingestion, outlining the role of Kafka and Kafka Connect, our transition journey to K8s, and methods employed to monitor the health of our clusters."
Ensuring Kafka Service Resilience: A Dive into Health-Checking Techniques HostedbyConfluent
"Join our lightning talk to delve into the strategies vital for maintaining a resilient Kafka service.
While proactive monitoring is key for issue prevention, failures will still occur. Rapid detection tools will enable you to identify and resolve problems before they impact end-users. This session explores the techniques employed by Kafka cloud providers for this detection, many of which are also applicable if you are managing independent Kafka clusters or applications.
The talk focuses on health-checking, a powerful tool that encompasses an application and its monitoring to validate Kafka environment availability. The session navigates through Kafka health-check methods, sharing best practices, identifying common pitfalls, and highlighting the monitoring of critical performance metrics like throughput and latency for early issue detection.
Attendees will gain valuable insights into the art of health-checking their Kafka environment, equipping them with the tools to identify and address issues before they escalate into critical problems. We invite all Kafka enthusiasts to join us in this talk to foster a deeper understanding of Kafka health-checking and ensure the continued smooth operation of your Kafka environment."
Exactly-once Stream Processing with Arroyo and Kafka HostedbyConfluent
"Stream processing systems traditionally gave their users the choice between at least once processing and at most once processing: accepting duplicate data or missing data. But ideally we would provide exactly-once processing, where every event in the input data is represented exactly once in the output.
Kafka provides a transaction API that enables exactly-once when using Kafka as your source and sink. But this API has turned out not to be well suited for use by high-level streaming systems, requiring various workarounds to still provide transactional processing.
In this talk, I’ll cover how the transaction API works, and how systems like Arroyo and Flink have used it to build exactly-once support, and how improvements to the transactional API will enable better end-to-end support for consistent stream processing."
"In this talk, we will explore the exciting world of IoT and computer vision by presenting a unique project: Fish Plays Pokemon. Using an ESP Eye camera connected to an ESP32 and other IoT devices, to monitor fish's movements in an aquarium.
This project showcases the power of IoT and computer vision, demonstrating how even a fish can play a popular video game. We will discuss the challenges we faced during development, including real-time processing, IoT device integration, and Kafka message consumption.
By the end of the talk, attendees will have a better understanding of how to combine IoT, computer vision, and the usage of a serverless cloud to create innovative projects. They will also learn how to integrate IoT devices with Kafka to simulate keyboard behavior, opening up endless possibilities for real-time interactions between the physical and digital worlds."
What is tiered storage and what is it good for? After this session you will know how to leverage the tiered storage feature to enable longer retention than the storage attached to brokers allows. You will get acquainted with the different configuration options and know what to expect when you enable the feature, such as when the first upload to the remote object storage takes place.
Building a Self-Service Stream Processing Portal: How And Why HostedbyConfluent
"Real-time 24/7 monitoring and verification of massive data is challenging – even more so for the world’s second largest manufacturer of memory chips and semiconductors. Tolerance levels are incredibly small, any small defect needs to be identified and dealt with immediately. The goal of semiconductor manufacturing is to improve yield and minimize unnecessary work.
However, even with real-time data collection, the data was not easy to manipulate by users and it took many days to enable stream processing requests – limiting its usefulness and value to the business.
You’ll hear why SK hynix switched to Confluent and how we developed a self-service stream process portal on top of it. Now users have an easy-to-use service to manipulate the data they want.
Results have been impressive: stream processing requests are now available the same day, where they previously took 5 days! We were also able to drive down costs by 10% as stream processing requests no longer require additional hardware.
What you’ll take away from our talk:
- What were the pain points in the previous environment
- How we transitioned to Confluent without service downtime
- Creating a self-service stream processing portal built on top of Connect and ksqlDB
- Use case of stream process portal"
From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 ... HostedbyConfluent
"Discover how default configurations might impact ingestion times, especially when dealing with large files. We'll explore a real-world scenario with a 20,000,000+ line file, assessing metrics and exploring the bottleneck in the default setup. Understand the intricacies of batch size calculations and how to optimize them based on your unique data characteristics.
Walk away with actionable insights as we showcase a practical example, turning a 7-hour ingestion process into a mere 30 minutes for over 30,000,000 records in a Kafka topic. Uncover metrics, configurations, and best practices to elevate the performance of your Kafka Connect CSV source connectors. Don't miss this opportunity to optimize your data pipeline and ensure smooth, efficient data flow."
Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ... HostedbyConfluent
"In order to meet the current and ever-increasing demand for near-zero RPO/RTO systems, a focus on resiliency is critical. While Kafka offers built-in resiliency features, a perfect blend of client and cluster resiliency is necessary in order to achieve a highly resilient Kafka client application.
At Fidelity Investments, Kafka is used for a variety of event streaming needs such as core brokerage trading platforms, log aggregation, communication platforms, and data migrations. In this lightning talk, we will discuss the governance framework that has enabled producers and consumers to achieve their SLAs during unprecedented failure scenarios. We will highlight how we automated resiliency tests through chaos engineering and tightly integrated observability dashboards for Kafka clients to analyze and optimize client configurations. And finally, we will summarize the chaos test suite and the "test, test and test" mantra that are helping Fidelity Investments reach its goal of a future with zero down-time."
Navigating Private Network Connectivity Options for Kafka Clusters HostedbyConfluent
"There are various strategies for securely connecting to Kafka clusters between different networks or over the public internet. Many cloud providers even offer endpoints that privately route traffic between networks and are not exposed to the internet. But, depending on your network setup and how you are running Kafka, these options ... might not be an option!
In this session, we’ll discuss how you can use SSH bastions or a self managed PrivateLink endpoint to establish connectivity to your Kafka clusters without exposing brokers directly to the internet. We explain the required network configuration, and show how we at Materialize have contributed to librdkafka to simplify these scenarios and avoid fragile workarounds."
Apache Flink: Building a Company-wide Self-service Streaming Data Platform HostedbyConfluent
"In my talk, we will examine all the stages of building our self-service Streaming Data Platform based on Apache Flink and Kafka Connect, from the selection of a solution for stateful streaming data processing, right up to the successful design of a robust self-service platform, covering the challenges that we’ve met.
I will share our experience in providing non-Java developers with a company-wide self-service solution, which allows them to quickly and easily develop their streaming data pipelines.
Additionally, I will highlight specific business use cases that would not have been implemented without our platform."
Explaining How Real-Time GenAI Works in a Noisy Pub HostedbyConfluent
"Almost everyone has heard about large language models, and tens of millions of people have tried out OpenAI ChatGPT and Google Bard. However, the intricate architecture and underlying mathematics driving these remarkable systems remain elusive to many.
LLMs are fascinating - so let's grab a drink and find out how these systems are built and dive deep into their inner workings. In the time it takes to enjoy a round of drinks, you'll understand the inner workings of these models. We'll take our first sip of word vectors, enjoy the refreshing taste of the transformer, and drain a glass understanding how these models are trained on phenomenally large quantities of data.
Large language models for your streaming application - explained with a little maths and a lot of pub stories"
"Monitoring is a fundamental operation when running Kafka and Kafka applications in production. There are numerous metrics available when using Kafka, however the sheer number is overwhelming, making it challenging to know where to start and how to properly utilise them.
This session will introduce you to some of the key metrics that should be monitored and best practices in fine tuning your monitoring. We will delve into which metrics are the indicators for cluster’s availability and performance and are the most helpful when debugging client applications."
Kafka Streams relies on state restoration for maintaining standby tasks as a failure recovery mechanism, as well as for restoring the state after rebalance scenarios. When you are scaling your application instances up or down, it is necessary to know the current state of the restoration process for each active and standby task in order to avoid a long restoration process as much as possible. During this presentation, you will get an understanding of how KIP-869 provides valuable information about the current active task restoration after a rebalance and how KIP-988 opens a window to the continuous process of standby restoration. When you encounter a situation in which you need to choose whether or not to scale your application instances up or down, both KIPs will be an invaluable ally for you.
Mastering Kafka Producer Configs: A Guide to Optimizing Performance HostedbyConfluent
"In this talk, we will dive into the world of Kafka producer configs and explore how to understand and optimize them for better performance. We will cover the different types of configs, their impact on performance, and how to tune them to achieve the best results. Whether you're new to Kafka or a seasoned pro, this session will provide valuable insights and practical tips for improving your Kafka producer performance.
- Introduction to Kafka producer internals and workflow
- Understanding producer configs like linger.ms, batch.size, buffer.memory and their impact on performance
- Learning about producer configs like max.block.ms, delivery.timeout.ms, request.timeout.ms and retries to make the producer more resilient
- Discussing configs like enable.idempotence, max.in.flight.requests.per.connection and transaction-related configs to achieve delivery guarantees
- Q&A session with attendees to address specific questions and concerns."
Data Contracts Management: Schema Registry and Beyond HostedbyConfluent
"Data contracts are one of the hottest topics in the data management community. A data contract is a formal agreement between a data producer and its consumers, aimed at reducing data downtime and improving data quality. Schemas are an important part of data contracts, but they are not the only relevant element.
In this talk, we’ll:
1. see why data contracts are so important but also difficult to implement;
2. identify the characteristics of a well-designed data contract:
discuss the anatomy of a data contract, its main elements and, how to formally describe them;
3. show how to manage the lifecycle of a data contract leveraging Confluent Platform's services."
"In the realm of stateful stream processing, Apache Flink has emerged as a powerful and versatile platform. However, the conventional SQL-based approach often limits the full potential of Flink applications.
We will delve into the benefits of adopting a code-first approach, which provides developers with greater control over application logic, facilitates complex transformations, and enables more efficient handling of state and time. We will also discuss how the code-first approach can lead to more maintainable and testable code, ultimately improving the overall quality of your Flink applications.
Whether you're a seasoned Flink developer or just starting your journey, this talk will provide valuable insights into how a code-first approach can revolutionize your stream processing applications."
Debezium vs. the World: An Overview of the CDC Ecosystem HostedbyConfluent
"Change Data Capture (CDC) has become a commodity in data engineering, much in part due to the ever-rising success of Debezium [1]. But is that all there is? In this lightning talk, we’ll outline the current state of the CDC ecosystem, and understand why adopting a Debezium alternative is still a hard sell. If you’ve ever wondered what else is out there, but can’t keep up with the sprawling of new tools in the ecosystem; we’ll wrap it up for you!
[1] https://debezium.io/"
Beyond Tiered Storage: Serverless Kafka with No Local Disks HostedbyConfluent
"Separation of compute and storage has become the de-facto standard in the data industry for batch processing.
The addition of tiered storage to open source Apache Kafka is the first step in bringing true separation of compute and storage to the streaming world.
In this talk, we'll discuss in technical detail how to take the concept of tiered storage to its logical extreme by building an Apache Kafka protocol compatible system that has zero local disks.
Eliminating all local disks in the system requires not only separating storage from compute, but also separating data from metadata. This is a monumental task that requires reimagining Kafka's architecture from the ground up, but the benefits are worth it.
This approach enables a stateless, elastic, and serverless deployment model that minimizes operational overhead and also drives inter-zone networking costs to almost zero."
Epistemic Interaction - tuning interfaces to provide information for AI support Alan Dix
Paper presented at SYNERGY workshop at AVI 2024, Genoa, Italy. 3rd June 2024
https://alandix.com/academic/papers/synergy2024-epistemic/
As machine learning integrates deeper into human-computer interactions, the concept of epistemic interaction emerges, aiming to refine these interactions to enhance system adaptability. This approach encourages minor, intentional adjustments in user behaviour to enrich the data available for system learning. This paper introduces epistemic interaction within the context of human-system communication, illustrating how deliberate interaction design can improve system understanding and adaptation. Through concrete examples, we demonstrate the potential of epistemic interaction to significantly advance human-computer interaction by leveraging intuitive human communication strategies to inform system design and functionality, offering a novel pathway for enriching user-system engagements.
Search and Society: Reimagining Information Access for Radical Futures Bhaskar Mitra
The field of Information retrieval (IR) is currently undergoing a transformative shift, at least partly due to the emerging applications of generative AI to information access. In this talk, we will deliberate on the sociotechnical implications of generative AI for information access. We will argue that there is both a critical necessity and an exciting opportunity for the IR community to re-center our research agendas on societal needs while dismantling the artificial separation between the work on fairness, accountability, transparency, and ethics in IR and the rest of IR research. Instead of adopting a reactionary strategy of trying to mitigate potential social harms from emerging technologies, the community should aim to proactively set the research agenda for the kinds of systems we should build inspired by diverse explicitly stated sociotechnical imaginaries. The sociotechnical imaginaries that underpin the design and development of information access technologies needs to be explicitly articulated, and we need to develop theories of change in context of these diverse perspectives. Our guiding future imaginaries must be informed by other academic fields, such as democratic theory and critical theory, and should be co-developed with social science scholars, legal scholars, civil rights and social justice activists, and artists, among others.
Transcript: Selling digital books in 2024: Insights from industry leaders - T... BookNet Canada
The publishing industry has been selling digital audiobooks and ebooks for over a decade and has found its groove. What’s changed? What has stayed the same? Where do we go from here? Join a group of leading sales peers from across the industry for a conversation about the lessons learned since the popularization of digital books, best practices, digital book supply chain management, and more.
Link to video recording: https://bnctechforum.ca/sessions/selling-digital-books-in-2024-insights-from-industry-leaders/
Presented by BookNet Canada on May 28, 2024, with support from the Department of Canadian Heritage.
UiPath Test Automation using UiPath Test Suite series, part 4 DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 4. In this session, we will cover Test Manager overview along with SAP heatmap.
The UiPath Test Manager overview with SAP heatmap webinar offers a concise yet comprehensive exploration of the role of a Test Manager within SAP environments, coupled with the utilization of heatmaps for effective testing strategies.
Participants will gain insights into the responsibilities, challenges, and best practices associated with test management in SAP projects. Additionally, the webinar delves into the significance of heatmaps as a visual aid for identifying testing priorities, areas of risk, and resource allocation within SAP landscapes. Through this session, attendees can expect to enhance their understanding of test management principles while learning practical approaches to optimize testing processes in SAP environments using heatmap visualization techniques
What will you get from this session?
1. Insights into SAP testing best practices
2. Heatmap utilization for testing
3. Optimization of testing processes
4. Demo
Topics covered:
Execution from the test manager
Orchestrator execution result
Defect reporting
SAP heatmap example with demo
Speaker:
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
Essentials of Automations: Optimizing FME Workflows with Parameters Safe Software
Are you looking to streamline your workflows and boost your projects’ efficiency? Do you find yourself searching for ways to add flexibility and control over your FME workflows? If so, you’re in the right place.
Join us for an insightful dive into the world of FME parameters, a critical element in optimizing workflow efficiency. This webinar marks the beginning of our three-part “Essentials of Automation” series. This first webinar is designed to equip you with the knowledge and skills to utilize parameters effectively: enhancing the flexibility, maintainability, and user control of your FME projects.
Here’s what you’ll gain:
- Essentials of FME Parameters: Understand the pivotal role of parameters, including Reader/Writer, Transformer, User, and FME Flow categories. Discover how they are the key to unlocking automation and optimization within your workflows.
- Practical Applications in FME Form: Delve into key user parameter types including choice, connections, and file URLs. Allow users to control how a workflow runs, making your workflows more reusable. Learn to import values and deliver the best user experience for your workflows while enhancing accuracy.
- Optimization Strategies in FME Flow: Explore the creation and strategic deployment of parameters in FME Flow, including the use of deployment and geometry parameters, to maximize workflow efficiency.
- Pro Tips for Success: Gain insights on parameterizing connections and leveraging new features like Conditional Visibility for clarity and simplicity.
We’ll wrap up with a glimpse into future webinars, followed by a Q&A session to address your specific questions surrounding this topic.
Don’t miss this opportunity to elevate your FME expertise and drive your projects to new heights of efficiency.
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -... DanBrown980551
Do you want to learn how to model and simulate an electrical network from scratch in under an hour?
Then welcome to this PowSyBl workshop, hosted by Rte, the French Transmission System Operator (TSO)!
During the webinar, you will discover the PowSyBl ecosystem as well as handle and study an electrical network through an interactive Python notebook.
PowSyBl is an open source project hosted by LF Energy, which offers a comprehensive set of features for electrical grid modelling and simulation. Among other advanced features, PowSyBl provides:
- A fully editable and extendable library for grid component modelling;
- Visualization tools to display your network;
- Grid simulation tools, such as power flows, security analyses (with or without remedial actions) and sensitivity analyses;
The framework is mostly written in Java, with a Python binding so that Python developers can access PowSyBl functionalities as well.
What you will learn during the webinar:
- For beginners: discover PowSyBl's functionalities through a quick general presentation and the notebook, without needing any expert coding skills;
- For advanced developers: master the skills to efficiently apply PowSyBl functionalities to your real-world scenarios.
State of ICS and IoT Cyber Threat Landscape Report 2024 preview Prayukth K V
The IoT and OT threat landscape report has been prepared by the Threat Research Team at Sectrio using data from Sectrio, cyber threat intelligence farming facilities spread across over 85 cities around the world. In addition, Sectrio also runs AI-based advanced threat and payload engagement facilities that serve as sinks to attract and engage sophisticated threat actors, and newer malware including new variants and latent threats that are at an earlier stage of development.
The latest edition of the OT/ICS and IoT security Threat Landscape Report 2024 also covers:
State of global ICS asset and network exposure
Sectoral targets and attacks as well as the cost of ransom
Global APT activity, AI usage, actor and tactic profiles, and implications
Rise in volumes of AI-powered cyberattacks
Major cyber events in 2024
Malware and malicious payload trends
Cyberattack types and targets
Vulnerability exploit attempts on CVEs
Attacks on counties – USA
Expansion of bot farms – how, where, and why
In-depth analysis of the cyber threat landscape across North America, South America, Europe, APAC, and the Middle East
Why are attacks on smart factories rising?
Cyber risk predictions
Axis of attacks – Europe
Systemic attacks in the Middle East
Download the full report from here:
https://sectrio.com/resources/ot-threat-landscape-reports/sectrio-releases-ot-ics-and-iot-security-threat-landscape-report-2024/
Accelerate your Kubernetes clusters with Varnish Caching Thijs Feryn
A presentation about the usage and availability of Varnish on Kubernetes. This talk explores the capabilities of Varnish caching and shows how to use the Varnish Helm chart to deploy it to Kubernetes.
This presentation was delivered at K8SUG Singapore. See https://feryn.eu/presentations/accelerate-your-kubernetes-clusters-with-varnish-caching-k8sug-singapore-28-2024 for more details.
UiPath Test Automation using UiPath Test Suite series, part 3 DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 3. In this session, we will cover desktop automation along with UI automation.
Topics covered:
UI automation Introduction,
UI automation Sample
Desktop automation flow
Pradeep Chinnala, Senior Consultant Automation Developer @WonderBotz and UiPath MVP
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
Connector Corner: Automate dynamic content and events by pushing a button DianaGray10
Here is something new! In our next Connector Corner webinar, we will demonstrate how you can use a single workflow to:
Create a campaign using Mailchimp with merge tags/fields
Send an interactive Slack channel message (using buttons)
Have the message received by managers and peers along with a test email for review
But there’s more:
In a second workflow supporting the same use case, you’ll see:
Your campaign sent to target colleagues for approval
If the “Approve” button is clicked, a Jira/Zendesk ticket is created for the marketing design team
But—if the “Reject” button is pushed, colleagues will be alerted via Slack message
Join us to learn more about this new, human-in-the-loop capability, brought to you by Integration Service connectors.
And...
Speakers:
Akshay Agnihotri, Product Manager
Charlie Greenberg, Host
2. Context
- My name is Mik Kocikowski
- I work on Data Team at Cloudflare
- We collect and process petabytes of logs per day
- We use Kafka to buffer these logs
3. Ballpark scale
- Petabytes of data per day
- Hundreds of Kafka brokers in "few" clusters
- Thousands of topic-partitions
- Read 3x input
4. The problem
- At scale, client resource use was unpredictable
- At scale, client state was incomprehensible
5. The solution
- Understand what (and why) the client actually does
- Implement the simplest possible client that meets our use case
6. Talk trajectory
- Go through basic nomenclature
- Describe the wire protocol
- Show how api calls can result in complex client state
- Simplify client for predictable behavior and resource use
7. Clients talk with brokers using the “wire protocol”
- https://kafka.apache.org/protocol
- Proprietary (but simple) asynchronous protocol over TCP
8. API “keys”
- https://kafka.apache.org/protocol#protocol_api_keys
- Numbers that identify different API calls
- Produce == 0
- Fetch == 1 (“consume”)
- Metadata == 3
- ApiVersions == 18
- Most keys have multiple versions
- Some calls can be made to any broker
- Others need to be made to the partition leader or group coordinator
9. API “messages” (requests)
- https://kafka.apache.org/protocol#protocol_messages
- Bodies of the requests and responses
- Nested structs with simple binary marshaling
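To make "simple binary marshaling" concrete, here is a minimal Go sketch of how a request could be framed on the wire: an INT32 size, then request header v1 (api_key, api_version, correlation_id, client_id), then the body. The function and constant names are illustrative assumptions and are not taken from the talk's client; the field order follows the public protocol guide for non-flexible request versions.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// API keys named on the slides.
const (
	apiKeyProduce     int16 = 0
	apiKeyFetch       int16 = 1
	apiKeyMetadata    int16 = 3
	apiKeyApiVersions int16 = 18
)

// encodeRequest (hypothetical helper) frames a request as:
// INT32 size, then header v1 (api_key INT16, api_version INT16,
// correlation_id INT32, client_id as INT16 length + bytes),
// then the already-marshaled body. All integers are big-endian.
func encodeRequest(apiKey, apiVersion int16, correlationID int32, clientID string, body []byte) []byte {
	var msg bytes.Buffer
	binary.Write(&msg, binary.BigEndian, apiKey)
	binary.Write(&msg, binary.BigEndian, apiVersion)
	binary.Write(&msg, binary.BigEndian, correlationID)
	binary.Write(&msg, binary.BigEndian, int16(len(clientID)))
	msg.WriteString(clientID)
	msg.Write(body)

	// The size prefix counts everything after itself.
	out := make([]byte, 4, 4+msg.Len())
	binary.BigEndian.PutUint32(out, uint32(msg.Len()))
	return append(out, msg.Bytes()...)
}

func main() {
	// ApiVersions v0 has an empty request body.
	frame := encodeRequest(apiKeyApiVersions, 0, 42, "demo", nil)
	fmt.Printf("% x\n", frame)
}
```

Newer "flexible" request versions use a different header with tagged fields, which this sketch does not cover.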
10. Record batches
- https://kafka.apache.org/documentation/#recordbatch
- Struct that encapsulates user data
- 1 or more records (data) + their metadata
- The unit at which data is compressed, stored, and retrieved
- Kafka >=0.11 (before that “message sets”)
- "Sweet spot" for us is a "few MB" per batch
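As a rough illustration of what the record batch struct looks like to a client, the Go sketch below reads a few fields out of the fixed 61-byte batch header (magic 2) described in the Kafka documentation. The type and function names are assumptions made for this example; only a subset of the header fields is decoded.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// batchHeaderLen is the size of the fixed part of a v2 record batch.
const batchHeaderLen = 61

// batchHeader (hypothetical type) holds a subset of the header fields.
type batchHeader struct {
	BaseOffset      int64
	BatchLength     int32
	Magic           int8
	Attributes      int16
	LastOffsetDelta int32
	NumRecords      int32
}

// parseBatchHeader decodes the big-endian header fields by byte position.
func parseBatchHeader(b []byte) (batchHeader, error) {
	if len(b) < batchHeaderLen {
		return batchHeader{}, errors.New("short record batch header")
	}
	h := batchHeader{
		BaseOffset:  int64(binary.BigEndian.Uint64(b[0:8])),
		BatchLength: int32(binary.BigEndian.Uint32(b[8:12])),
		// b[12:16] is the partition leader epoch.
		Magic: int8(b[16]),
		// b[17:21] is the CRC.
		Attributes:      int16(binary.BigEndian.Uint16(b[21:23])),
		LastOffsetDelta: int32(binary.BigEndian.Uint32(b[23:27])),
		// b[27:57] holds timestamps, producer id/epoch and base sequence.
		NumRecords: int32(binary.BigEndian.Uint32(b[57:61])),
	}
	if h.Magic != 2 {
		return h, fmt.Errorf("unsupported magic %d", h.Magic)
	}
	return h, nil
}

// Codec returns the compression codec id stored in the low 3 bits of
// attributes (0 none, 1 gzip, 2 snappy, 3 lz4, 4 zstd).
func (h batchHeader) Codec() int16 { return h.Attributes & 0x07 }

func main() {
	b := make([]byte, batchHeaderLen)
	binary.BigEndian.PutUint64(b[0:8], 1000) // base offset
	b[16] = 2                                // magic
	binary.BigEndian.PutUint16(b[21:23], 4)  // attributes: zstd
	binary.BigEndian.PutUint32(b[23:27], 9)  // last offset delta
	binary.BigEndian.PutUint32(b[57:61], 10) // record count
	h, err := parseBatchHeader(b)
	fmt.Println(h.BaseOffset, h.NumRecords, h.Codec(), err)
}
```

Because compression covers the records that follow this header, a client that works at batch granularity can route or store batches using only these fields, without decompressing anything.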
11. High level produce-fetch
- Records (user data) are collected into record batches
- Record batches are sent to Kafka via "Produce" requests (API key 0)
- Record batches are retrieved from Kafka via "Fetch" requests (API key 1)
12. High level “produce” flow (client perspective)
1. Connect to a random broker ("bootstrap")
2. Make a “Metadata” (key 3) call to get list of partition leaders for topic
3. Connect to brokers that lead individual partitions
4. Make “Produce” requests (key 0) to individual brokers
5. Goto #4
13. High level “fetch” flow (client perspective)
1. Connect to a random broker ("bootstrap")
2. Make a “Metadata” (key 3) call to get list of partition leaders for topic
3. Connect to brokers that lead individual partitions
4. Make “Fetch” (key 1) requests to individual brokers
5. Goto #4
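Steps 2 and 3 of both flows amount to turning a Metadata response into a "which broker do I talk to for which partition" table. The Go sketch below shows that grouping step only. The `partitionLeader` type is a hypothetical, trimmed-down stand-in for one partition entry of a Metadata response, and the broker ids are made up.

```go
package main

import (
	"fmt"
	"sort"
)

// partitionLeader is a hypothetical, trimmed-down view of one partition
// entry in a Metadata response.
type partitionLeader struct {
	Topic     string
	Partition int32
	LeaderID  int32
}

// groupByLeader returns, for each broker id, the partitions it leads.
// A client would then open connections to exactly these brokers.
func groupByLeader(ps []partitionLeader) map[int32][]partitionLeader {
	out := make(map[int32][]partitionLeader)
	for _, p := range ps {
		out[p.LeaderID] = append(out[p.LeaderID], p)
	}
	return out
}

func main() {
	leaders := groupByLeader([]partitionLeader{
		{"logs", 0, 101}, {"logs", 1, 102}, {"logs", 2, 101},
	})
	// Sort broker ids so the output order is stable.
	ids := make([]int, 0, len(leaders))
	for id := range leaders {
		ids = append(ids, int(id))
	}
	sort.Ints(ids)
	for _, id := range ids {
		fmt.Println("broker", id, "leads", len(leaders[int32(id)]), "partition(s)")
	}
}
```

Leadership can move between brokers, so a real client has to repeat the Metadata call and rebuild this table when a broker reports that it is no longer the leader.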
14. Produce (key 0) requests
- https://kafka.apache.org/protocol#The_Messages_Produce
- Single request can carry data for multiple topics and partitions
- Broker must be leader for every topic-partition
- For every topic-partition 1 record batch is sent per request
- “acks” and “timeout_ms” apply to the whole request
15. Produce v7 request
Produce Request (Version: 7) => transactional_id acks timeout [topic_data]
  transactional_id => NULLABLE_STRING
  acks => INT16
  timeout => INT32
  topic_data => topic [data]
    topic => STRING
    data => partition record_set
      partition => INT32
      record_set => RECORDS
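The schema above maps almost line by line onto code. The Go sketch below marshals a Produce v7 body for the single-partition case (one topic, one partition, one record batch). The function name is an assumption for illustration; the encodings used (INT16 length for strings with -1 meaning null, INT32 count for arrays, INT32 byte length before the record set) are those of the non-flexible protocol versions.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// putString writes a STRING: INT16 length followed by the bytes.
func putString(w *bytes.Buffer, s string) {
	binary.Write(w, binary.BigEndian, int16(len(s)))
	w.WriteString(s)
}

// encodeProduceV7 (hypothetical helper) marshals a Produce v7 body that
// carries one already-built record batch for one topic-partition.
func encodeProduceV7(acks int16, timeoutMs int32, topic string, partition int32, recordBatch []byte) []byte {
	var w bytes.Buffer
	binary.Write(&w, binary.BigEndian, int16(-1)) // transactional_id: null
	binary.Write(&w, binary.BigEndian, acks)
	binary.Write(&w, binary.BigEndian, timeoutMs)
	binary.Write(&w, binary.BigEndian, int32(1)) // [topic_data] has 1 element
	putString(&w, topic)
	binary.Write(&w, binary.BigEndian, int32(1)) // [data] has 1 element
	binary.Write(&w, binary.BigEndian, partition)
	binary.Write(&w, binary.BigEndian, int32(len(recordBatch))) // record_set size in bytes
	w.Write(recordBatch)
	return w.Bytes()
}

func main() {
	// The 3 bytes stand in for a real record batch.
	body := encodeProduceV7(1, 5000, "logs", 0, []byte{0xde, 0xad, 0xbf})
	fmt.Println(len(body), "bytes")
}
```

The result is only the request body. It still needs the size prefix and request header before it can be written to a connection.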
16. Produce (key 0) responses
- https://kafka.apache.org/protocol#The_Messages_Produce
- Success or failure are per topic-partition
- Partial failures possible
  - fail to ack replication
  - broker not leader for topic-partition
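Because success or failure is reported per topic-partition, a client has to inspect every partition entry in the response separately. The Go sketch below classifies a few error codes from the Kafka protocol error table. The `partitionResult` type is a hypothetical flattened view of a response entry, and the mapping from error to action is an illustrative choice, not the policy of the client described in the talk.

```go
package main

import "fmt"

// partitionResult is a hypothetical flattened view of one partition entry
// in a Produce response.
type partitionResult struct {
	Topic     string
	Partition int32
	ErrorCode int16
}

// Error codes from the Kafka protocol error table.
const (
	errNone                         int16 = 0
	errNotLeader                    int16 = 6  // NOT_LEADER_OR_FOLLOWER
	errRequestTimedOut              int16 = 7  // REQUEST_TIMED_OUT
	errNotEnoughReplicas            int16 = 19 // NOT_ENOUGH_REPLICAS
	errNotEnoughReplicasAfterAppend int16 = 20 // NOT_ENOUGH_REPLICAS_AFTER_APPEND
)

// classify suggests what a client might do with one partition's result.
func classify(r partitionResult) string {
	switch r.ErrorCode {
	case errNone:
		return "ok"
	case errNotLeader:
		return "refresh metadata and resend"
	case errRequestTimedOut, errNotEnoughReplicas, errNotEnoughReplicasAfterAppend:
		return "retry later"
	default:
		return "fail"
	}
}

func main() {
	results := []partitionResult{
		{"logs", 0, errNone},
		{"logs", 1, errNotLeader},
		{"logs", 2, errNotEnoughReplicasAfterAppend},
	}
	for _, r := range results {
		fmt.Printf("%s-%d: %s\n", r.Topic, r.Partition, classify(r))
	}
}
```

With several topic-partitions in one request, each of these outcomes can occur in the same response, which is where the retry bookkeeping starts to grow.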
18. Fetch (key 1) requests
- https://kafka.apache.org/protocol#The_Messages_Fetch
- Single request can be for data from multiple topics and partitions
- Broker must be leader for every topic-partition
- Offset must be specified for every topic-partition
- “max_wait_ms” and “min_bytes” apply to the whole request
19. Fetch (key 1) responses
- https://kafka.apache.org/protocol#The_Messages_Fetch
- There will be 0 or more record batches for each successful topic-partition
- Record batch is the unit at which data is returned (offset alignment)
- Success or failure are per topic-partition
- Partial failures possible
20. Broker connections
- Clients maintain one or more connections to each broker they talk to
- Connections in general are long lived
- Calls are asynchronous, identified by “correlation id”
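A minimal Go sketch of the bookkeeping an asynchronous client needs: every request written to a connection is remembered under its correlation id until the response carrying the same id comes back. The `inflight` type and its methods are assumptions made for this example.

```go
package main

import "fmt"

// inflight (hypothetical type) tracks requests that were written to a
// connection but have not been answered yet.
type inflight struct {
	next    int32
	pending map[int32]string // correlation id -> description of the request
}

func newInflight() *inflight {
	return &inflight{pending: make(map[int32]string)}
}

// send registers a request and returns the correlation id to put in its header.
func (f *inflight) send(desc string) int32 {
	f.next++
	f.pending[f.next] = desc
	return f.next
}

// receive matches a response to its request by correlation id and
// forgets the request. The bool is false for unknown ids.
func (f *inflight) receive(correlationID int32) (string, bool) {
	desc, ok := f.pending[correlationID]
	delete(f.pending, correlationID)
	return desc, ok
}

func main() {
	f := newInflight()
	a := f.send("Produce logs-0")
	b := f.send("Fetch logs-1")
	// Responses are matched purely by id, whatever order they are handled in.
	desc, _ := f.receive(b)
	fmt.Println(b, desc)
	desc, _ = f.receive(a)
	fmt.Println(a, desc, "still pending:", len(f.pending))
}
```

Every entry in `pending` is state that has to be timed out, retried, or cleaned up when the connection breaks.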
21. Client state can become complex
- Multiple async requests awaiting responses
- Multiple topics per request
- Multiple partitions per topic
- … any of which can be “slow” or “broken”
22. Complex client state is bad
- More resources required (memory, cpu)
- Error handling and “retry” logic is convoluted
- Troubleshooting is hard (what exactly is slow / broken?)
23. Client state can be simplified
- Separate connection for each topic-partition
- All requests synchronous
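With one connection per topic-partition and synchronous requests, a call reduces to "write the request, block, read one response". The Go sketch below shows that shape against an `io.ReadWriter`. The `call` function, `fakeConn`, and `newFakeConn` are assumptions for illustration: the fake stands in for a real TCP connection so the example runs without a broker. Response framing (INT32 size, then INT32 correlation id, then the body) follows the non-flexible response header in the protocol guide.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
	"fmt"
	"io"
)

// call writes one framed request and blocks until the response is read.
// There is at most one request in flight, so the correlation id check is
// only a sanity check.
func call(conn io.ReadWriter, correlationID int32, request []byte) ([]byte, error) {
	if _, err := conn.Write(request); err != nil {
		return nil, err
	}
	var size int32
	if err := binary.Read(conn, binary.BigEndian, &size); err != nil {
		return nil, err
	}
	if size < 4 {
		return nil, errors.New("response too short")
	}
	resp := make([]byte, size)
	if _, err := io.ReadFull(conn, resp); err != nil {
		return nil, err
	}
	if got := int32(binary.BigEndian.Uint32(resp[:4])); got != correlationID {
		return nil, fmt.Errorf("correlation id mismatch: got %d, want %d", got, correlationID)
	}
	return resp[4:], nil
}

// fakeConn stands in for a net.Conn: writes are discarded and reads
// replay a canned response.
type fakeConn struct {
	io.Reader
	io.Writer
}

func newFakeConn(canned []byte) fakeConn {
	return fakeConn{Reader: bytes.NewReader(canned), Writer: io.Discard}
}

func main() {
	// size=6, correlation id=7, body "ok"
	conn := newFakeConn([]byte{0, 0, 0, 6, 0, 0, 0, 7, 'o', 'k'})
	body, err := call(conn, 7, []byte("request bytes"))
	fmt.Printf("%q %v\n", body, err)
}
```

Any error returned here concerns exactly one topic-partition, so the caller can close the connection and start over without untangling other in-flight work.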
24. Simple client state is good
- Predictable per topic-partition resource use
- Binary error handling
- Troubleshooting is easier (isolate problems at connection level)
25. In practice
- We wrote our own Kafka client (golang)
- In production for over a year
- Processes petabytes of data every day
- Something goes wrong all the time but:
  - Resource consumption remains predictable
  - Errors are easily traceable
https://github.com/mkocikowski/libkafka
26. Conclusion
- Simplicity is a requisite of scale
- Kafka at its core is simple
- Clients that follow the Java client design are complex
Thank you!
27. Bonus point: individual records and offsets
- The unit at which Kafka operates is a record batch
- 1 or more records per record batch
- Compression is applied per record batch
- Fetch requests most efficient when aligned to record batch boundaries
- Our client operates on record batches not on individual records
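The cost of a misaligned fetch can be shown with a little offset arithmetic. A fetch returns whole record batches, so a request for an offset in the middle of a batch returns the batch from its base offset, and a record-level consumer must discard the leading part. The Go sketch below is illustrative: `batchSpan` and its methods are assumptions, built on the `base_offset` and `last_offset_delta` fields of the record batch header.

```go
package main

import "fmt"

// batchSpan (hypothetical type) describes the offsets covered by one
// record batch, taken from its base_offset and last_offset_delta fields.
type batchSpan struct {
	BaseOffset      int64
	LastOffsetDelta int32
}

// offsetsToSkip returns how many offsets at the start of the batch lie
// before the requested offset and would be transferred only to be dropped.
func (b batchSpan) offsetsToSkip(requested int64) int64 {
	if requested <= b.BaseOffset {
		return 0
	}
	return requested - b.BaseOffset
}

// next returns the offset of the first record after this batch, which is
// the offset to request to stay aligned to batch boundaries.
func (b batchSpan) next() int64 {
	return b.BaseOffset + int64(b.LastOffsetDelta) + 1
}

func main() {
	b := batchSpan{BaseOffset: 1000, LastOffsetDelta: 9} // offsets 1000..1009
	fmt.Println("skip:", b.offsetsToSkip(1004), "next aligned offset:", b.next())
}
```

A client that always continues from `next()` never fetches, decompresses, or discards part of a batch, which fits a design that operates on record batches and not on individual records.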