Kafka on Kubernetes: Does it really have to be "The Hard Way"? (Viktor Gamov and Michael Ng, Confluent) Kafka Summit NYC 2019

When it comes to choosing a distributed streaming platform for real-time data pipelines, everyone knows the answer: Apache Kafka! And when it comes to deploying applications at scale without needing to integrate different pieces of infrastructure yourself, the answer nowadays is increasingly Kubernetes. However, as with all great things, the devil is truly in the details. While Kubernetes does provide all the building blocks that are needed, a lot of thought is required to create a truly enterprise-grade Kafka platform that can be used in production. In this technical deep dive, Michael and Viktor will go through the challenges and pitfalls of managing Kafka on Kubernetes, as well as the goals and lessons learned from the development of the Confluent Operator for Kubernetes. NOTE: This talk will be delivered with Michael Ng, product manager at Confluent.

@gamussa | #kafkasummit | @confluentinc
Kafka on Kubernetes:
Does it really have to be
«The Hard Way»?
April 2019 / New York

Raffle, yeah 🚀
Follow @gamussa @confluentinc
📸 🖼 👬
Tag @gamussa
With #kafkasummit

Evolution of #devkafkaops
Shell scripts
Ansible/Chef
Docker
Kubernetes

🙋

Who runs stateless workloads in Kubernetes?
Who thinks it’s a good idea?
Who runs stateful workloads in Kubernetes?
Who thinks it’s a good idea?
🙋

The Kafkaesque world of Kafka on Kubernetes

Well, it’s tricky ©
Translating an existing architecture to Kubernetes
External access to brokers and other components
Persistent storage options on-prem and in the cloud
Security configuration and upgrades
#devkafkaops
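
One reason external access is tricky: every broker has to advertise its own address, both to clients inside the cluster and to clients outside it. The sketch below is only an illustration under stated assumptions, not the Operator's approach. It assumes a StatefulSet named `kafka` behind a headless Service `kafka-headless` in a namespace `streaming`, plus a hypothetical external DNS pattern `b<N>.kafka.example.com`; none of these names come from the talk. The property keys themselves are standard Kafka broker settings.

```python
def broker_listener_config(ordinal: int,
                           statefulset: str = "kafka",
                           headless_service: str = "kafka-headless",
                           namespace: str = "streaming",
                           external_domain: str = "kafka.example.com") -> dict:
    """Build per-broker listener settings for one StatefulSet pod.

    All default names are illustrative assumptions.
    """
    # Stable in-cluster DNS name that Kubernetes gives each StatefulSet pod.
    internal_host = (f"{statefulset}-{ordinal}.{headless_service}."
                     f"{namespace}.svc.cluster.local")
    # Hypothetical per-broker external name (for example, one load
    # balancer or node port per broker).
    external_host = f"b{ordinal}.{external_domain}"
    return {
        "broker.id": str(ordinal),
        "listeners": "INTERNAL://0.0.0.0:9092,EXTERNAL://0.0.0.0:9093",
        "advertised.listeners": (f"INTERNAL://{internal_host}:9092,"
                                 f"EXTERNAL://{external_host}:9093"),
        "listener.security.protocol.map": "INTERNAL:PLAINTEXT,EXTERNAL:SSL",
        "inter.broker.listener.name": "INTERNAL",
    }


if __name__ == "__main__":
    # Print a properties fragment for each of three brokers.
    for ordinal in range(3):
        for key, value in broker_listener_config(ordinal).items():
            print(f"{key}={value}")
        print()
```

The point of the sketch is that the configuration differs per pod, so a single static config file shared by all replicas is not enough.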

We just need to
deploy Kafka on
Kubernetes

We will use
confluentinc/cp-helm-charts

Helm charts are just Go templates.
How do charts help with a rolling restart?

We will use
StatefulSets
with OrderedReady
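
As a rough illustration of what this slide refers to, here is a minimal Python sketch that builds a StatefulSet manifest as a dictionary and prints it as JSON (which `kubectl apply -f` also accepts). The resource names, storage size, and image tag are placeholders assumed for the example, not values from the talk. `podManagementPolicy: OrderedReady` makes Kubernetes start pods one at a time in ordinal order, and the `RollingUpdate` strategy replaces them one at a time in reverse ordinal order.

```python
import json


def kafka_statefulset(replicas: int = 3) -> dict:
    """Return a minimal StatefulSet manifest (illustrative names only)."""
    labels = {"app": "kafka"}
    return {
        "apiVersion": "apps/v1",
        "kind": "StatefulSet",
        "metadata": {"name": "kafka"},
        "spec": {
            "serviceName": "kafka-headless",   # assumed headless Service
            "replicas": replicas,
            # Create pods 0..N-1 in order, each waiting for the previous
            # pod to be Running and Ready.
            "podManagementPolicy": "OrderedReady",
            # Replace pods one at a time when the pod template changes.
            "updateStrategy": {"type": "RollingUpdate"},
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [{
                        "name": "kafka",
                        # Placeholder tag: pin a real version in practice.
                        "image": "confluentinc/cp-kafka:PLACEHOLDER_TAG",
                        "ports": [{"containerPort": 9092}],
                        "volumeMounts": [{
                            "name": "data",
                            "mountPath": "/var/lib/kafka/data",
                        }],
                    }],
                },
            },
            # One PersistentVolumeClaim per broker pod.
            "volumeClaimTemplates": [{
                "metadata": {"name": "data"},
                "spec": {
                    "accessModes": ["ReadWriteOnce"],
                    "resources": {"requests": {"storage": "10Gi"}},
                },
            }],
        },
    }


if __name__ == "__main__":
    print(json.dumps(kafka_statefulset(), indent=2))
```

Note that Kubernetes only looks at pod readiness here. It knows nothing about partition leadership or under-replicated partitions, which is the gap the following slides point at.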

We need SRE /
Operator knowledge to
manage the platform.
You need an Operator!

Show me your
Operator

Demo

DO KAFKA ON KUBERNETES DEMO
AND EVERYONE LOSES THEIR MIND

What just happened?
ZK and Kafka deployed
Security with TLS is configured
External access is configured
Monitoring is enabled

Confluent Operator - Automated
Security Configuration
SASL PLAIN and Mutual TLS Authentication
Automate configuration of truststores and
keystores with secret objects
Automate configuration of Kafka and all
Confluent Platform Components

Confluent Operator - Scale
Automate Scaling:
Spin up new brokers and Connect workers easily
Distribute partitions to new brokers:
Determine balancing plan
Execute balancing plan
Monitor Resources
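
To make "determine balancing plan" concrete, here is a toy Python sketch of one possible approach: greedily move replicas from the most loaded broker to the least loaded one until replica counts differ by at most one. This is an assumption made for illustration, not the algorithm the Confluent Operator uses, and it ignores real-world concerns such as rack awareness, disk usage, and throttling. The function name `plan_rebalance` and the partition names are hypothetical.

```python
from collections import Counter


def plan_rebalance(assignments: dict, brokers: list) -> list:
    """Return a list of (partition, from_broker, to_broker) moves.

    `assignments` maps a partition name to its list of replica broker ids.
    The input is not modified.
    """
    current = {p: list(replicas) for p, replicas in assignments.items()}
    moves = []
    while True:
        # Count replicas per broker, including brokers that hold none yet.
        load = Counter({b: 0 for b in brokers})
        for replicas in current.values():
            load.update(replicas)
        busiest = max(brokers, key=lambda b: load[b])
        idlest = min(brokers, key=lambda b: load[b])
        if load[busiest] - load[idlest] <= 1:
            return moves  # balanced enough
        # Pick a partition hosted on the busiest broker that the idlest
        # broker does not already hold a replica of.
        candidate = next(
            (p for p in sorted(current)
             if busiest in current[p] and idlest not in current[p]),
            None,
        )
        if candidate is None:
            return moves  # nothing safe to move
        replicas = current[candidate]
        replicas[replicas.index(busiest)] = idlest
        moves.append((candidate, busiest, idlest))


if __name__ == "__main__":
    # Six partitions with two replicas each on brokers 0-2; broker 3 is new.
    before = {
        "p0": [0, 1], "p1": [1, 2], "p2": [2, 0],
        "p3": [0, 1], "p4": [1, 2], "p5": [2, 0],
    }
    for partition, src, dst in plan_rebalance(before, [0, 1, 2, 3]):
        print(f"move {partition}: broker {src} -> broker {dst}")
```

Executing such a plan on a real cluster means copying partition data between brokers, which is why the slide lists planning, execution, and resource monitoring as separate steps.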

Be like Justin!

Rolling Upgrade
Kafka Broker Upgrades:
1. Stop the broker, upgrade Kafka
2. Wait for partition leader reassignment
3. Start the upgraded broker
4. Wait for zero under-replicated partitions
5. Upgrade the next broker
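
The five steps above can be sketched as a loop. The following Python example runs against an in-memory toy cluster, so every class and function in it (`Broker`, `ToyCluster`, `wait_until`, `rolling_upgrade`) is hypothetical and exists only to show the ordering of the steps. A real implementation would query the brokers' metrics and poll with delays between attempts.

```python
from dataclasses import dataclass, field


@dataclass
class Broker:
    version: str
    running: bool = True


@dataclass
class ToyCluster:
    brokers: dict        # broker id -> Broker
    assignments: dict    # partition -> ordered list of replica broker ids
    events: list = field(default_factory=list)

    def leader(self, partition):
        # Toy rule: the first running replica in the list acts as leader.
        for broker_id in self.assignments[partition]:
            if self.brokers[broker_id].running:
                return broker_id
        return None

    def under_replicated(self):
        # Partitions with at least one replica on a stopped broker.
        return [p for p, replicas in self.assignments.items()
                if any(not self.brokers[b].running for b in replicas)]


def wait_until(condition, attempts=10):
    # A real operator would sleep between polls; the toy cluster
    # changes state instantly, so a plain retry loop is enough here.
    for _ in range(attempts):
        if condition():
            return
    raise TimeoutError("condition not met")


def rolling_upgrade(cluster, target_version):
    for broker_id in sorted(cluster.brokers):
        broker = cluster.brokers[broker_id]
        # 1. Stop the broker, upgrade Kafka.
        broker.running = False
        broker.version = target_version
        cluster.events.append(f"upgraded broker {broker_id}")
        # 2. Wait until every partition has a leader on another broker.
        wait_until(lambda: all(cluster.leader(p) is not None
                               for p in cluster.assignments))
        # 3. Start the upgraded broker.
        broker.running = True
        # 4. Wait for zero under-replicated partitions.
        wait_until(lambda: not cluster.under_replicated())
        cluster.events.append(f"broker {broker_id} back in sync")
        # 5. The loop moves on to the next broker.


if __name__ == "__main__":
    cluster = ToyCluster(
        brokers={i: Broker(version="v1") for i in range(3)},
        assignments={"p0": [0, 1], "p1": [1, 2], "p2": [2, 0]},
    )
    rolling_upgrade(cluster, "v2")
    print("\n".join(cluster.events))
```

Only one broker is ever down at a time, and the loop refuses to continue while any partition is under-replicated. That is the operational knowledge a plain StatefulSet rollout does not have.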

Will it fly?
vs.

GA Plans
● We are in private Preview Release now
● 24 customers testing the Operator in Preview:
● Global customers
● Banks, Fin Tech, Retailers, Consumer Tech
● We are in the final stages of Preview and about to launch soon

Thanks!
@gamussa
viktor@confluent.io
michael.ng@confluent.io
https://slackpass.io/confluentcommunity
#kubernetes


Crossing the Streams: Rethinking Stream Processing with KStreams and KSQL

confluent

(Viktor Gamov, Confluent) Kafka Summit SF 2018 All things change constantly! And dealing with constantly changing data at low latency is pretty hard. It doesn’t need to be that way. Apache Kafka, the de facto standard open source distributed stream processing system. Many of us know Kafka’s architectural and pub/sub API particulars. But that doesn’t mean we’re equipped to build the kind of real-time streaming data systems that the next generation of business requirements are going to demand. We need to get on board with streams! Viktor Gamov will introduce Kafka Streams and KSQL—an important recent addition to the Confluent Open Source platform that lets us build sophisticated stream processing systems with little to no code at all! They will talk about how to deploy stream processing applications and look at the actual working code that will bring your thinking about streaming data systems from the ancient history of batch processing into the current era of streaming data! P.S. No prior knowledge of Kafka Streams, KSQL or Ghostbusters needed!

Proxies, gateways, and meshes cloud connectivity patterns for developers

LibbySchulze

Day 2 Kubernetes - Tools for Operability (Velocity London Meetup)

bridgetkromhout

Building Event-Driven Applications with Apache Kafka & Confluent Platform

confluent

Watch this talk here: https://www.confluent.io/online-talks/building-event-driven-applications-apache-kafka-and-confluent-platform Apache Kafka® has become the de facto technology for real-time event streaming. Confluent Platform, developed by the creators of Apache Kafka, is an event-streaming platform that enables the ingest and processing of massive amounts of data in real time. In this session, we will cover the easiest ways to start developing event-driven applications with Apache Kafka using Confluent Platform. We will also demo a contextual event-driven application built using our ecosystem of connectors, REST proxy, and a variety of native clients. View now to learn: -How to create Apache Kafka topics in minutes and process event streams in real time -Check the health of an Apache Kafka broker using Confluent Control Center -The latest enhancements to Confluent Platform that make it easier to run Apache Kafka at scale -How to use KSQL, streaming SQL for Apache Kafka, to process event streams in real time using simple SQL queries

Kubernetes Operability Tooling (LEAP 2019)

bridgetkromhout

What is Apache Kafka®?

confluent

Viktor Gamov, Confluent, Developer Advocate Apache Kafka is an open source distributed streaming platform that allows you to build applications and process events as they occur. Viktor Gamov (developer Advocate at Confluent) walks through how it works and important underlying concepts. As a real-time, scalable, and durable system, Kafka can be used for fault-tolerant storage as well as for other use cases, such as stream processing, centralized data management, metrics, log aggregation, event sourcing, and more. This talk will explain what a streaming platform such as Apache Kafka is and some of the use cases and design patterns around its use—including several examples of where it is solving real business problems. https://www.meetup.com/Chennai-Kafka/events/269942117/

Stories from running Kafka on K8S.pdf

AvinashUpadhyaya3

Using CVMFS on a distributed Kubernetes cluster - The PRP Experience

Igor Sfiligoi

NAB Tech Talk

confluent

Kubernetes Operability Tooling (GOTO Chicago 2019)

bridgetkromhout

Case-Study: Building Real-Time Applications at Scale-Cyclist Crash Detection ...

HostedbyConfluent

"As the demand for real-time data processing continues to grow, so too do the challenges associated with building production-ready applications that can handle large volumes of data and handle it quickly. In this talk, we will explore common problems faced when building real-time applications at scale, with a focus on a specific use case: detecting and responding to cyclist crashes. Using telemetry data collected from a fitness app, we’ll demonstrate how we used a combination of Apache Kafka and Python-based microservices running on Kubernetes to build a pipeline for processing and analyzing this data in real-time. We'll also discuss how we used machine learning techniques to build a model for detecting collisions and how we implemented notifications to alert family members of a crash. Our ultimate goal is to help you navigate the challenges that come with building data-intensive, real-time applications that use ML models. By showcasing a real-world example, we aim to provide practical solutions and insights that you can apply to your own projects. Key takeaways: - An understanding of the common challenges faced when building real-time applications at scale - Strategies for using Apache Kafka and Python-based microservices to process and analyze data in real-time - Tips for implementing machine learning models in a real-time application - Best practices for responding to and handling critical events in a real-time application"

Kubernetes-Native DevOps: For Apache Kafka® with Confluent

confluent

Viktor Gamov, Confluent, Developer Advocate Confluent Operator allows you to run Apache Kafka® on Kubernetes for simplified operations such as microservices communication, visibility and monitoring, upgrades, scaling, and cluster management built into the Kubernetes platform. Now, Confluent Operator is evolving into a Kubernetes-native, extensible approach to managing the complete cloud-native event streaming platform on Kubernetes. In this demo, Viktor Gamov (Developer Advocate, Confluent), highlights a typical Kafka on Kubernetes operations use case: fixing production issues with validation in a test environment. We'll demonstrate how the Confluent Operator's evolution empowers you to use a declarative spec to quickly deploy and manage your event streaming applications and the Confluent Platform. Recording to be available cnfl.io/meetup-hub https://www.meetup.com/Chennai-Kafka/events/276994551/

A Primer Towards Running Kafka on Top of Kubernetes.pdf

AvinashUpadhyaya3

Day 2 Kubernetes - Tools for Operability (QConSF)

bridgetkromhout

Deploying your first application with Kubernetes

OVHcloud

Similar to Kafka on Kubernetes: Does it really have to be "The Hard Way"? (Viktor Gamov and Michael Ng, Confluent) Kafka Summit NYC 2019 (20)