Discover Kafka on OpenShift: Processing Real-Time Financial Events at Scale (Anvesh Samineni, Discover Financial) Kafka Summit 2020

"To provide exceptional customer experiences at scale, the data pipelines that move data reliably across systems and applications in real time must be seamlessly scalable. For the past several years, we relied on message-queue-based data pipelines to transfer data across applications. However, as the number of use cases requiring real-time data transfer grew rapidly, it became difficult to scale the messaging platform. Moving to Kafka helped us resolve the data pipeline scaling issues and reduce publisher/subscriber onboarding time from several weeks to a few days. To support on-demand scaling of Kafka clusters, we run them on Red Hat OpenShift, an enterprise Kubernetes platform. While managing Kafka clusters that handle critical financial events, we have learned some lessons and developed efficient strategies for managing production-grade Kafka clusters on OpenShift. In this talk, we will present: 1. Some of the challenges we faced with Kafka on OpenShift and how we evolved our infrastructure to overcome them. 2. Our experiences from operating Kafka clusters at scale in production. 3. Our strategy for performing automated Kafka deployment and rollback in OpenShift. 4. Our failover strategy, which uses Confluent’s Replicator to ensure service availability during cluster failures."

Discover Kafka on OpenShift:
Processing Real-Time Financial Events at Scale
The opinions expressed in this presentation are those of the presenter, in their individual capacity, and not necessarily those of Discover.
Anvesh Samineni
Senior Software Engineer
Ehfaj Khan
Principal Software Engineer

Agenda
• Provisioning Kafka Infrastructure on OpenShift for High Availability and Performance
• Kafka Deployment and Rollback Strategy in OpenShift
• Multi-Cluster Replication Design and Failover Strategy

Provisioning Kafka Infrastructure on OpenShift for High Availability and Performance
Page Cache
Shared node (larger node, many pods):
• Page Cache is shared
• Performance might vary
• Pods are dynamically provisioned
Dedicated node (smaller node, only the Kafka pod):
• Page Cache is dedicated
• Fewer performance variations
• Only the Kafka pod is provisioned

Assigning Pods to Nodes (1/4)
Node Affinity
A Kafka pod should be scheduled only on a Kafka node.
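
The node affinity rule above can be sketched as the `affinity` fragment of a pod spec, built here as a Python dictionary. This is a minimal sketch, not the presenters' actual manifest: the node label `dedicated=kafka` is an assumption made for illustration, and only the field names come from the Kubernetes pod spec.

```python
import json

# Hypothetical label assumed to mark the dedicated Kafka nodes (not from the talk).
KAFKA_NODE_LABEL_KEY = "dedicated"
KAFKA_NODE_LABEL_VALUE = "kafka"


def node_affinity(label_key: str, label_value: str) -> dict:
    """Build a required node-affinity rule for the `affinity` field of a pod spec."""
    return {
        "nodeAffinity": {
            # "required...": the scheduler will not place the pod on a non-matching node.
            "requiredDuringSchedulingIgnoredDuringExecution": {
                "nodeSelectorTerms": [
                    {
                        "matchExpressions": [
                            {
                                "key": label_key,
                                "operator": "In",
                                "values": [label_value],
                            }
                        ]
                    }
                ]
            }
        }
    }


affinity = node_affinity(KAFKA_NODE_LABEL_KEY, KAFKA_NODE_LABEL_VALUE)
print(json.dumps(affinity, indent=2))
```

Node affinity only pulls Kafka pods toward Kafka nodes; it does not keep other pods away from those nodes.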

Assigning Pods to Nodes (2/4)
Pod Anti-Affinity
How Kafka pods should be placed relative to one another
(Required: Do not schedule a Kafka pod on a node where a Kafka pod already exists.)
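
A required anti-affinity rule of this kind might look like the fragment below. It is a sketch under assumptions: the pod label `app=kafka` is hypothetical, and `kubernetes.io/hostname` is used as the topology key so that "same topology" means "same node".

```python
import json


def required_pod_anti_affinity(app_label: str) -> dict:
    """Build a hard anti-affinity rule: at most one matching pod per node."""
    return {
        "podAntiAffinity": {
            "requiredDuringSchedulingIgnoredDuringExecution": [
                {
                    # Selects the pods this pod must stay away from (label is an assumption).
                    "labelSelector": {"matchLabels": {"app": app_label}},
                    # Each distinct hostname is one topology domain, i.e. one node.
                    "topologyKey": "kubernetes.io/hostname",
                }
            ]
        }
    }


required_rule = required_pod_anti_affinity("kafka")
print(json.dumps(required_rule, indent=2))
```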

Assigning Pods to Nodes (3/4)
Pod Anti-Affinity
How Kafka pods should be placed relative to one another
(Preferred: Try to schedule Kafka pods across availability zones (AZs).)
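
The preferred (soft) variant could be sketched as follows. The pod label `app=kafka` and the weight of 100 are assumptions for illustration. The zone label `topology.kubernetes.io/zone` is the current well-known Kubernetes label; older clusters may use a different zone label.

```python
import json


def preferred_zone_anti_affinity(app_label: str, weight: int = 100) -> dict:
    """Build a soft anti-affinity rule that spreads matching pods across zones."""
    if not 1 <= weight <= 100:
        raise ValueError("weight must be between 1 and 100")
    return {
        "podAntiAffinity": {
            # "preferred...": the scheduler tries to honor the rule but may violate it.
            "preferredDuringSchedulingIgnoredDuringExecution": [
                {
                    "weight": weight,
                    "podAffinityTerm": {
                        "labelSelector": {"matchLabels": {"app": app_label}},
                        "topologyKey": "topology.kubernetes.io/zone",
                    },
                }
            ]
        }
    }


preferred_rule = preferred_zone_anti_affinity("kafka")
print(json.dumps(preferred_rule, indent=2))
```

Because the rule is only preferred, a Kafka pod can still be scheduled when every zone already hosts one, which a required rule would not allow.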

Assigning Pods to Nodes (4/4)
Taints and Tolerations
Other pods should NOT be scheduled on a Kafka node.
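
To illustrate how a taint keeps other pods off a Kafka node, the sketch below models the Kubernetes matching rule between a toleration and a taint. The taint `dedicated=kafka:NoSchedule` is an assumed example, and the functions are a simplified model of the scheduler's check, not its implementation.

```python
from typing import Dict, List

# Hypothetical taint assumed to be placed on the dedicated Kafka nodes.
KAFKA_TAINT = {"key": "dedicated", "value": "kafka", "effect": "NoSchedule"}

# Matching toleration that only the Kafka pods would carry.
KAFKA_TOLERATION = {
    "key": "dedicated",
    "operator": "Equal",
    "value": "kafka",
    "effect": "NoSchedule",
}


def tolerates(toleration: Dict[str, str], taint: Dict[str, str]) -> bool:
    """Return True if the toleration matches the taint."""
    # A toleration without an effect matches every effect.
    if toleration.get("effect") and toleration["effect"] != taint["effect"]:
        return False
    if toleration.get("operator", "Equal") == "Exists":
        # Exists ignores the value; an empty key matches every taint key.
        return not toleration.get("key") or toleration["key"] == taint["key"]
    return (
        toleration.get("key") == taint["key"]
        and toleration.get("value", "") == taint.get("value", "")
    )


def can_schedule(pod_tolerations: List[Dict[str, str]], node_taints: List[Dict[str, str]]) -> bool:
    """A pod fits a node only if every blocking taint is tolerated."""
    blocking = [t for t in node_taints if t["effect"] in ("NoSchedule", "NoExecute")]
    return all(any(tolerates(tol, taint) for tol in pod_tolerations) for taint in blocking)


print(can_schedule([KAFKA_TOLERATION], [KAFKA_TAINT]))  # Kafka pod
print(can_schedule([], [KAFKA_TAINT]))                  # any other pod
```

A toleration only permits scheduling on the tainted node. It is the node affinity rule that actually directs Kafka pods there, which is why the two mechanisms are used together.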

Handling Disruptions
Disruptions:
• Draining nodes accidentally
• Deleting many pods at a time
PodDisruptionBudget
Limits the number of concurrent disruptions
(Example: minAvailable = 2, since 2 concurrent disruptions would leave only 1 available Kafka pod)
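
The budget and its arithmetic can be sketched as below. The object name `kafka-pdb` and the pod label `app=kafka` are assumptions. The API version shown is `policy/v1`; older clusters expose the same object under `policy/v1beta1`.

```python
import json


def pod_disruption_budget(name: str, app_label: str, min_available: int) -> dict:
    """Build a PodDisruptionBudget manifest as a dictionary."""
    return {
        "apiVersion": "policy/v1",
        "kind": "PodDisruptionBudget",
        "metadata": {"name": name},
        "spec": {
            "minAvailable": min_available,
            "selector": {"matchLabels": {"app": app_label}},
        },
    }


def allowed_disruptions(healthy_pods: int, min_available: int) -> int:
    """Number of pods that may be disrupted right now without breaking the budget."""
    return max(0, healthy_pods - min_available)


pdb = pod_disruption_budget("kafka-pdb", "kafka", min_available=2)
print(json.dumps(pdb, indent=2))

# With 3 healthy Kafka pods and minAvailable = 2, only one disruption is allowed at a time.
print(allowed_disruptions(healthy_pods=3, min_available=2))
```

Kubernetes applies the budget to evictions, such as those issued by a node drain, so a second eviction is refused until the first pod is healthy again.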

Summary
Dedicated nodes for Kafka pods
Node Affinity
A Kafka pod should be scheduled only on a Kafka node.
Pod Anti-Affinity
How Kafka pods should be placed relative to one another
• Required: Do not schedule a Kafka pod on a node where a Kafka pod already exists.
• Preferred: Try to schedule Kafka pods across availability zones (AZs).
Taints and Tolerations
Other pods should NOT be scheduled on a Kafka node.
PodDisruptionBudget
Limits the number of concurrent disruptions

Kafka Deployment and Rollback Strategy in OpenShift
Deployment Strategy
Deployment Strategy: OnDelete
Identify the Active Controller pod (upgrade it last).
Repeat for every pod except the Active Controller (starting from the last one):
1. Delete the pod
2. Wait till URP=0 (URP: under-replicated partitions)
3. Once URP=0, delete the next pod
Then upgrade the Active Controller:
1. Delete the Active Controller pod
2. Wait till URP=0
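
The upgrade sequence above can be sketched as a small driver loop. Everything here is illustrative: the pod names, the callbacks, and the `FakeCluster` stand-in are assumptions, and a real script would delete pods through the cluster API and read the under-replicated partition count from broker metrics. It would also wait for the restarted pod to become Ready before checking URP.

```python
from typing import Callable, List


def upgrade_order(pods: List[str], controller: str) -> List[str]:
    """Last pod first, with the active controller moved to the very end."""
    others = [pod for pod in reversed(pods) if pod != controller]
    return others + [controller]


def rolling_upgrade(
    pods: List[str],
    controller: str,
    delete_pod: Callable[[str], None],
    urp_count: Callable[[], int],
    wait: Callable[[], None],
) -> List[str]:
    """Delete pods one at a time, waiting for URP=0 before moving on."""
    upgraded = []
    for pod in upgrade_order(pods, controller):
        delete_pod(pod)          # with OnDelete, the pod is recreated from the new spec
        while urp_count() > 0:   # wait until no partition is under-replicated
            wait()
        upgraded.append(pod)
    return upgraded


class FakeCluster:
    """In-memory stand-in used only to exercise the loop."""

    def __init__(self) -> None:
        self.deleted: List[str] = []
        self.urp = 0

    def delete_pod(self, pod: str) -> None:
        self.deleted.append(pod)
        self.urp = 2  # pretend two partitions fall behind after each restart

    def urp_count(self) -> int:
        return self.urp

    def wait(self) -> None:
        self.urp -= 1  # pretend one partition catches up per wait


cluster = FakeCluster()
brokers = ["kafka-0", "kafka-1", "kafka-2"]
print(rolling_upgrade(brokers, "kafka-1", cluster.delete_pod, cluster.urp_count, cluster.wait))
```

Restarting the active controller last keeps the controller role from moving more than once during the upgrade.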

Rollback Strategy
Deployment Strategy: OnDelete
Identify the Active Controller pod (upgrade it last).
Repeat for every pod except the Active Controller (starting from the last one):
1. Delete the pod
2. Wait till URP=0
3. Once URP=0, delete the next pod
If one of the pods fails to restart:
Revert the StatefulSet to the previous version.
Repeat for all upgraded pods in the reverse order of upgrade (starting with the pod that failed to restart):
1. Delete the pod
2. Wait till URP=0
3. Once URP=0, delete the previous pod
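
The rollback ordering can be sketched as a pure function. The pod names are hypothetical, and the function only computes the order in which pods would be deleted after the StatefulSet has been reverted. Each deletion would still be followed by the same wait for URP=0.

```python
from typing import List


def rollback_order(upgraded: List[str], failed: str) -> List[str]:
    """Order in which to delete pods during a rollback.

    `upgraded` lists the pods already restarted on the new version, in the
    order they were upgraded; `failed` is the pod that failed to restart.
    """
    # Start with the failed pod, then walk back through the upgraded pods.
    return [failed] + [pod for pod in reversed(upgraded) if pod != failed]


# kafka-3 and kafka-2 were upgraded successfully, then kafka-1 failed to restart.
print(rollback_order(["kafka-3", "kafka-2"], "kafka-1"))
```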

Multi-Cluster Replication Design and Failover Strategy
Replicator replicates:
• Topics
• Messages
• Consumer groups

During Failover:
• Flip to the bootstrap URL of the secondary cluster
• Stop the Replicator
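
On the client side, the flip can be as small as choosing a different `bootstrap.servers` value. The sketch below assumes hypothetical bootstrap URLs and a boolean failover flag. How the flag is set (configuration service, DNS change, redeploy) is not described in the slides.

```python
# Hypothetical bootstrap URLs; the real ones are not given in the talk.
PRIMARY_BOOTSTRAP = "kafka-primary.example.com:9092"
SECONDARY_BOOTSTRAP = "kafka-secondary.example.com:9092"


def client_config(failed_over: bool, group_id: str) -> dict:
    """Build Kafka client properties that point at the active cluster."""
    return {
        "bootstrap.servers": SECONDARY_BOOTSTRAP if failed_over else PRIMARY_BOOTSTRAP,
        # The group id stays the same so the consumer group replicated to the
        # secondary cluster can be picked up after the flip.
        "group.id": group_id,
    }


print(client_config(failed_over=False, group_id="payments-consumer"))
print(client_config(failed_over=True, group_id="payments-consumer"))
```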

Important to Enable Failover
Monitor Replicator
• Connectors are running
• Sufficient tasks are provisioned
• No replication lag
Centralized Schema Registry
Enable Timestamp Interceptors
• Allows a subscription to continue in the secondary cluster from where it left off in the primary cluster
• Consumer groups in the secondary cluster are created by the Replicator
Provision ACLs for producers and consumers in the secondary cluster
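
Since Replicator runs as a Kafka Connect connector, the "connectors are running" and "sufficient tasks" checks from the monitoring list could be sketched against the Kafka Connect REST status endpoint (`GET /connectors/{name}/status`). The Connect URL, connector name, and task threshold are assumptions. Replication lag would need a separate metric and is not covered here.

```python
import json
import urllib.request


def fetch_status(connect_url: str, connector: str) -> dict:
    """Fetch a connector's status document from the Kafka Connect REST API."""
    with urllib.request.urlopen(f"{connect_url}/connectors/{connector}/status") as resp:
        return json.load(resp)


def replicator_healthy(status: dict, min_tasks: int) -> bool:
    """True if the connector and all of its tasks are RUNNING and there are enough tasks."""
    tasks = status.get("tasks", [])
    return (
        status.get("connector", {}).get("state") == "RUNNING"
        and len(tasks) >= min_tasks
        and all(task.get("state") == "RUNNING" for task in tasks)
    )


# Example status document in the shape returned by the status endpoint.
sample = {
    "name": "replicator",
    "connector": {"state": "RUNNING", "worker_id": "connect-0:8083"},
    "tasks": [
        {"id": 0, "state": "RUNNING", "worker_id": "connect-0:8083"},
        {"id": 1, "state": "FAILED", "worker_id": "connect-1:8083"},
    ],
}
print(replicator_healthy(sample, min_tasks=2))
```

A check like this could run on a schedule and raise an alert when it returns False, so a failed task is noticed before a failover is needed.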

Peter Spielvogel

Building better applications for business users with SAP Fiori. • What is SAP Fiori and why it matters to you • How a better user experience drives measurable business benefits • How to get started with SAP Fiori today • How SAP Fiori elements accelerates application development • How SAP Build Code includes SAP Fiori tools and other generative artificial intelligence capabilities • How SAP Fiori paves the way for using AI in SAP apps

The Art of the Pitch: WordPress Relationships and Sales

Laura Byrne

Clients don’t know what they don’t know. What web solutions are right for them? How does WordPress come into the picture? How do you make sure you understand scope and timeline? What do you do if sometime changes? All these questions and more will be explored as we talk about matching clients’ needs with what your agency offers without pulling teeth or pulling your hair out. Practical tips, and strategies for successful relationship building that leads to closing the deal.

GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...

Neo4j

Sudheer Mechineni, Head of Application Frameworks, Standard Chartered Bank Discover how Standard Chartered Bank harnessed the power of Neo4j to transform complex data access challenges into a dynamic, scalable graph database solution. This keynote will cover their journey from initial adoption to deploying a fully automated, enterprise-grade causal cluster, highlighting key strategies for modelling organisational changes and ensuring robust disaster recovery. Learn how these innovations have not only enhanced Standard Chartered Bank’s data infrastructure but also positioned them as pioneers in the banking sector’s adoption of graph technology.


Discover Kafka on OpenShift: Processing Real-Time Financial Events at Scale (Anvesh Samineni, Discover Financial) Kafka Summit 2020

1. Discover Kafka on OpenShift: Processing Real-Time Financial Events at Scale
Anvesh Samineni, Senior Software Engineer
Ehfaj Khan, Principal Software Engineer
The opinions expressed in this presentation are those of the presenter, in their individual capacity, and not necessarily those of Discover.

2. Agenda
• Provisioning Kafka Infrastructure on OpenShift for High Availability and Performance
• Kafka Deployment and Rollback Strategy in OpenShift
• Multi-Cluster Replication Design and Failover Strategy

3. Provisioning Kafka Infrastructure on OpenShift for High Availability and Performance: Page Cache
Larger node, many pods:
• Page cache is shared
• Performance might vary
• Pods are dynamically provisioned
Smaller node, only Kafka pod:
• Page cache is dedicated
• Fewer performance variations
• Only the Kafka pod is provisioned

4. Provisioning Kafka Infrastructure on OpenShift for High Availability and Performance: Assigning Pods to Nodes (1/4)
Node Affinity: Kafka pods should go to Kafka nodes only.

5. Provisioning Kafka Infrastructure on OpenShift for High Availability and Performance: Assigning Pods to Nodes (2/4)
Pod Anti-Affinity: how Kafka pods should be placed relative to one another.
• Required: do not schedule if a Kafka pod already exists on the node.

6. Provisioning Kafka Infrastructure on OpenShift for High Availability and Performance: Assigning Pods to Nodes (3/4)
Pod Anti-Affinity: how Kafka pods should be placed relative to one another.
• Preferred: try to schedule Kafka pods across AZs.

7. Provisioning Kafka Infrastructure on OpenShift for High Availability and Performance: Assigning Pods to Nodes (4/4)
Taints and Tolerations: other pods should NOT go to Kafka nodes.
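
The four placement rules from slides 4 to 7 can be sketched as the scheduling part of a pod spec. This is a minimal sketch, not the presenters' configuration: the field names follow the Kubernetes pod spec, but the node label `dedicated=kafka`, the pod label `app=kafka`, and the taint key are hypothetical names chosen for illustration, and a matching taint would still have to be applied to the Kafka nodes separately.

```python
import json

# Hypothetical names: the slides do not say which labels or taints are used.
NODE_LABEL_KEY = "dedicated"
NODE_LABEL_VALUE = "kafka"
POD_LABELS = {"app": "kafka"}


def kafka_scheduling_spec():
    """Return the scheduling-related fields of a Kafka pod spec as a dict."""
    kafka_pods = {"matchLabels": POD_LABELS}
    return {
        "affinity": {
            # Node affinity: Kafka pods go to Kafka nodes only.
            "nodeAffinity": {
                "requiredDuringSchedulingIgnoredDuringExecution": {
                    "nodeSelectorTerms": [{
                        "matchExpressions": [{
                            "key": NODE_LABEL_KEY,
                            "operator": "In",
                            "values": [NODE_LABEL_VALUE],
                        }]
                    }]
                }
            },
            "podAntiAffinity": {
                # Required: never place two Kafka pods on the same node.
                "requiredDuringSchedulingIgnoredDuringExecution": [{
                    "labelSelector": kafka_pods,
                    "topologyKey": "kubernetes.io/hostname",
                }],
                # Preferred: try to spread Kafka pods across zones.
                "preferredDuringSchedulingIgnoredDuringExecution": [{
                    "weight": 100,
                    "podAffinityTerm": {
                        "labelSelector": kafka_pods,
                        "topologyKey": "topology.kubernetes.io/zone",
                    },
                }],
            },
        },
        # Toleration: lets Kafka pods onto nodes tainted to repel other pods.
        "tolerations": [{
            "key": NODE_LABEL_KEY,
            "operator": "Equal",
            "value": NODE_LABEL_VALUE,
            "effect": "NoSchedule",
        }],
    }


if __name__ == "__main__":
    print(json.dumps(kafka_scheduling_spec(), indent=2))
```

Node affinity and the toleration work as a pair: affinity pulls Kafka pods toward the dedicated nodes, while the taint on those nodes keeps every pod without the toleration away.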

8. Provisioning Kafka Infrastructure on OpenShift for High Availability and Performance: Handling Disruptions
Disruptions:
• Draining nodes accidentally
• Deleting many pods at a time
PodDisruptionBudget: limit the number of concurrent disruptions (example: minAvailable = 2, since 2 concurrent disruptions lead to 1 available Kafka pod).
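
The arithmetic behind the minAvailable example can be worked through in a few lines. This sketch assumes a three-broker cluster, which is what the slide's numbers imply but not something it states outright; the function name is hypothetical, and the calculation mirrors how a PodDisruptionBudget counts allowed voluntary disruptions (healthy pods minus the required minimum).

```python
def allowed_disruptions(healthy_pods: int, min_available: int) -> int:
    """Number of pods that may still be evicted without breaking the budget."""
    return max(0, healthy_pods - min_available)


if __name__ == "__main__":
    brokers = 3          # assumption: three Kafka pods in the cluster
    min_available = 2    # value from the slide

    # With all brokers healthy, exactly one eviction is permitted.
    print(allowed_disruptions(brokers, min_available))

    # Once one broker is down, further evictions are refused, so the
    # cluster cannot drop to a single available Kafka pod.
    print(allowed_disruptions(brokers - 1, min_available))
```

Note that a budget only guards against voluntary disruptions such as node drains; it does not prevent a pod from being lost to a node failure.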

9. Provisioning Kafka Infrastructure on OpenShift for High Availability and Performance: Summary
Dedicated nodes for Kafka pods:
• Node Affinity: Kafka pods should go to Kafka nodes only.
• Pod Anti-Affinity: how Kafka pods should be placed relative to one another. Required: do not schedule if a Kafka pod already exists on the node. Preferred: try to schedule Kafka pods across AZs.
• Taints and Tolerations: other pods should NOT go to Kafka nodes.
• PodDisruptionBudget: limit the number of concurrent disruptions.

10. Kafka Deployment and Rollback Strategy in OpenShift: Deployment Strategy
Deployment strategy: onDelete
Identify the Active Controller pod (upgrade it last).
Repeat for every pod except the Active Controller, starting from the last one:
1. Delete the pod
2. Wait till URP (under-replicated partitions) = 0
3. Once URP=0, delete the next pod
Then, for the Active Controller:
1. Delete the Active Controller pod
2. Wait till URP=0
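
The upgrade sequence on this slide can be sketched as a small driver loop. This is an illustrative sketch under stated assumptions, not the presenters' tooling: `delete_pod` and `urp_count` are hypothetical callbacks that a real script would implement against the cluster and a Kafka metrics source, and "starting from the last one" is read here as "highest StatefulSet ordinal first". In Kubernetes the corresponding StatefulSet update strategy is spelled `OnDelete`.

```python
import time
from typing import Callable, List


def upgrade_order(pod_ordinals: List[int], controller_ordinal: int) -> List[int]:
    """Highest ordinal first, with the active controller moved to the end."""
    if controller_ordinal not in pod_ordinals:
        raise ValueError("controller pod is not part of the StatefulSet")
    others = sorted(
        (p for p in pod_ordinals if p != controller_ordinal), reverse=True
    )
    return others + [controller_ordinal]


def rolling_upgrade(
    pod_ordinals: List[int],
    controller_ordinal: int,
    delete_pod: Callable[[int], None],   # hypothetical: deletes one pod
    urp_count: Callable[[], int],        # hypothetical: current URP count
    poll_seconds: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[int]:
    """Delete pods one at a time, waiting for URP=0 before moving on."""
    upgraded = []
    for ordinal in upgrade_order(pod_ordinals, controller_ordinal):
        delete_pod(ordinal)
        # Block until the restarted broker has caught up on all partitions.
        while urp_count() != 0:
            sleep(poll_seconds)
        upgraded.append(ordinal)
    return upgraded


if __name__ == "__main__":
    # Dry run with stand-in callbacks; no cluster is touched.
    order = rolling_upgrade(
        [0, 1, 2], 1, lambda o: print(f"delete kafka-{o}"), lambda: 0
    )
    print(order)
```

A production script would likely also wait for the replacement pod to report ready before trusting a URP reading of zero, since the metric can lag behind the pod deletion.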

11. Kafka Deployment and Rollback Strategy in OpenShift: Rollback Strategy
Deployment strategy: onDelete
Upgrade in progress: identify the Active Controller pod (upgrade it last), then repeat for every pod except the Active Controller, starting from the last one:
1. Delete the pod
2. Wait till URP=0
3. Once URP=0, delete the next pod
If one of the pods fails to restart:
Revert the StatefulSet to the previous version.
Repeat for all upgraded pods in the reverse order of upgrade, starting with the pod that failed to restart:
1. Delete the pod
2. Wait till URP=0
3. Once URP=0, delete the previous pod
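
The rollback ordering can be made concrete with a short helper. This is a sketch under the assumption that the upgrade script has recorded which pods it already upgraded, in order; the function name and its inputs are hypothetical. After the StatefulSet is reverted, each pod in the returned list would be deleted in turn, waiting for URP=0 between deletions as on the slide.

```python
from typing import List


def rollback_order(upgraded: List[int], failed: int) -> List[int]:
    """Pods to delete after reverting the StatefulSet.

    upgraded: ordinals already upgraded successfully, in upgrade order.
    failed:   ordinal of the pod that failed to restart.
    """
    if failed in upgraded:
        raise ValueError("failed pod cannot also be listed as upgraded")
    # Start with the failed pod, then undo the upgrades newest-first.
    return [failed] + list(reversed(upgraded))


if __name__ == "__main__":
    # Pods 3 and 2 were upgraded, then pod 1 failed to restart.
    print(rollback_order([3, 2], failed=1))
```

Pods that were never touched by the upgrade are still on the previous version and do not appear in the list.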

12. Multi-Cluster Replication Design and Failover Strategy
Replicator replicates:
• Topics
• Messages
• Consumer groups

13. Multi-Cluster Replication Design and Failover Strategy
During failover:
• Flip to the bootstrap URL of the secondary cluster
• Stop the Replicator

14. Multi-Cluster Replication Design and Failover Strategy: Important to Enable Failover
Monitor Replicator:
• Connectors are running
• Provision sufficient tasks
• No replication lag
Centralized Schema Registry
Enable Timestamp Interceptors:
• Allows subscription to continue in the secondary cluster where it left off in the primary cluster
• Consumer groups in the secondary cluster are created by the Replicator
Provision ACLs for producers and consumers in the secondary cluster
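
The three Replicator monitoring checks can be sketched as a single function over a connector status payload. This is an assumption-laden sketch rather than the presenters' monitoring: the `status` dict follows the shape returned by the Kafka Connect REST status endpoint (`connector.state` plus a `tasks` list), while `min_tasks`, `lag_by_partition`, and the function name are hypothetical inputs that a real check would fill from its own configuration and lag metrics.

```python
from typing import Any, Dict, List


def replicator_problems(
    status: Dict[str, Any],
    min_tasks: int,
    lag_by_partition: Dict[str, int],
    max_lag: int = 0,
) -> List[str]:
    """Return a list of human-readable problems; empty means healthy."""
    problems = []

    # Check 1: the connector itself is running.
    if status.get("connector", {}).get("state") != "RUNNING":
        problems.append("connector is not RUNNING")

    # Check 2: enough tasks are provisioned, and all of them are running.
    tasks = status.get("tasks", [])
    not_running = [t for t in tasks if t.get("state") != "RUNNING"]
    if not_running:
        problems.append(f"{len(not_running)} task(s) not RUNNING")
    if len(tasks) < min_tasks:
        problems.append(f"only {len(tasks)} task(s), expected {min_tasks}")

    # Check 3: no partition is lagging behind the primary cluster.
    lagging = sorted(p for p, lag in lag_by_partition.items() if lag > max_lag)
    if lagging:
        problems.append("replication lag on: " + ", ".join(lagging))

    return problems
```

An empty result would mean the secondary cluster is current enough for the failover steps on the previous slide; any entry in the list is a reason to investigate before relying on it.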

15. Thank You
