A Tale of Two Data Centers: Kafka Streams Resiliency (Anna McDonald, Confluent) Kafka Summit 2020


Running applications across two data centers is a requirement in many industries. Deploying and architecting a Kafka Streams application for multiple data centers can seem daunting for both developers and operators, and both stretch clusters and replication present unique challenges. This talk goes over best practices and answers questions such as: Should I replicate internal topics? What are the implications of exactly-once semantics? Do I need to run active/active or active/passive? How do I minimize recovery time after a failure? We'll discuss important issues for stretch clusters, such as rack/DC placement of internal topic partitions, state store gotchas, and common latency vs. throughput trade-offs. The patterns presented will enable you to confidently design and run resilient Kafka Streams applications.

1
@jbfletch_
Kafka Streams Resiliency
Anna McDonald
Twitter: @jbfletch_ <- fully committed to the underscore
GitHub: jbfletch

4
Why do I Care About Resiliency?
● Regulations, Industry Standards (BCBS 239)
● Sleep
● Profit

8
Three Steps to Follow
1. Define your resiliency requirements
2. Implement your infrastructure to support those resilience requirements
3. Equip your Kafka Streams application to support the infrastructure design you chose

9
How Resilient do you need to be?

12
RTO & RPO
● Recovery Time Objective (RTO): How long can I afford to be down?
● Recovery Point Objective (RPO): How much can I miss while I am down?
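
One way to make the two objectives concrete is to compare them against a measured outage. The sketch below is illustrative only: the class name, method names, and all figures are assumptions chosen for the example, not values from the talk.

```java
import java.time.Duration;

// Hypothetical helper: checks a measured outage against RTO and RPO targets.
final class RecoveryObjectives {
    private final Duration rto; // longest acceptable downtime
    private final Duration rpo; // longest acceptable window of missed data

    RecoveryObjectives(Duration rto, Duration rpo) {
        this.rto = rto;
        this.rpo = rpo;
    }

    // True when service was restored within the RTO.
    boolean meetsRto(Duration downtime) {
        return downtime.compareTo(rto) <= 0;
    }

    // True when the window of missed data is within the RPO.
    boolean meetsRpo(Duration missedData) {
        return missedData.compareTo(rpo) <= 0;
    }

    public static void main(String[] args) {
        // Assumed targets: back up within 15 minutes, at most 30 seconds of data missed.
        RecoveryObjectives targets =
            new RecoveryObjectives(Duration.ofMinutes(15), Duration.ofSeconds(30));

        // Assumed outage: 10 minutes of downtime, with 2 minutes of data missed.
        System.out.println("RTO met: " + targets.meetsRto(Duration.ofMinutes(10)));
        System.out.println("RPO met: " + targets.meetsRpo(Duration.ofMinutes(2)));
    }
}
```

With these assumed numbers the outage stays inside the RTO but misses the RPO, which is the kind of gap that should drive the infrastructure choice.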

13
What Infrastructure Options do we Have?
● Replication
● Stretch

15
Replication and Kafka Streams: Active/Passive
Pros
● Independent Clusters
● Potential for Less Produce Latency
Cons
● No EOS (exactly-once semantics)
● Manual Failover
● Lag Possible
● Internal KStreams Topics Not Replicated

16
Why Can’t I Replicate Internal Topics?
● Changelogs and output topics may be out of sync with each other since they are replicated asynchronously.
● In addition, upstream changelogs may lag behind downstream, resulting in an unexpected and altered application state.
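
The out-of-sync problem can be illustrated with a toy model. This is not Kafka code: the topics are plain lists, the "mirror" is a prefix copy, and the lag figures are assumptions picked to show the effect. A counting application writes every new count to both a changelog topic and an output topic, and each topic is mirrored independently.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model (not Kafka code) of two topics mirrored asynchronously.
final class AsyncReplicationDemo {

    // Returns the first n records of a topic: what the mirror has copied so far.
    static List<Integer> replicatedPrefix(List<Integer> topic, int n) {
        return new ArrayList<>(topic.subList(0, Math.min(n, topic.size())));
    }

    // Last record in a topic, or 0 when the topic is empty.
    static int lastValue(List<Integer> topic) {
        return topic.isEmpty() ? 0 : topic.get(topic.size() - 1);
    }

    public static void main(String[] args) {
        List<Integer> changelog = new ArrayList<>();
        List<Integer> output = new ArrayList<>();

        // The primary site processes five events; each updates state and emits the count.
        for (int count = 1; count <= 5; count++) {
            changelog.add(count);
            output.add(count);
        }

        // Assumed lag at failover: the changelog mirror is two records behind the output mirror.
        List<Integer> passiveChangelog = replicatedPrefix(changelog, 3);
        List<Integer> passiveOutput = replicatedPrefix(output, 5);

        int restoredState = lastValue(passiveChangelog);
        int lastEmitted = lastValue(passiveOutput);

        System.out.println("State restored on passive site: " + restoredState);
        System.out.println("Last count already emitted:     " + lastEmitted);
        System.out.println("Next count the app would emit:  " + (restoredState + 1));
    }
}
```

In this model the passive site restores a count of 3 even though consumers of the output topic have already seen 5, so the next record it emits goes backwards to 4.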


20
Stretch Clusters and Kafka Streams
Pros
● Preserves offsets
● #onecluster
● Recovery can be automatic
● Exactly once semantics are possible
Cons
● Produce Latency
● No perfect answer for recovery automation in a 2.5 DC setup


23
Kafka Streams Application Settings

Parameter Name      | Corresponding Client | Default value | Consider setting to
--------------------|----------------------|---------------|-----------------------------------------------------------
acks                | Producer             | acks=1        | acks=all
replication.factor  | Streams              | 1             | 3 for non-stretch; -1 for stretch, to inherit broker defaults
min.insync.replicas | Broker               | 1             | 2
retries             | Streams              | 0             | Integer.MAX_VALUE
delivery.timeout.ms | Producer             | 120000        | Integer.MAX_VALUE
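
The client-side rows of the table could be applied roughly as follows. This is a minimal sketch that uses plain string keys so it runs without Kafka on the classpath; in a real application the same properties would be handed to Kafka Streams, where the "producer." prefix routes a setting to the embedded producer. The class name, method name, application id, and bootstrap address are placeholders. min.insync.replicas is left out because the table lists it as a broker setting.

```java
import java.util.Properties;

// Sketch only: builds the client-side settings from the table as plain properties.
final class ResilientStreamsSettings {

    // stretch = true  -> replication.factor -1 (inherit the broker default)
    // stretch = false -> replication.factor 3
    static Properties build(boolean stretch) {
        Properties props = new Properties();

        // Placeholder identity and connection values (assumptions for the example).
        props.setProperty("application.id", "resilient-streams-app");
        props.setProperty("bootstrap.servers", "broker-1:9092");

        // Producer rows: the "producer." prefix targets the producer embedded in Streams.
        props.setProperty("producer.acks", "all");
        props.setProperty("producer.delivery.timeout.ms", String.valueOf(Integer.MAX_VALUE));

        // Streams rows.
        props.setProperty("replication.factor", stretch ? "-1" : "3");
        props.setProperty("retries", String.valueOf(Integer.MAX_VALUE));

        // min.insync.replicas=2 is set on the brokers, not in this client config.
        return props;
    }

    public static void main(String[] args) {
        Properties stretch = build(true);
        System.out.println("replication.factor (stretch) = " + stretch.getProperty("replication.factor"));
        System.out.println("producer.acks = " + stretch.getProperty("producer.acks"));
    }
}
```

Keeping the stretch and non-stretch cases behind one flag makes the only difference between them, the replication factor, explicit.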


UiPathCommunity

💥 Speed, accuracy, and scaling – discover the superpowers of GenAI in action with UiPath Document Understanding and Communications Mining™: See how to accelerate model training and optimize model performance with active learning Learn about the latest enhancements to out-of-the-box document processing – with little to no training required Get an exclusive demo of the new family of UiPath LLMs – GenAI models specialized for processing different types of documents and messages This is a hands-on session specifically designed for automation developers and AI enthusiasts seeking to enhance their knowledge in leveraging the latest intelligent document processing capabilities offered by UiPath. Speakers: 👨‍🏫 Andras Palfi, Senior Product Manager, UiPath 👩‍🏫 Lenka Dulovicova, Product Program Manager, UiPath

Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024

Tobias Schneck

As AI technology is pushing into IT I was wondering myself, as an “infrastructure container kubernetes guy”, how get this fancy AI technology get managed from an infrastructure operational view? Is it possible to apply our lovely cloud native principals as well? What benefit’s both technologies could bring to each other? Let me take this questions and provide you a short journey through existing deployment models and use cases for AI software. On practical examples, we discuss what cloud/on-premise strategy we may need for applying it to our own infrastructure to get it to work from an enterprise perspective. I want to give an overview about infrastructure requirements and technologies, what could be beneficial or limiting your AI use cases in an enterprise environment. An interactive Demo will give you some insides, what approaches I got already working for real.

Generating a custom Ruby SDK for your web service or Rails API using Smithy

g2nightmarescribd

Have you ever wanted a Ruby client API to communicate with your web service? Smithy is a protocol-agnostic language for defining services and SDKs. Smithy Ruby is an implementation of Smithy that generates a Ruby SDK using a Smithy model. In this talk, we will explore Smithy and Smithy Ruby to learn how to generate custom feature-rich SDKs that can communicate with any web service, such as a Rails JSON API.

FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf

FIDO Alliance

GraphRAG is All You need? LLM & Knowledge Graph

Guy Korland

Guy Korland, CEO and Co-founder of FalkorDB, will review two articles on the integration of language models with knowledge graphs. 1. Unifying Large Language Models and Knowledge Graphs: A Roadmap. https://arxiv.org/abs/2306.08302 2. Microsoft Research's GraphRAG paper and a review paper on various uses of knowledge graphs: https://www.microsoft.com/en-us/research/blog/graphrag-unlocking-llm-discovery-on-narrative-private-data/

Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality

Inflectra

In this insightful webinar, Inflectra explores how artificial intelligence (AI) is transforming software development and testing. Discover how AI-powered tools are revolutionizing every stage of the software development lifecycle (SDLC), from design and prototyping to testing, deployment, and monitoring. Learn about: • The Future of Testing: How AI is shifting testing towards verification, analysis, and higher-level skills, while reducing repetitive tasks. • Test Automation: How AI-powered test case generation, optimization, and self-healing tests are making testing more efficient and effective. • Visual Testing: Explore the emerging capabilities of AI in visual testing and how it's set to revolutionize UI verification. • Inflectra's AI Solutions: See demonstrations of Inflectra's cutting-edge AI tools like the ChatGPT plugin and Azure Open AI platform, designed to streamline your testing process. Whether you're a developer, tester, or QA professional, this webinar will give you valuable insights into how AI is shaping the future of software delivery.

PCI PIN Basics Webinar from the Controlcase Team

ControlCase

DevOps and Testing slides at DASA Connect

Kari Kakkonen

Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...

Thierry Lestable

Transcript: Selling digital books in 2024: Insights from industry leaders - T...

BookNet Canada

The publishing industry has been selling digital audiobooks and ebooks for over a decade and has found its groove. What’s changed? What has stayed the same? Where do we go from here? Join a group of leading sales peers from across the industry for a conversation about the lessons learned since the popularization of digital books, best practices, digital book supply chain management, and more. Link to video recording: https://bnctechforum.ca/sessions/selling-digital-books-in-2024-insights-from-industry-leaders/ Presented by BookNet Canada on May 28, 2024, with support from the Department of Canadian Heritage.

GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...

James Anderson

Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. Constant focus on speed to release software to market, along with the traditional slow and manual security checks has caused gaps in continuous security as an important piece in the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their applications supply chain and the lack of end-to-end governance and risk management. The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM). Speakers: Bob Boule Robert Boule is a technology enthusiast with PASSION for technology and making things work along with a knack for helping others understand how things work. He comes with around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations in CI/CD and application security integrated in software delivery lifecycle. Gopinath Rebala Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. 
Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.

Assuring Contact Center Experiences for Your Customers With ThousandEyes

ThousandEyes

LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...

DanBrown980551

Do you want to learn how to model and simulate an electrical network from scratch in under an hour? Then welcome to this PowSyBl workshop, hosted by Rte, the French Transmission System Operator (TSO)! During the webinar, you will discover the PowSyBl ecosystem as well as handle and study an electrical network through an interactive Python notebook. PowSyBl is an open source project hosted by LF Energy, which offers a comprehensive set of features for electrical grid modelling and simulation. Among other advanced features, PowSyBl provides: - A fully editable and extendable library for grid component modelling; - Visualization tools to display your network; - Grid simulation tools, such as power flows, security analyses (with or without remedial actions) and sensitivity analyses; The framework is mostly written in Java, with a Python binding so that Python developers can access PowSyBl functionalities as well. What you will learn during the webinar: - For beginners: discover PowSyBl's functionalities through a quick general presentation and the notebook, without needing any expert coding skills; - For advanced developers: master the skills to efficiently apply PowSyBl functionalities to your real-world scenarios.

FIDO Alliance Osaka Seminar: Overview.pdf

FIDO Alliance

When stars align: studies in data quality, knowledge graphs, and machine lear...

Elena Simperl

Monitoring Java Application Security with JDK Tools and JFR Events

Ana-Maria Mihalceanu

Recently uploaded (20)

Epistemic Interaction - tuning interfaces to provide information for AI support

Knowledge engineering: from people to machines and back

Elevating Tactical DDD Patterns Through Object Calisthenics

From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...

Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...

Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024

Generating a custom Ruby SDK for your web service or Rails API using Smithy

FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf

GraphRAG is All You need? LLM & Knowledge Graph

Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality

PCI PIN Basics Webinar from the Controlcase Team

DevOps and Testing slides at DASA Connect

Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...

Transcript: Selling digital books in 2024: Insights from industry leaders - T...

GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...

Assuring Contact Center Experiences for Your Customers With ThousandEyes

LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...

FIDO Alliance Osaka Seminar: Overview.pdf

When stars align: studies in data quality, knowledge graphs, and machine lear...

Monitoring Java Application Security with JDK Tools and JFR Events

A Tale of Two Data Centers: Kafka Streams Resiliency (Anna McDonald, Confluent) Kafka Summit 2020

1. Kafka Streams Resiliency. Anna McDonald: Twitter: @jbfletch_ <- fully committed to the underscore; GitHub: jbfletch

2. Why do I Care About Resiliency? ● Regulations, Industry Standards (BCBS 239)

3. Why do I Care About Resiliency? ● Regulations, Industry Standards (BCBS 239) ● Sleep

4. Why do I Care About Resiliency? ● Regulations, Industry Standards (BCBS 239) ● Sleep ● Profit

5. A Path to Resiliency

6. Three Steps to Follow: 1. Define your resiliency requirements

7. Three Steps to Follow: 1. Define your resiliency requirements 2. Implement your infrastructure to support those resilience requirements

8. Three Steps to Follow: 1. Define your resiliency requirements 2. Implement your infrastructure to support those resilience requirements 3. Equip your Kafka Streams application to support the infrastructure design you chose

9. How Resilient do you need to be?

10. Recovery Time Objective

11. Recovery Point Objective

12. RTO & RPO ● Recovery Time Objective (RTO): How long can I afford to be down? ● Recovery Point Objective (RPO): How much can I miss while I am down?

13. What Infrastructure Options do we Have? ● Replication ● Stretch

14. Replication and Kafka Streams: Active/Passive. Pros: ● Independent Clusters ● Potential for Less Produce Latency

15. Replication and Kafka Streams: Active/Passive. Pros: ● Independent Clusters ● Potential for Less Produce Latency. Cons: ● No EOS ● Manual Failover ● Lag Possible ● Internal KStreams Topics Not Replicated

16. Why Can’t I Replicate Internal Topics? ● Changelogs and output topics may be out of sync with each other since they are replicated asynchronously. ● In addition, upstream changelogs may lag behind downstream, resulting in an unexpected and altered application state.

17. Replication and Kafka Streams: Active/Passive

18. Replication and Kafka Streams: Active/Passive

19. Stretch Clusters and Kafka Streams. Pros: ● Preserves offsets ● #onecluster ● Recovery can be automatic ● Exactly once semantics are possible

20. Stretch Clusters and Kafka Streams. Pros: ● Preserves offsets ● #onecluster ● Recovery can be automatic ● Exactly once semantics are possible. Cons: ● Produce Latency ● No perfect answer for recovery automation in a 2.5 DC set up

21. Stretch Clusters and Kafka Streams

22. Stretch Clusters and Kafka Streams

23. Kafka Streams Application Settings

Parameter Name | Corresponding Client | Default value | Consider setting to
acks | Producer | acks=1 | acks=all
replication.factor | Streams | 1 | 3 for non-stretch; -1 to inherit broker defaults for stretch
min.insync.replicas | Broker | 1 | 2
retries | Streams | 0 | Integer.MAX_VALUE
delivery.timeout.ms | Producer | 120000 | Integer.MAX_VALUE
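
The settings on slide 23 could be collected into a Kafka Streams configuration roughly as follows. This is a minimal sketch, not the speaker's code: it uses plain `java.util.Properties` with string keys so that it runs without the Kafka client libraries, and the class name `ResilientStreamsSettings` and the `stretchCluster` flag are hypothetical. The `producer.` and `topic.` prefixes are how Kafka Streams routes settings to its embedded producer and to the internal topics it creates; note that the slide lists `min.insync.replicas` as a broker setting, so applying it through the `topic.` prefix is an assumption made here. Required settings such as `application.id` and `bootstrap.servers` are left out.

```java
import java.util.Properties;

public class ResilientStreamsSettings {

    // Builds the "consider setting to" values from slide 23 as string properties.
    // stretchCluster is a hypothetical flag: true for a stretch cluster,
    // false for independent, replicated clusters.
    static Properties build(boolean stretchCluster) {
        Properties props = new Properties();

        // Producer setting: wait for all in-sync replicas to acknowledge a write.
        props.put("producer.acks", "all");

        // Streams setting: replication factor for internal topics.
        // -1 inherits the broker default, which the slide suggests for stretch clusters.
        props.put("replication.factor", stretchCluster ? "-1" : "3");

        // Assumption: set min.insync.replicas on internal topics via the "topic." prefix.
        // The slide names this as a broker setting, so it may belong in broker config instead.
        props.put("topic.min.insync.replicas", "2");

        // Streams setting: keep retrying instead of failing on transient errors.
        props.put("retries", String.valueOf(Integer.MAX_VALUE));

        // Producer setting: upper bound on the time to report success or failure of a send.
        props.put("producer.delivery.timeout.ms", String.valueOf(Integer.MAX_VALUE));

        return props;
    }

    public static void main(String[] args) {
        // Print the stretch-cluster variant for inspection.
        build(true).forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

In a real application these properties would be passed to the `KafkaStreams` constructor together with the topology. Whether `-1` is accepted for `replication.factor` depends on the broker version, so it is worth checking against the cluster in use.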

