Apache Kafka's Streams API lets us process messages from different topics with very low latency. Messages may have different formats and schemas, and may even be serialised in different ways. What happens when an undesirable message enters the flow? When an error occurs, real-time applications can't always wait for manual recovery and need to handle such failures. Kafka Streams offers a few techniques, such as sentinel values and dead letter queues; in this talk we'll see how. The talk gives an overview of the patterns and tools available in the Streams DSL API for dealing with corrupted messages. Based on a real-life use case, it also includes valuable lessons from building and running Kafka Streams projects in production, along with live coding and demonstrations.
Python's "batteries included" philosophy means that it comes with an astonishing amount of great stuff. On top of that, there's a vibrant world of third-party libraries that help make Python even more wonderful. We'll go on a breezy, example-filled tour through some of my favorites, from treasures in the standard library to great third-party packages that I don't think I could live without, and we'll touch on some of the fuzzier aspects of the Python culture that make it such a joy to be part of.
The document reports 171 errors found in the CSS file for a website with the URL https://www.softwares.guru. It lists each error found, including file not found errors, property errors, parse errors, value errors, and invalid selector errors. Many of the errors occur in CSS rules defining styles for buttons, gradients, shadows, and a photo viewer component.
FooCodeChu - Services for Software Analysis, Malware Detection, and Vulnerabi...Silvio Cesare
Bugwise is a tool that detects bugs in binaries using decompilation and data flow analysis. It detects issues like use-after-free bugs, double free bugs, and unsafe calls to getenv(). It has scanned over 123,000 Debian binaries and reported 85 getenv() related bugs across 47 packages. The probability of a binary having a vulnerability is 0.00067, and the probability of a package having at least one vulnerable binary is 0.00255. Bugwise is based on strong theoretical underpinnings like data flow analysis and is extensible to detect more bug classes. The presenter aims to make more of their research public and get more people using their tools via their website.
The document discusses using presenters to organize controller logic for views. It shows moving logic for retrieving follower tweets, recent tweets, trends, and other data from the TweetsController index action into a Tweets::IndexPresenter class. The presenter initializes with the current user and handles data retrieval to clean up the controller.
Here are a few ways the UsersController could be refactored to better follow the Interface Segregation Principle:
1. Extract authentication/authorization logic into a separate AuthenticationController concern.
2. Extract user profile/account management logic into an AccountsController.
3. Extract activation/registration logic into a separate RegistrationController.
4. Create separate interfaces/controllers for different user roles like AdminUsersController vs RegularUsersController.
This avoids forcing all user-related actions onto a single controller, allowing each controller to focus on specific user workflows and responsibilities. Clients like regular users and admins would interact with specialized interfaces rather than depending on a monolithic UsersController.
Migrations allow you to define and manage changes to your database schema over time. The document discusses ActiveRecord migrations, which provide a way to iteratively improve the database schema by adding, removing, and changing tables and columns. It also covers generating and rolling back migrations, common migration methods like create_table and add_column, and using migrations to support models and testing.
The document discusses several ways to implement parallel features or versions including:
1. Substitution - Implementing a feature by substituting one codebase for another. For example, substituting a new hypervisor driver implementation.
2. Feature toggles - Implementing features that can be toggled on or off via configuration to gradually roll out changes.
3. Feature versioning - Implementing different versions of a feature or component side by side to support multiple versions simultaneously via an abstraction layer.
The document provides examples of each approach including substituting queue backends and implementing different network configuration versions via an abstraction layer. Overall the techniques allow adding new features or versions in parallel to existing codebases.
Anatoly Sharifulin presents on developing apps using Perl. He discusses creating an app called DLTTR that allows users to delete tweets in bulk using asynchronous queues and APIs. The app was built with Mojolicious, uses a server API, and stores data in MySQL. It has been successful with over 1 million tweets deleted and thousands of users. The talk highlights how Perl helped enable the creation of this cross-platform app that deletes tweets quickly and appropriately.
A survey of Ferrie\’s Virus Bulletin series on anti-unpacking techniques and an examination of these techniques (or lack) in prevalent malware families.
Presented at Virus Bulletin 2009.
http://www.virusbtn.com
A race condition is a logical vulnerability that can occur when multiple threads access and attempt to modify shared data simultaneously. It is easy to miss in source code but can lead to critical vulnerabilities, especially in web applications. A race condition occurs when thread execution is not synchronized properly and results may depend on the particular order in which threads execute. Some examples of race conditions in web applications include temporary files being deleted before being fully processed and concurrent funds transfers that could result in inaccurate account balances. Proper locking and synchronization of shared resources is needed to avoid race conditions in multithreaded applications.
The document discusses using model checking for efficient malware detection. It outlines some limitations of traditional anti-virus techniques like signature matching and code emulation. Model checking is proposed as an alternative approach that can check a program's behavior without executing it. Pushdown systems are identified as a suitable formalism for modeling programs since they can analyze a program's stack, which is important for malware detection. A temporal logic called CTPL is discussed for specifying malicious behaviors, but it is noted that CTPL cannot fully describe stacks. The solution proposed is to consider predicates over stacks when specifying malicious behaviors.
Introduction to R Short course Fall 2016Spencer Fox
The document provides instructions for an introductory R session, including downloading materials from a GitHub repository and opening an R project file. It outlines logging in, downloading an R project folder containing intro materials, and opening the project file in RStudio.
This document discusses advanced Java debugging using bytecode. It explains that bytecode is the low-level representation of Java programs that is executed by the Java Virtual Machine (JVM). It shows examples of decompiling Java source code to bytecode instructions and evaluating bytecode on a stack. Various bytecode visualization and debugging tools are demonstrated. Key topics like object-oriented aspects of bytecode and the ".class" file format are also covered at a high-level.
The document provides an introduction to Python and Zope, describing Python as a high-level and dynamic programming language well-suited for rapid application development, and Zope as a web application server framework built using Python that provides capabilities like a web server, database engine, search engine, and templating languages to help develop web applications. It discusses key features of Python like its interactive interpreter, data types, classes, inheritance, operator overloading, and containers, as well as components of Zope like its ZServer web server, ZODB database engine, ZCatalog search engine, DTML and ZPT templating languages, and typical file system layout.
This document discusses database migrations in Ruby on Rails. It explains that migrations allow changing the database schema over time by providing a way to modify the database structure without directly editing SQL. Migrations have up and down methods that allow changing the database structure and reverting changes. The rake tasks like rake db:migrate apply pending migrations to update the database schema.
Código Saudável => Programador Feliz - Rs on Rails 2010Plataformatec
Palestra do Rs On Rails, na qual demos algumas dicas de boas práticas para manter seu código mais limpo e ter absoluto controle da sua aplicação em produção.
CCM AlchemyAPI and Real-time AggregationVictor Anjos
An exploratory look into KairosDB (OpenTSDB) connected to Cassandra (CCM) and using AlchemyAPI for entity, topic and sentiment extraction.
Sprinkled in is a bit of Data Modeling, Truth Tables, Primary Keys, Partition Keys and Cluster Keys.
All written in Python!
Streaming Apps and Poison Pills: handle the unexpected with Kafka Streams (Loïc Divad, Xebia France), Kafka Summit SF 2019
4. > println(sommaire)
Incoming records may be corrupted, or simply cannot be handled by the serializer/deserializer. These records are referred to as "poison pills".
1. Log and Crash
2. Skip the Corrupted
3. Sentinel Value Pattern
4. Dead Letter Queue Pattern
12. Exercise #1 - breakfast
Really old systems receive raw bytes directly from message queues. With Kafka (Connect and Streams) we'd like to continuously transform these messages. But we need a deserializer with a special decoder to understand each event. What happens if we get a buggy implementation of the deserializer?
[Diagram: raw bytes (10100110111010101) flowing from legacy message queues into Kafka Brokers, then through Kafka Connect and Kafka Streams]
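As a concrete illustration, here is a minimal sketch of such a value deserializer in Scala. The FoodOrder type and the UTF-8 decoding are assumptions for the example only; the talk's real deserializer decodes a binary "breakfast" format with scodec, as the stack trace below shows.

import java.util
import java.nio.charset.StandardCharsets
import org.apache.kafka.common.serialization.Deserializer

// Hypothetical event type for this sketch; the real deck decodes binary events with scodec
case class FoodOrder(raw: String)

class FoodOrderDeserializer extends Deserializer[FoodOrder] {
  override def configure(configs: util.Map[String, _], isKey: Boolean): Unit = ()

  // Any exception thrown here turns the record into a poison pill for the application
  override def deserialize(topic: String, data: Array[Byte]): FoodOrder =
    if (data == null) null else FoodOrder(new String(data, StandardCharsets.UTF_8))

  override def close(): Unit = ()
}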
19. Log and Crash
2019-04-17 03:43:12 macbook-de-lolo [ERROR] (LogAndFailExceptionHandler.java:39) - Exception caught during
Deserialization, taskId: 0_0, topic: input-food-order, partition: 0, offset: 109
Exception in thread "answer-one-breakfast-0d808ce7-0ef1-44c6-808a-f594bc7fceae-StreamThread-1"
org.apache.kafka.streams.errors.StreamsException: Deserialization exception handler is set to fail upon a
deserialization error. If you would rather have the streaming pipeline continue after a deserialization error, please
set the default.deserialization.exception.handler appropriately.
at org.apache.kafka.streams.processor.internals.RecordDeserializer.deserialize(RecordDeserializer.java:80)
at org.apache.kafka.streams.processor.internals.RecordQueue.addRawRecords(RecordQueue.java:101)
at org.apache.kafka.streams.processor.internals.PartitionGroup.addRawRecords(PartitionGroup.java:124)
...
at org.apache.kafka.streams.processor.internals.StreamTask.addRecords(StreamTask.java:711)
at org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:747)
Caused by: java.lang.IllegalArgumentException: dishes: Insufficient number of elements: decoded 0 but should have
decoded 268435712
at scodec.Attempt$Failure.require(Attempt.scala:108)
at fr.xebia.ldi.ratatouille.serde.BreakfastDeserializer.deserialize(BreakfastDeserializer.scala:22)
at fr.xebia.ldi.ratatouille.serde.BreakfastDeserializer.deserialize(BreakfastDeserializer.scala:15)
at org.apache.kafka.common.serialization.Deserializer.deserialize(Deserializer.java:58)
at fr.xebia.ldi.ratatouille.serde.BreakfastDeserializer.deserialize(BreakfastDeserializer.scala:15)
at org.apache.kafka.streams.processor.internals.SourceNode.deserializeValue(SourceNode.java:60)
at org.apache.kafka.streams.processor.internals.RecordDeserializer.deserialize(RecordDeserializer.java:66)
24.
Don't:
▼ Change consumer group
▼ Manually update my offsets
▼ Reset my streaming app and set my auto reset to LATEST
  ▽ $ kafka-streams-application-reset ...
▼ Destroy the topic, no message = no poison pill
  ▽ $ kafka-topics --delete --topic ...
▼ My favourite <3
  ▽ $ confluent destroy && confluent start
Do:
▼ Fill an issue and suggest a fix to the tooling team
26. Log and Crash
Like all consumers, Kafka Streams applications deserialize messages from the broker. The deserialization process can fail. It raises an exception that cannot be caught by our code. Buggy deserializers have to be fixed before the application restarts, by default ...
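This default corresponds to the LogAndFailExceptionHandler. A minimal configuration sketch (the property constants are standard Kafka Streams settings; the application id and broker address here are illustrative):

import java.util.Properties
import org.apache.kafka.streams.StreamsConfig
import org.apache.kafka.streams.errors.LogAndFailExceptionHandler

val props = new Properties()
props.put(StreamsConfig.APPLICATION_ID_CONFIG, "answer-one-breakfast")
props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092")
// Already the default: a deserialization error is logged and the stream thread dies
props.put(StreamsConfig.DEFAULT_DESERIALIZATION_EXCEPTION_HANDLER_CLASS_CONFIG,
  classOf[LogAndFailExceptionHandler].getName)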
31. Skip the Corrupted
The same failure as before, but notice that the error message itself points to the way out:
org.apache.kafka.streams.errors.StreamsException: Deserialization exception handler is set to fail upon a deserialization error. If you would rather have the streaming pipeline continue after a deserialization error, please set the default.deserialization.exception.handler appropriately.
33.
public class LogAndFailExceptionHandler implements DeserializationExceptionHandler { /* ... */ }

public class LogAndContinueExceptionHandler implements DeserializationExceptionHandler { /* ... */ }

public interface DeserializationExceptionHandler extends Configurable {
    DeserializationHandlerResponse handle(final ProcessorContext context,
                                          final ConsumerRecord<byte[], byte[]> record,
                                          final Exception exception);

    enum DeserializationHandlerResponse {
        CONTINUE(0, "CONTINUE"),
        FAIL(1, "FAIL");
        /* ... */
    }
}
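Switching to the skip-the-corrupted behaviour is a one-line configuration change; a sketch, reusing the props object from the earlier snippet:

import org.apache.kafka.streams.StreamsConfig
import org.apache.kafka.streams.errors.LogAndContinueExceptionHandler

// Skip the corrupted record and keep the stream thread alive
props.put(StreamsConfig.DEFAULT_DESERIALIZATION_EXCEPTION_HANDLER_CLASS_CONFIG,
  classOf[LogAndContinueExceptionHandler].getName)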
40. Skip the Corrupted
All exceptions thrown by deserializers are caught by a DeserializationExceptionHandler. A handler returns Fail or Continue. You can implement your own handler. But the two handlers provided by the library are really basic... let's explore other methods.
47. Sentinel Value Pattern
We need to turn the deserialization process into a pure transformation that cannot crash. To do so, we replace each corrupted message with a sentinel value: a special-purpose record (e.g. null, None, Json.Null, etc.). This allows downstream processors to recognize and handle such sentinel values. With Kafka Streams this can be achieved by implementing a Deserializer.
[Diagram: the partial mapping f: G → H is made total by sending corrupted inputs to the sentinel value null]
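A minimal sketch of such a safe deserializer, written as a generic wrapper; the class name and the choice of null as the sentinel are illustrative, not the speaker's exact code:

import java.util
import org.apache.kafka.common.serialization.Deserializer

// Wraps any Deserializer and turns failures into a sentinel value (null here),
// so deserialization becomes a total function that can no longer crash the app
class SentinelValueDeserializer[T >: Null](inner: Deserializer[T]) extends Deserializer[T] {
  override def configure(configs: util.Map[String, _], isKey: Boolean): Unit =
    inner.configure(configs, isKey)

  override def deserialize(topic: String, data: Array[Byte]): T =
    try inner.deserialize(topic, data)
    catch { case _: Exception => null } // the sentinel value

  override def close(): Unit = inner.close()
}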
54. Sentinel Value Pattern
By implementing a custom serde we can create a safe Deserializer. Downstream processors now receive a sentinel value indicating a deserialization error. Errors can then be treated correctly, for example by monitoring the number of deserialization errors with a custom metric. But we lose a lot of information about the error... let's see a last method.
58. Dead Letter Queue Pattern
In this method we let the deserializer fail. For each failure we send a message to a topic containing the corrupted messages. Each message keeps the original content of the input message (for reprocessing) and carries additional metadata about the failure. With Kafka Streams this can be achieved by implementing a DeserializationExceptionHandler.
[Diagram: input topic → streaming app → output topic, with failed records routed to a dead letter queue]
62.
class DeadLetterQueueFoodExceptionHandler() extends DeserializationExceptionHandler {

  var topic: String = _
  var producer: KafkaProducer[Array[Byte], Array[Byte]] = _

  override def configure(configs: util.Map[String, _]): Unit = ???

  override def handle(context: ProcessorContext,
                      record: ConsumerRecord[Array[Byte], Array[Byte]],
                      exception: Exception): DeserializationHandlerResponse = {

    val headers = record.headers().toArray ++ Array[Header](
      new RecordHeader("processing-time", ???),
      new RecordHeader("hexa-datetime", ???),
      new RecordHeader("error-message", ???),
      ...
    )

    val producerRecord = new ProducerRecord(topic, /*same key, value and ts,*/ headers.asJava)
    producer.send(producerRecord, /* Producer Callback */ )

    DeserializationHandlerResponse.CONTINUE
  }
}
63. Fill the headers with some metadata
[Diagram: the corrupted value shown as hex (01061696e0016536f6d6500000005736f6d65206f), annotated with the fields it encodes: restaurant description, event date and time, food order category]
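Putting the previous slides together, here is a fuller sketch with the ??? filled in. The configuration keys (dead.letter.queue.topic, bootstrap.servers), the producer setup and the exact header values are assumptions for illustration, not the speaker's exact implementation:

import java.util
import java.time.Instant
import org.apache.kafka.clients.consumer.ConsumerRecord
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerConfig, ProducerRecord}
import org.apache.kafka.common.header.Header
import org.apache.kafka.common.header.internals.RecordHeader
import org.apache.kafka.common.serialization.ByteArraySerializer
import org.apache.kafka.streams.errors.DeserializationExceptionHandler
import org.apache.kafka.streams.errors.DeserializationExceptionHandler.DeserializationHandlerResponse
import org.apache.kafka.streams.processor.ProcessorContext
import scala.collection.JavaConverters._

class DeadLetterQueueFoodExceptionHandler extends DeserializationExceptionHandler {

  var topic: String = _
  var producer: KafkaProducer[Array[Byte], Array[Byte]] = _

  // Assumption: the DLQ topic and broker list come from the Streams configuration;
  // "dead.letter.queue.topic" is a made-up key, the talk does not show its configure() body
  override def configure(configs: util.Map[String, _]): Unit = {
    topic = configs.get("dead.letter.queue.topic").toString
    val props = new util.Properties()
    props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, configs.get("bootstrap.servers").toString)
    props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, classOf[ByteArraySerializer].getName)
    props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, classOf[ByteArraySerializer].getName)
    producer = new KafkaProducer[Array[Byte], Array[Byte]](props)
  }

  override def handle(context: ProcessorContext,
                      record: ConsumerRecord[Array[Byte], Array[Byte]],
                      exception: Exception): DeserializationHandlerResponse = {

    // Keep the original payload and add failure metadata, as on the slide above
    val headers: Array[Header] = record.headers().toArray ++ Array[Header](
      new RecordHeader("processing-time", Instant.now().toString.getBytes),
      new RecordHeader("error-message", String.valueOf(exception.getMessage).getBytes)
    )

    val timestamp: java.lang.Long = record.timestamp()
    val producerRecord = new ProducerRecord[Array[Byte], Array[Byte]](
      topic, null, timestamp, record.key(), record.value(), headers.toIterable.asJava)

    producer.send(producerRecord)
    DeserializationHandlerResponse.CONTINUE
  }
}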
69. Dead Letter Queue Pattern
You can provide your own implementation of DeserializationExceptionHandler. This lets you use the Producer API to write the corrupted record directly to a quarantine topic. Then you can manually analyse your corrupted records.
⚠ Warning: this approach has side effects that are invisible to the Kafka Streams runtime.
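Registering the handler is again a configuration change; a sketch, where dead.letter.queue.topic is the same made-up property that the configure() method above would read:

import java.util.Properties
import org.apache.kafka.streams.StreamsConfig

val props = new Properties()
// ... application.id, bootstrap.servers, default serdes as usual ...
props.put(StreamsConfig.DEFAULT_DESERIALIZATION_EXCEPTION_HANDLER_CLASS_CONFIG,
  classOf[DeadLetterQueueFoodExceptionHandler].getName)
// Custom key read by the handler's configure(), illustrative only
props.put("dead.letter.queue.topic", "dead-letter-queue-food-order")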
73. Related Posts
Kafka Connect Deep Dive – Error Handling and Dead Letter Queues, by Robin Moffatt
Building Reliable Reprocessing and Dead Letter Queues with Apache Kafka, by Ning Xia
Handling bad messages using Kafka's Streams API, answer by Matthias J. Sax
74. Conclusion
When using Kafka, deserialization is the responsibility of the clients. These internal errors are not easy to catch. When it's possible, use Avro + Schema Registry. When it's not possible, Kafka Streams offers techniques to deal with serde errors:
- DLQ: by extending an ExceptionHandler
- Sentinel Value: by extending a Deserializer
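As a hedged illustration of the Avro + Schema Registry recommendation, a configuration sketch assuming Confluent's kafka-streams-avro-serde artifact is on the classpath and a registry runs at the given URL:

import io.confluent.kafka.streams.serdes.avro.GenericAvroSerde
import scala.collection.JavaConverters._

// Records are Avro-encoded and validated against the Schema Registry, so
// producers cannot publish data that consumers are unable to decode
val valueSerde = new GenericAvroSerde()
valueSerde.configure(
  Map[String, AnyRef]("schema.registry.url" -> "http://localhost:8081").asJava,
  false) // isKey = false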
76. Images
Photo by rawpixel on Unsplash
Photo by João Marcelo Martins on Unsplash
Photo by Jordane Mathieu on Unsplash
Photo by Brooke Lark on Unsplash
Photo by Jakub Kapusnak on Unsplash
Photo by Melissa Walker Horn on Unsplash
Photo by Aneta Pawlik on Unsplash