Landoop presents how to simplify your ETL process using Kafka Connect for extract (E) and load (L). The talk introduces KCQL, the Kafka Connect Query Language, and how it can simplify fast-data (ingress and egress) pipelines. It shows how KCQL can be used to set up Kafka connectors for popular in-memory and analytical systems, with live demos using Hazelcast, Redis and InfluxDB; how to get started with a fast-data Docker Kafka development environment; and how to enhance existing Cloudera (Hadoop) clusters with fast-data capabilities.
Show Me Kafka Tools That Will Increase My Productivity! (Stephane Maarek, Dat... confluent
In the Apache Kafka world, there is such a great diversity of open source tools available (I counted over 50!) that it’s easy to get lost. Over the years I have dealt with Kafka, I have learned to particularly enjoy a few of them that save me a tremendous amount of time over performing manual tasks. I will be sharing my experience and doing live demos of my favorite Kafka tools, so that you too can hopefully increase your productivity and efficiency when managing and administering Kafka. Come learn about the latest and greatest tools for CLI, UI, Replication, Management, Security, Monitoring, and more!
In this presentation we describe the design and implementation of Kafka Connect, Kafka’s new tool for scalable, fault-tolerant data import and export. First we’ll discuss some existing tools in the space and why they fall short when applied to data integration at large scale. Next, we will explore Kafka Connect’s design and how it compares to systems with similar goals, discussing key design decisions that trade off between ease of use for connector developers, operational complexity, and reuse of existing connectors. Finally, we’ll discuss how standardizing on Kafka Connect can ultimately lead to simplifying your entire data pipeline, making ETL into your data warehouse and enabling stream processing applications as simple as adding another Kafka connector.
What is the State of my Kafka Streams Application? Unleashing Metrics. | Neil... HostedbyConfluent
"Just as the Apache Kafka Brokers provide JMX metrics to monitor your cluster's health, Kafka Streams provides a rich set of metrics for monitoring your application's health and performance. The metrics to observe for a given use-case of Kafka Streams will vary significantly from application to application. Learning how to build and customize monitoring of those applications will help you maintain a healthy Kafka Streams ecosystem.
Takeaways
* An analysis and overview of the provided metrics, including the new end-to-end metrics of Kafka Streams 2.7.
* See how to extract metrics from your application using existing JMX tooling.
* Walk through how to build a dashboard for observing those metrics.
* Explore options for adding additional JMX resources and Kafka Streams metrics to your application.
* How to verify you built your dashboard correctly by creating a data control set to validate your dashboard.
* Go beyond what you can collect from the Kafka Streams metrics."
Event sourcing Live 2021: Streaming App Changes to Event Store Shivji Kumar Jha
This deck was used for this talk at EventSourcing Live 2021: https://lnkd.in/gbpshVA5
In the slides, we will go over identifying, capturing and delivering app changes to event stores. The event store can then be used as a data warehouse, data lake or lakehouse. We will go over different ways to capture change data and deliver it to an event store, and the pros and cons of each.
Developing Secure Scala Applications With Fortify For Scala Lightbend
From banks to airlines to credit rating agencies, security continues to be a major focus for organizations across various industries. As the newspapers show, it’s heavily damaging to enterprises when security vulnerabilities in their code, infrastructure, or open source frameworks/libraries get exploited.
The good news is that your Scala development team now has a powerful ally for securing their applications. Co-developed by the Fortify team along with Lightbend, the upcoming Fortify for Scala Plugin is the only Static Application Security Testing (SAST) solution to use the official Scala compiler. This plugin automatically identifies code-level security vulnerabilities early in the SDLC, so you can confidently and reliably secure your mission-critical Scala-based applications.
In this webinar by Seth Tisue, Scala Committer and Senior Scala Engineer at Lightbend, and Poonam Yadav, Product Manager for Fortify at Micro Focus, you will learn about:
* Some of the more than 200 vulnerabilities that the Fortify plugin for Scala can catch and help you resolve,
* How the plugin works to analyze, identify and provide actionable recommendations,
* How to integrate it into your modern DevOps environment,
* Why this plugin was co-developed by Lightbend and the Fortify team, and how it benefits your organization’s security professionals / CISO office.
Exploring Reactive Integrations With Akka Streams, Alpakka And Apache Kafka Lightbend
Since its stable release in 2016, Akka Streams has quickly been becoming the de facto standard integration layer between various streaming systems and products. Enterprises like PayPal, Intel, Samsung and Norwegian Cruise Lines see this as a game changer in terms of designing Reactive streaming applications by connecting pipelines of back-pressured asynchronous processing stages.
This comes in part from the Reactive Streams initiative, long led by Lightbend and others, which allows multiple streaming libraries to interoperate in a performant and resilient fashion, providing back-pressure all the way. But perhaps even more, it is thanks to the various integration drivers that have sprung up in the community and the Akka team, including drivers for Apache Kafka, Apache Cassandra, Streaming HTTP, Websockets and much more.
In this webinar for JVM Architects, Konrad Malawski explores the what and why of Reactive integrations, with examples featuring technologies like Akka Streams, Apache Kafka, and Alpakka, a new community project for building Streaming connectors that seeks to “back-pressurize” traditional Apache Camel endpoints.
* An overview of Reactive Streams and what it will look like in JDK 9, and the Akka Streams API implementation for Java and Scala.
* Introduction to Alpakka, a modern, Reactive version of Apache Camel, and its growing community of Streams connectors (e.g. Akka Streams Kafka, MQTT, AMQP, Streaming HTTP/TCP/FileIO and more).
* How Akka Streams and Akka HTTP work with Websockets, HTTP and TCP, with examples in both Java and Scala.
Streaming Microservices With Akka Streams And Kafka Streams Lightbend
One of the most frequent questions that we get asked at Lightbend is “what’s the difference between Akka Streams and Kafka Streams?” After all, there is only a 1 letter difference between these two technologies, so how different could they be?
Well, as we see in this presentation, they are actually quite different. Both tools are part of the streaming Fast Data stack, but were created with entirely different technological approaches in mind. For example, Akka Streams emerged as a dataflow-centric abstraction for the Akka Actor model; it is designed for general-purpose microservices and very low-latency event processing, and it supports a wider class of application problems and third-party integrations via Alpakka. Kafka Streams, by contrast, is purpose-built for reading data from Kafka topics, processing it, and writing the results to new topics in a Kafka-centric way.
In this webinar by Dr. Dean Wampler, VP of Fast Data Engineering at Lightbend, we will:
* Discuss the strengths and weaknesses of Kafka Streams and Akka Streams for particular design needs in data-centric microservices
* Contrast them with Spark Streaming and Flink, which provide richer analytics over potentially huge data sets
* Help you map these streaming engines to your specific use cases, so you confidently pick the right ones for your jobs
What's new in Confluent 3.2 and Apache Kafka 0.10.2 confluent
With the introduction of the Connect and Streams APIs in 2016, Apache Kafka is becoming the de facto solution for anyone looking to build a streaming platform. The community continues to add capabilities to make it the complete solution for streaming data.
Join us as we review the latest additions in Apache Kafka 0.10.2. In addition, we’ll cover what’s new in Confluent Enterprise 3.2 that makes it possible to run Kafka at scale.
Nowadays Akka is a popular choice for building distributed systems - there are a lot of case studies and successful examples in the industry.
But it can still be hard to switch to actor-based systems, because most tutorials and documentation don't show how to assemble a real application using actors, especially in a microservices environment.
The actor is a powerful abstraction in message-driven environments, but it can be challenging to use familiar patterns and methodologies with it. At the same time, the message-driven nature of actors is their biggest advantage for Reactive systems and microservices.
I want to share my experience and show how Domain-Driven Design and Enterprise Integration Patterns can be leveraged to design and build fine-grained microservices with synchronous and asynchronous communication. I'll focus on the core Akka functionality, but also explain how advanced features like Akka Persistence and Akka Cluster Sharding can be used together for achieving incredible results.
(Bill Bejeck, Confluent) Kafka Summit SF 2018
Apache Kafka added a powerful stream processing library in mid-2016, Kafka Streams, which runs on top of Apache Kafka. The community has embraced Kafka Streams with many early adopters, and the adoption rate continues to grow. Large to mid-size organizations have come to rely on Kafka Streams in their production environments. Kafka Streams has many advanced features to make applications more robust.
The point of this presentation is to show users of Kafka Streams some of the latest and greatest features, as well as some more advanced ones, that can make streams applications more resilient. The target audience for this talk is users who are already comfortable writing Kafka Streams applications and want to go from writing their first proof-of-concept applications to writing robust applications that can withstand the rigor that running in a production environment demands.
The talk will be a technical deep dive covering topics like:
-Best practices on configuring a Kafka Streams application
-How to meet production SLAs by minimizing failover and recovery times: configuring standby tasks and the pros and cons of having standby replicas for local state
-How to improve resiliency and 24×7 operability: the use of different configurable error handlers, callbacks and how they can be used to see what’s going on inside the application
-How to achieve efficient scalability: a thorough review of the relationship between the number of instances, threads and state stores and how they relate to each other
While this is a technical deep dive, the talk will also present sample code so that attendees can view the concepts discussed in practice. Attendees of this talk will walk away with a deeper understanding of how Kafka Streams works, and how to make their Kafka Streams applications more robust and efficient. There will be a mix of discussion.
Building Out Your Kafka Developer CDC Ecosystem confluent
Building Out Your Kafka Developer CDC Ecosystem, Neil Buesing, VP of Streaming Technologies for Object Partners (OPI)
Meetup Link: https://www.meetup.com/TwinCities-Apache-Kafka/events/272944023/
Kafka and Avro with Confluent Schema Registry Jean-Paul Azar
Covers how to use Kafka with Avro to send records with schema support and Avro serialization, and how to use Avro with Kafka and the Confluent Schema Registry.
Real-time streaming and data pipelines with Apache Kafka Joe Stein
Get up and running quickly with Apache Kafka http://kafka.apache.org/
* Fast * A single Kafka broker can handle hundreds of megabytes of reads and writes per second from thousands of clients.
* Scalable * Kafka is designed to allow a single cluster to serve as the central data backbone for a large organization. It can be elastically and transparently expanded without downtime. Data streams are partitioned and spread over a cluster of machines to allow data streams larger than the capability of any single machine and to allow clusters of co-ordinated consumers.
* Durable * Messages are persisted on disk and replicated within the cluster to prevent data loss. Each broker can handle terabytes of messages without performance impact.
* Distributed by Design * Kafka has a modern cluster-centric design that offers strong durability and fault-tolerance guarantees.
Revitalizing Enterprise Integration with Reactive Streams Lightbend
With Viktor Klang, Deputy CTO Lightbend, Inc.
As software grows more and more interconnected, and with several generations of software having to interoperate, a new take on the integration of systems is needed—ad hoc, unversioned, and unreplicated scripts just won’t suffice, and the traditional Enterprise Service Bus (ESB) concept has experienced stability, reliability, performance, and scalability problems.
In this webinar, Viktor explores a new take on Enterprise Integration Patterns:
First, he will explore the Reactive Streams standard, an orchestration layer where transformations are standalone, composable, reusable, and—most importantly—using asynchronous flow-control—back pressure—to maintain predictable, stable, behavior over time.
Furthermore, he will go through how one-off workloads relate to continuous, and batch, workloads, and how they can be addressed by that very same orchestration layer.
Finally, he will review how this type of design achieves resilience, scalability, and ultimately—responsiveness.
In this slide deck we show how to implement a custom Kafka Serializer for the Producer. We then show how failover works when configuring the broker/topic setting min.insync.replicas and the Producer setting acks (0, 1, -1, none, leader, all).
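The interaction between acks and min.insync.replicas can be illustrated with a small model. This is a sketch in plain Python, not Kafka client code: the function `write_outcome`, its parameters and its return strings are hypothetical, and treating "none" and "leader" as labels for 0 and 1 is an assumption based on the list above. It encodes only the rule that min.insync.replicas is enforced for acks=all (equivalently -1) and not for acks=0 or acks=1.

```python
# Simplified model of producer acks versus min.insync.replicas.
# Not Kafka client code: write_outcome and its return strings are hypothetical.

# Map the spellings listed above to three levels (assumed: none=0, leader=1, all=-1).
ACKS_ALIASES = {"0": 0, "none": 0, "1": 1, "leader": 1, "-1": -1, "all": -1}


def write_outcome(acks, in_sync_replicas, min_insync_replicas, leader_available=True):
    """Describe what a producer would observe for one write in this model."""
    level = ACKS_ALIASES[str(acks).lower()]
    if level == 0:
        # The producer does not wait for any broker response.
        return "not acknowledged (fire and forget)"
    if not leader_available:
        return "rejected: no leader"
    if level == 1:
        # Only the leader's write is confirmed; followers may lag behind.
        return "acknowledged by leader only"
    # acks=all: the write is refused when too few replicas are in sync.
    if in_sync_replicas < min_insync_replicas:
        return "rejected: not enough in-sync replicas"
    return "acknowledged by all in-sync replicas"
```

In this model, losing a broker so that only one replica remains in sync causes acks=all writes to be refused when min.insync.replicas is 2, while acks=1 writes are still accepted with weaker durability.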
The tutorial then shows how to implement Kafka producer batching and compression, and uses the Producer metrics API to see how batching and compression improve throughput. It then covers retries and timeouts and tests that they work. It explains how the max in-flight messages and retry backoff settings work, and when to use and not use in-flight messaging.
It goes on to show how to implement a ProducerInterceptor. Lastly, it shows how to implement a custom Kafka partitioner to implement a priority queue for important records. Throughout the step-by-step examples, this tutorial shows how to use some of the Kafka tools to verify replication and inspect the topic partition leadership status.
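One way a priority partitioner could work is to reserve a partition for important records and spread everything else over the remaining partitions. The sketch below shows only that selection logic in plain Python; it is not the deck's implementation and does not use the Kafka Partitioner interface. The function name `priority_partition`, the `important_keys` set, and the use of CRC32 as a stand-in hash are all assumptions made for illustration.

```python
import zlib


def priority_partition(key, num_partitions, important_keys):
    """Return a partition index, reserving the last partition for important keys."""
    if num_partitions < 2:
        raise ValueError("need at least two partitions to reserve one")
    reserved = num_partitions - 1
    if key in important_keys:
        return reserved
    # CRC32 stands in for whatever hash a real partitioner would use.
    # Ordinary keys are spread over partitions 0 .. reserved - 1.
    return zlib.crc32(key.encode("utf-8")) % reserved
```

With this layout, a consumer dedicated to the reserved partition can drain important records without queuing behind ordinary traffic.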
Kafka Connect is a framework that connects Kafka with external systems. It helps move data into and out of Kafka. Connect makes it simple to use existing connector configurations for common source and sink connectors.
Streaming Design Patterns Using Alpakka Kafka Connector (Sean Glover, Lightbe... confluent
Do you ever feel that your stream processor gets in the way of expressing business requirements? Most processors are frameworks, which are highly opinionated in the design and implementation of apps. Performing Complex Event Processing invariably leads to calling out to other technologies, but what if that integration didn’t require an RPC call or could be modeled into your stream itself? This talk will explore how to build rich domain, low latency, back-pressured, and stateful streaming applications that require very little infrastructure, using Akka Streams and the Alpakka Kafka connector.
We will explore how Alpakka Kafka maps to Kafka features in order to provide a comprehensive understanding of how to build a robust streaming platform. We’ll explore transactional message delivery, defensive consumer group rebalancing, stateful stages, and state durability/persistence. Akka Streams is built on top of Akka, an asynchronous messaging-driven middleware toolkit that can be used to build Erlang-like Actor Systems in Java or Scala. It is used as a JVM library to facilitate common streaming semantics within an existing or standalone application. It’s different from other stream processors in several ways. It natively supports back-pressure flow control inside a single JVM instance or across distributed systems to help prevent overloading downstream infrastructure. It’s perfect for modeling Complex Event Processing with its easy integration into existing apps and Akka Actor systems. Also, unlike most acyclic stream processors, Akka Streams can support sophisticated pipelines, or Graphs, by allowing the user to model cycles (loops) when there’s a need.
Apache Kafka is an open-source message broker project developed by the Apache Software Foundation written in Scala. The project aims to provide a unified, high-throughput, low-latency platform for handling real-time data feeds.
This is the first part of the presentation.
Here is the second part of this presentation:
http://www.slideshare.net/knoldus/introduction-to-apache-kafka-part-2
Understanding Akka Streams, Back Pressure, and Asynchronous Architectures Lightbend
The term 'streams' has been getting pretty overloaded recently–it's hard to know where to best use different technologies with streams in the name. In this talk by noted hAkker Konrad Malawski, we'll disambiguate what streams are and what they aren't, taking a deeper look into Akka Streams (the implementation) and Reactive Streams (the standard).
You'll be introduced to a number of real life scenarios where applying back-pressure helps to keep your systems fast and healthy at the same time. While the focus is mainly on the Akka Streams implementation, the general principles apply to any kind of asynchronous, message-driven architectures.
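The basic effect of back-pressure, a fast producer being slowed to the pace of a slow consumer, can be sketched with a bounded buffer. This is a deliberate simplification in plain Python: it uses a blocking queue rather than the non-blocking demand signalling that Reactive Streams and Akka Streams use, and the names `run_pipeline`, `buffer_size` and `consumer_delay` are hypothetical.

```python
import queue
import threading
import time


def run_pipeline(n_items, buffer_size, consumer_delay=0.001):
    """Push n_items through a bounded buffer; return (consumed items, max buffer depth)."""
    buffer = queue.Queue(maxsize=buffer_size)
    consumed = []
    max_depth = 0

    def consumer():
        while True:
            item = buffer.get()
            if item is None:  # sentinel: the producer has finished
                break
            time.sleep(consumer_delay)  # simulate slow downstream work
            consumed.append(item)

    worker = threading.Thread(target=consumer)
    worker.start()
    for i in range(n_items):
        buffer.put(i)  # blocks while the buffer is full, slowing the producer
        max_depth = max(max_depth, buffer.qsize())
    buffer.put(None)
    worker.join()
    return consumed, max_depth
```

However fast the producer loop runs, the number of buffered items never exceeds `buffer_size`, so memory use stays bounded and the consumer is never overwhelmed.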
Deep Dive Into Kafka Streams (and the Distributed Stream Processing Engine) (... confluent
Kafka Streams is a library for developing applications that process records from topics in Apache Kafka. It provides a high-level Streams DSL and a low-level Processor API for describing fault-tolerant distributed streaming pipelines in the Java or Scala programming languages. Kafka Streams also offers an elaborate API for stateless and stateful stream processing. That’s a high-level view of Kafka Streams. Have you ever wondered how Kafka Streams does all this and what the relationship with Apache Kafka (brokers) is? That’s among the topics of the talk.
During this talk we will look under the covers of Kafka Streams and deep dive into Kafka Streams’ Fault-Tolerant Distributed Stream Processing Engine. You will learn the roles of StreamThreads, TaskManager, StreamTasks, StandbyTasks, StreamsPartitionAssignor, RebalanceListener and a few others. The aim of this talk is to equip you with knowledge about the internals of Kafka Streams that should help you fine-tune your stream processing pipelines for better performance.
Kafka Summit SF 2017 - Best Practices for Running Kafka on Docker Containers confluent
Docker containers provide an ideal foundation for running Kafka-as-a-Service on-premises or in the public cloud. However, using Docker containers in production environments poses some challenges – including container management, scheduling, network configuration and security, and performance. In this session, we’ll share lessons learned from implementing Kafka-as-a-Service with Docker containers.
Presented at Kafka Summit SF 2017 by Nanda Vijaydev
From Big to Fast Data. How #kafka and #kafka-connect can redefine you ETL and... Landoop Ltd
Presentation on "Big Data and Kafka, Kafka-Connect and the modern days of stream processing" For @Argos - @Accenture Development Technology Conference - London Science Museum (IMAX)
Landoop presentation at the Athens Big Data meetup about streaming technologies on Apache Kafka. It introduces the Lenses SQL engine, the Lenses platform and our open-source projects.
Nowadays Akka is a popular choice for building distributed systems - there are a lot of case studies and successful examples in the industry.
But it still can be hard to switch to actor-based systems, because most of the tutorials and documentation don't show the way to assemble a real application using actors, especially in microservices environment.
Actor is a powerful abstraction in the message-driven environments, but it can be challenging to use familiar patterns and methodologies. At the same time, message-driven nature of actors is the biggest advantage that can be used for Reactive systems and microservices.
I want to share my experience and show how Domain-Driven Design and Enterprise Integration Patterns can be leveraged to design and build fine-grained microservices with synchronous and asynchronous communication. I'll focus on the core Akka functionality, but also explain how advanced features like Akka Persistence and Akka Cluster Sharding can be used together for achieving incredible results.
(Bill Bejeck, Confluent) Kafka Summit SF 2018
Apache Kafka added a powerful stream processing library in mid-2016, Kafka Streams, which runs on top of Apache Kafka. The community has embraced Kafka Streams with many early adopters, and the adoption rate continues to grow. Large to mid-size organizations have come to rely on Kafka Streams in their production environments. Kafka Streams has many advanced features to make applications more robust.
The point of this presentation is to show users of Kafka Streams some of the latest and greatest features, as well as some that may be advanced, that can make streams applications more resilient. The target audience for this talk are those users already comfortable writing Kafka Streams applications and want to go from writing their first proof-of-concept applications to writing robust applications that can withstand the rigor that running in a production environment demands.
The talk will be a technical deep dive covering topics like:
-Best practices on configuring a Kafka Streams application
-How to meet production SLAs by minimizing failover and recovery times: configuring standby tasks and the pros and cons of having standby replicas for local state
-How to improve resiliency and 24×7 operability: the use of different configurable error handlers, callbacks and how they can be used to see what’s going on inside the application
-How to achieve efficient scalability: a thorough review of the relationship between the number of instances, threads and state stores and how they relate to each other
While this is a technical deep dive, the talk will also present sample code so that attendees can view the concepts discussed in practice. Attendees of this talk will walk away with a deeper understanding of how Kafka Streams works, and how to make their Kafka Streams applications more robust and efficient. There will be a mix of discussion.
Building Out Your Kafka Developer CDC Ecosystemconfluent
Building Out Your Kafka Developer CDC Ecosystem, Neil Buesing, VP of Streaming Technologies for Object Partners (OPI)
Meetup Link: https://www.meetup.com/TwinCities-Apache-Kafka/events/272944023/
Kafka and Avro with Confluent Schema RegistryJean-Paul Azar
Covers how to use Kafka/Avro to send Records with support for schema and Avro serialization. Covers how to use Avro with Kafka and the confluent Schema Registry.
Real-time streaming and data pipelines with Apache KafkaJoe Stein
Get up and running quickly with Apache Kafka http://kafka.apache.org/
* Fast * A single Kafka broker can handle hundreds of megabytes of reads and writes per second from thousands of clients.
* Scalable * Kafka is designed to allow a single cluster to serve as the central data backbone for a large organization. It can be elastically and transparently expanded without downtime. Data streams are partitioned and spread over a cluster of machines to allow data streams larger than the capability of any single machine and to allow clusters of co-ordinated consumers
* Durable * Messages are persisted on disk and replicated within the cluster to prevent data loss. Each broker can handle terabytes of messages without performance impact.
* Distributed by Design * Kafka has a modern cluster-centric design that offers strong durability and fault-tolerance guarantees.
Revitalizing Enterprise Integration with Reactive Streams | Lightbend
With Viktor Klang, Deputy CTO Lightbend, Inc.
As software grows more and more interconnected, and with several generations of software having to interoperate, a new take on the integration of systems is needed—ad hoc, unversioned, and unreplicated scripts just won’t suffice, and the traditional Enterprise Service Bus (ESB) concept has experienced stability, reliability, performance, and scalability problems.
In this webinar, Viktor explores a new take on Enterprise Integration Patterns:
First, he will explore the Reactive Streams standard, an orchestration layer where transformations are standalone, composable, reusable, and—most importantly—using asynchronous flow-control—back pressure—to maintain predictable, stable, behavior over time.
Furthermore, he will go through how one-off workloads relate to continuous, and batch, workloads, and how they can be addressed by that very same orchestration layer.
Finally, he will review how this type of design achieves resilience, scalability, and ultimately—responsiveness.
In this slide deck we show how to implement a custom Kafka Serializer for the Producer. We then show how failover works when configuring the broker/topic config min.insync.replicas and the Producer config acks (0, 1, -1, none, leader, all).
Then the tutorial shows how to implement Kafka producer batching and compression, and uses the Producer metrics API to see how batching and compression improve throughput. It then covers retries and timeouts, and tests that they work. It explains how the max in-flight messages and retry back-off settings work, and when to use and not use in-flight messaging.
It goes on to show how to implement a ProducerInterceptor. Lastly, it shows how to implement a custom Kafka partitioner to implement a priority queue for important records. Through many of the step-by-step examples, this tutorial shows how to use some of the Kafka tools to do replication verification and inspect the topic partition leadership status.
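As a rough illustration of the settings this abstract names, the sketch below lists the producer configuration keys involved and models how acks interacts with min.insync.replicas. The configuration values are illustrative assumptions, not recommendations from the deck, and `write_accepted` is a hypothetical helper that only models broker behaviour (it assumes a partition leader is available); it does not talk to Kafka.

```python
# Producer settings referred to in the text. Key names are the standard
# Kafka producer configs; the values are illustrative assumptions.
producer_config = {
    "acks": "all",                 # wait for all in-sync replicas
    "retries": 5,                  # retry transient send failures
    "retry.backoff.ms": 100,       # pause between retries
    "max.in.flight.requests.per.connection": 1,  # keeps ordering on retry
    "batch.size": 16384,           # bytes per batch, per partition
    "linger.ms": 10,               # wait briefly so batches can fill
    "compression.type": "snappy",  # compress whole batches
}


def write_accepted(acks, in_sync_replicas, min_insync_replicas):
    """Model whether a produce request is accepted.

    Assumption: the leader is alive, so in_sync_replicas >= 1.
    min.insync.replicas is only enforced when acks is "all" (or "-1").
    """
    if in_sync_replicas < 1:
        raise ValueError("model assumes the partition has a live leader")
    if acks in ("all", "-1"):
        return in_sync_replicas >= min_insync_replicas
    # acks "0" and "1" do not wait for followers.
    return True


if __name__ == "__main__":
    # Replication factor 3, one broker down, min.insync.replicas=2.
    print(write_accepted("all", 2, 2))
    # A second broker goes down: acks=all writes are now rejected.
    print(write_accepted("all", 1, 2))
```

In this model a topic with min.insync.replicas=2 keeps accepting acks=all writes after losing one of three replicas, and rejects them after losing two, while acks=1 writes continue at the cost of weaker durability.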
Kafka Connect is a framework that connects Kafka with external systems. It helps move data in and out of Kafka. Connect makes it simple to use existing connector configurations for common source and sink connectors.
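To make "connector configuration" concrete, here is a minimal sketch that builds the JSON body used to register a connector through the Kafka Connect REST API (a POST to /connectors). It uses the file source connector that ships with Apache Kafka; the connector name, file path and topic are made-up values, and `connector_request` is a hypothetical helper. Nothing is sent over the network.

```python
import json


def connector_request(name, config):
    """Build the JSON body for registering a connector via POST /connectors."""
    return json.dumps({"name": name, "config": config}, indent=2)


# Settings for the FileStreamSource connector bundled with Apache Kafka.
# The path and topic below are illustrative assumptions.
file_source = {
    "connector.class": "FileStreamSource",
    "tasks.max": "1",
    "file": "/tmp/input.txt",
    "topic": "connect-demo",
}

body = connector_request("file-source-demo", file_source)

if __name__ == "__main__":
    print(body)
```

The same keys can instead be written to a properties file for a standalone worker; the point is that a source or sink is described by configuration rather than code.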
Streaming Design Patterns Using Alpakka Kafka Connector (Sean Glover, Lightbe... | confluent
Do you ever feel that your stream processor gets in the way of expressing business requirements? Most processors are frameworks, which are highly opinionated in the design and implementation of apps. Performing Complex Event Processing invariably leads to calling out to other technologies, but what if that integration didn’t require an RPC call or could be modeled into your stream itself? This talk will explore how to build rich domain, low latency, back-pressured, and stateful streaming applications that require very little infrastructure, using Akka Streams and the Alpakka Kafka connector.
We will explore how Alpakka Kafka maps to Kafka features in order to provide a comprehensive understanding of how to build a robust streaming platform. We’ll explore transactional message delivery, defensive consumer group rebalancing, stateful stages, and state durability/persistence. Akka Streams is built on top of Akka, an asynchronous messaging-driven middleware toolkit that can be used to build Erlang-like Actor Systems in Java or Scala. It is used as a JVM library to facilitate common streaming semantics within an existing or standalone application. It’s different from other stream processors in several ways. It natively supports back-pressure flow control inside a single JVM instance or across distributed systems to help prevent overloading downstream infrastructure. It’s perfect for modeling Complex Event Processing with its easy integration into existing apps and Akka Actor systems. Also, unlike most acyclic stream processors, Akka Streams can support sophisticated pipelines, or Graphs, by allowing the user to model cycles (loops) when there’s a need.
Apache Kafka is an open-source message broker project developed by the Apache Software Foundation written in Scala. The project aims to provide a unified, high-throughput, low-latency platform for handling real-time data feeds.
This is the first part of the presentation.
Here is the 2nd part of this presentation:-
http://www.slideshare.net/knoldus/introduction-to-apache-kafka-part-2
Understanding Akka Streams, Back Pressure, and Asynchronous Architectures | Lightbend
The term 'streams' has been getting pretty overloaded recently–it's hard to know where to best use different technologies with streams in the name. In this talk by noted hAkker Konrad Malawski, we'll disambiguate what streams are and what they aren't, taking a deeper look into Akka Streams (the implementation) and Reactive Streams (the standard).
You'll be introduced to a number of real life scenarios where applying back-pressure helps to keep your systems fast and healthy at the same time. While the focus is mainly on the Akka Streams implementation, the general principles apply to any kind of asynchronous, message-driven architectures.
Deep Dive Into Kafka Streams (and the Distributed Stream Processing Engine) (... | confluent
Kafka Streams is a library for developing applications for processing records from topics in Apache Kafka. It provides high-level Streams DSL and low-level Processor API for describing fault-tolerant distributed streaming pipelines in Java or Scala programming languages. Kafka Streams also offers elaborate API for stateless and stateful stream processing. That’s a high-level view of Kafka Streams. Have you ever wondered how Kafka Streams does all this and what the relationship with Apache Kafka (brokers) is? That’s among the topics of the talk.
During this talk we will look under the covers of Kafka Streams and deep dive into Kafka Streams’ Fault-Tolerant Distributed Stream Processing Engine. You will learn the role of StreamThreads, TaskManager, StreamTasks, StandbyTasks, StreamsPartitionAssignor, RebalanceListener and a few others. The aim of this talk is to equip you with knowledge about the internals of Kafka Streams that should help you fine-tune your stream processing pipelines for better performance.
Kafka Summit SF 2017 - Best Practices for Running Kafka on Docker Containers | confluent
Docker containers provide an ideal foundation for running Kafka-as-a-Service on-premises or in the public cloud. However, using Docker containers in production environments poses some challenges – including container management, scheduling, network configuration and security, and performance. In this session, we’ll share lessons learned from implementing Kafka-as-a-Service with Docker containers.
Presented at Kafka Summit SF 2017 by Nanda Vijaydev
From Big to Fast Data. How #kafka and #kafka-connect can redefine you ETL and... | Landoop Ltd
Presentation on "Big Data and Kafka, Kafka-Connect and the modern days of stream processing" For @Argos - @Accenture Development Technology Conference - London Science Museum (IMAX)
Landoop presentation in the Athens Big Data meetup, about streaming technologies on Apache Kafka. Introduction to the Lenses SQL engine and the Lenses platform and our open-source projects.
Kafka Tutorial: Streaming Data Architecture | Jean-Paul Azar
Kafka tutorial covers Java examples for Producers and Consumers. Also covers why Kafka is important and what Kafka is. Takes a look at the whole ecosystem around Kafka. Discusses low-level details about Kafka needed for successful deploys and performance tuning like batching, compression, partitioning, and replication.
This tutorial covers advanced consumer topics like custom deserializers, ConsumerRebalanceListener to rewind to a certain offset, manual assignment of partitions to implement a "priority queue", “at least once” message delivery semantics Consumer Java example, “at most once” message delivery semantics Consumer Java example, “exactly once” message delivery semantics Consumer Java example, and a lot more.
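The difference between “at most once” and “at least once” delivery comes down to whether the consumer commits its offset before or after processing a record. The simulation below is a toy model under stated assumptions: it uses an in-memory list instead of a real Kafka consumer, one record per poll, and a single crash followed by a restart from the last committed offset. The function name `run_consumer` is hypothetical.

```python
def run_consumer(records, commit_before_processing, crash_index=None):
    """Simulate a consumer that may crash once and then restarts.

    commit_before_processing=True  models "at most once" delivery;
    commit_before_processing=False models "at least once" delivery.
    Returns every record that was processed across both runs.
    """
    processed = []
    committed = 0  # next offset to read after a restart

    def poll_loop(start, crash):
        nonlocal committed
        for offset in range(start, len(records)):
            if commit_before_processing:
                committed = offset + 1
                if offset == crash:
                    return False  # crashed after commit, before processing
                processed.append(records[offset])
            else:
                processed.append(records[offset])
                if offset == crash:
                    return False  # crashed after processing, before commit
                committed = offset + 1
        return True

    if not poll_loop(committed, crash_index):
        poll_loop(committed, None)  # restart from the last committed offset
    return processed


if __name__ == "__main__":
    print(run_consumer(["a", "b", "c"], True, crash_index=1))
    print(run_consumer(["a", "b", "c"], False, crash_index=1))
```

With a crash at the second record, the commit-first run skips "b" entirely, while the process-first run handles "b" twice. “Exactly once” needs more than reordering these two steps, for example storing the offset atomically together with the processing result.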
KSQL is a stream processing SQL engine, which allows stream processing on top of Apache Kafka. KSQL is based on Kafka Streams and provides capabilities for consuming messages from Kafka, analysing these messages in near-realtime with a SQL-like language and producing results back to a Kafka topic. As a result, not a single line of Java code has to be written and you can reuse your SQL know-how. This lowers the bar for starting with stream processing significantly.
KSQL offers powerful stream processing capabilities, such as joins, aggregations, time windows and support for event time. In this talk I will present how KSQL integrates with the Kafka ecosystem and demonstrate how easy it is to implement a solution using KSQL for the most part. This will be done in a live demo on a fictitious IoT sample.
Building a Real-time Streaming ETL Framework Using ksqlDB and NoSQL | ScyllaDB
Event streaming applications unlock new benefits by combining various data feeds. However, getting actionable insights in a timely fashion has remained a challenge, as the data has been siloed in disparate systems. ksqlDB solves this by providing an interactive SQL interface that can seamlessly combine and transform data from various sources.
In this webinar, we will show how streaming queries of high throughput NoSQL systems can derive insights from various push/pull queries via ksqlDB's User-Defined Functions, Aggregate Functions and Table Functions.
Watch this to learn:
Real-world examples of the benefits of using a streaming database like ksqlDB and seamlessly combining data from Kafka & Cassandra/Scylla (NoSQL).
The functionality of ksqlDB via push/pull queries and UDFs/UDAFs/UDTFs.
The ease with which data stored in a NoSQL database can be transformed using ksqlDB and then persisted back for long-term storage.
Kafka Connect and Streams (Concepts, Architecture, Features) | Kai Wähner
High level introduction to Kafka Connect and Kafka Streams, two components of the Apache Kafka open source framework. See the concepts, architecture and features.
KSQL Deep Dive - The Open Source Streaming Engine for Apache Kafka | Kai Wähner
Agenda:
Apache Kafka Ecosystem
Kafka Streams as Foundation for KSQL
Motivation for KSQL
KSQL Concepts
Live Demo #1 – Intro to KSQL
KSQL Architecture
Live Demo #2 - Clickstream Analysis
Building a User Defined Function (Example: Machine Learning)
Getting Started
###
The rapidly expanding world of stream processing can be daunting, with new concepts such as various types of time semantics, windowed aggregates, changelogs, and programming frameworks to master.
KSQL is an open-source, Apache 2.0 licensed streaming SQL engine on top of Apache Kafka which aims to simplify all this and make stream processing available to everyone. Even though it is simple to use, KSQL is built for mission-critical and scalable production deployments (using Kafka Streams under the hood).
Benefits of using KSQL include: no coding required; no additional analytics cluster needed; streams and tables as first-class constructs; access to the rich Kafka ecosystem. This session introduces the concepts and architecture of KSQL. Use cases such as Streaming ETL, Real-Time Stream Monitoring or Anomaly Detection are discussed. A live demo shows how to set up and use KSQL quickly and easily on top of your Kafka ecosystem.
Steps to Building a Streaming ETL Pipeline with Apache Kafka® and KSQL | confluent
Speaker: Robin Moffatt, Developer Advocate, Confluent
In this talk, we'll build a streaming data pipeline using nothing but our bare hands, the Kafka Connect API and KSQL. We'll stream data in from MySQL, transform it with KSQL and stream it out to Elasticsearch. Options for integrating databases with Kafka using CDC and Kafka Connect will be covered as well.
This is part 2 of 3 in Streaming ETL - The New Data Integration series.
Watch the recording: https://videos.confluent.io/watch/4cVXUQ2jCLgJNmg4kjCRqo?.
KSQL – The Open Source SQL Streaming Engine for Apache Kafka (Big Data Spain ... | Kai Wähner
KSQL – The Open Source SQL Streaming Engine for Apache Kafka (Talk at Big Data Spain 2018 in Madrid).
- KSQL includes access to the rich Apache Kafka ecosystem and is suitable for various use cases, including Streaming ETL, Real Time Stream Monitoring and Anomaly Detection
- KSQL enables stream processing without coding and without an additional analytics cluster
Description:
The rapidly expanding world of stream processing can be daunting, with new concepts such as various types of time semantics, windowed aggregates, changelogs, and programming frameworks to master.
KSQL is an open-source, Apache 2.0 licensed streaming SQL engine on top of Apache Kafka which aims to simplify all this and make stream processing available to everyone. Even though it is simple to use, KSQL is built for mission-critical and scalable production deployments (using Kafka Streams under the hood).
Benefits of using KSQL include: no coding required; no additional analytics cluster needed; streams and tables as first-class constructs; access to the rich Kafka ecosystem. This session introduces the concepts and architecture of KSQL. Use cases such as Streaming ETL, Real Time Stream Monitoring or Anomaly Detection are discussed. A live demo shows how to set up and use KSQL quickly and easily on top of your Kafka ecosystem.
Kai Waehner - KSQL – The Open Source SQL Streaming Engine for Apache Kafka - ... | Codemotion
The rapidly expanding world of stream processing can be daunting. KSQL is an open-source, Apache 2.0 licensed streaming SQL engine on top of Apache Kafka which aims to make stream processing available to everyone. This session introduces the concepts, architecture, use cases and benefits of KSQL. A live demo shows how to set up and use KSQL quickly and easily on top of your Kafka ecosystem.
Streaming ETL with Apache Kafka and KSQL | Nick Dearden
Companies new and old are all recognizing the importance of a low-latency, scalable, fault-tolerant data backbone - in the form of the Apache Kafka streaming platform. With Kafka developers can integrate multiple systems and data sources to enable low-latency analytics, event-driven architectures, and the population of downstream systems. What's more, these data pipelines can be built using configuration alone.
In this talk, we'll see how easy it is to capture a stream of data changes in real-time from a database such as MySQL into Kafka using the Kafka Connect framework and then use KSQL to filter, aggregate and join it to other data, and finally stream the results from Kafka out into multiple targets such as Elasticsearch and MySQL. All of this can be accomplished without a single line of Java code!
Apache Kafka, and the Rise of Stream Processing | Guozhang Wang
For a long time, a substantial portion of the data processing that companies did ran as big batch jobs. But businesses operate in real-time and the software they run is catching up. Today, processing data in a streaming fashion is becoming more and more popular in many companies over the more "traditional" way of batch-processing big data sets available as a whole.
Spark (Structured) Streaming vs. Kafka Streams | Guido Schmutz
Independent of the source of data, the integration and analysis of event streams gets more important in the world of sensors, social media streams and Internet of Things. Events have to be accepted quickly and reliably, they have to be distributed and analyzed, often with many consumers or systems interested in all or part of the events. In this session we compare two popular Streaming Analytics solutions: Spark Streaming and Kafka Streams.
Spark is a fast and general engine for large-scale data processing and has been designed to provide a more efficient alternative to Hadoop MapReduce. Spark Streaming brings Spark's language-integrated API to stream processing, letting you write streaming applications the same way you write batch jobs. It supports both Java and Scala.
Kafka Streams is the stream processing solution which is part of Kafka. It is provided as a Java library and by that can be easily integrated with any Java application.
This presentation shows how you can implement stream processing solutions with each of the two frameworks, discusses how they compare and highlights the differences and similarities.
Fast and Simplified Streaming, Ad-Hoc and Batch Analytics with FiloDB and Spa... | Helena Edelson
O'Reilly Webcast with Evan Chan and me on the new SNACK Stack (a play on SMACK) with FiloDB: Scala, Spark Streaming, Akka, Cassandra, FiloDB and Kafka.
Building a High-Performance Database with Scala, Akka, and Spark | Evan Chan
Here is my talk at Scala by the Bay 2016, Building a High-Performance Database with Scala, Akka, and Spark. Covers integration of Akka and Spark, when to use actors and futures, back pressure, reactive monitoring with Kamon, and more.
KSQL is an open source streaming SQL engine for Apache Kafka. Come hear how KSQL makes it easy to get started with a wide-range of stream processing applications such as real-time ETL, sessionization, monitoring and alerting, or fraud detection. We'll cover both how to get started with KSQL and some under-the-hood details of how it all works.
Bravo Six, Going Realtime. Transitioning Activision Data Pipeline to Streaming | Yaroslav Tkachenko
Activision Data team has been running a data pipeline for a variety of Activision games for many years. Historically we used a mix of micro-batch microservices coupled with classic Big Data tools like Hadoop and Hive for ETL. As a result, it could take up to 4-6 hours for data to be available to the end customers.
In the last few years, the adoption of data in the organization skyrocketed. We needed to de-legacy our data pipeline and provide near-realtime access to data in order to improve reporting, gather insights faster, power web and mobile applications. I want to tell a story about heavily leveraging Kafka Streams and Kafka Connect to reduce the end latency to minutes, at the same time making the pipeline easier and cheaper to run. We were able to successfully validate the new data pipeline by launching two massive games just 4 weeks apart.
Building Kafka Connectors with Kotlin: A Step-by-Step Guide to Creation and D... | HostedbyConfluent
"Kafka Connect, the framework for building scalable and reliable data pipelines, has gained immense popularity in the data engineering landscape. This session will provide a comprehensive guide to creating Kafka connectors using Kotlin, a language known for its conciseness and expressiveness.
In this session, we will explore a step-by-step approach to crafting Kafka connectors with Kotlin, from inception to deployment, using a simple use case. The process includes the following key aspects:
Understanding Kafka Connect: We'll start with an overview of Kafka Connect and its architecture, emphasizing its importance in real-time data integration and streaming.
Connector Design: Delve into the design principles that govern connector creation. Learn how to choose between source and sink connectors and identify the data format that suits your use case.
Building a Source Connector: We'll start with building a Kafka source connector, exploring key considerations, such as data transformations, serialization, deserialization, error handling and delivery guarantees. You will see how Kotlin's concise syntax and type safety can simplify the implementation.
Testing: Learn how to rigorously test your connector to ensure its reliability and robustness, utilizing best practices for testing in Kotlin.
Connector Deployment: go through the process of deploying your connector in a Kafka Connect cluster, and discuss strategies for monitoring and scaling.
Real-World Use Cases: Explore real-world examples of Kafka connectors built with Kotlin.
By the end of this session, you will have a solid foundation for creating and deploying Kafka connectors using Kotlin, equipped with practical knowledge and insights to make your data integration processes more efficient and reliable. Whether you are a seasoned developer or new to Kafka Connect, this guide will help you harness the power of Kafka and Kotlin for seamless data flow in your applications."
London Apache Kafka Meetup (Jan 2017)
1. Delivering Fast Data Systems with Kafka
LANDOOP
www.landoop.com
Antonios Chalkiopoulos
18/1/2017
2. @chalkiopoulos
Open Source contributor
Big Data projects in Media, Betting, Retail and
Investment Banks in London
Books
Author, Programming MapReduce with Scalding
Founder of Landoop
3. DevOps Big Data Scala
Automation Distributed Systems Monitoring
Hadoop Fast Data / Streams Kafka
5. KAFKA CONNECT
“a common framework
for allowing stream data flow
between kafka and other systems”
6. Data is produced from a source and consumed to a sink.
[Diagram: Data Source → Kafka Connect → Kafka → Kafka Connect → Data Sink. A second version of the diagram adds Stream processing.]
8. Developers don’t care about:
Move data to/from sink/source
Support delivery semantics
Offset Management
Serialization / de-serialization
Partitioning / Scalability
Fault tolerance / fail-over
Schema Registry integration
Developers care about:
Domain specific transformations
9. CONNECTORS
Kafka Connect’s framework allows developers to create connectors that
copy data to/from other systems just by writing configuration files and
submitting them to Connect with no code necessary
10. Connector configurations are key-value mappings
name: connector’s unique name
connector.class: connector’s Java class
tasks.max: maximum tasks to create
topics: list of topics (to source or sink data)
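As a rough illustration of these common settings, the sketch below assembles them into the JSON body that the Kafka Connect REST interface accepts when a connector is created (POST /connectors). The connector name, topic and output file are made-up values, and FileStreamSinkConnector (the simple file sink that ships with Apache Kafka) is used only as a stand-in for a real sink.

```python
import json


def connector_payload(name, connector_class, tasks_max, topics, **extra):
    """Build the JSON body for creating a connector via Kafka Connect's REST API.

    Every value in the "config" map is sent as a string, since connector
    configurations are plain key-value mappings.
    """
    config = {
        "connector.class": connector_class,
        "tasks.max": str(tasks_max),
        "topics": ",".join(topics),
    }
    # Connector-specific settings go into the same flat map.
    config.update({key: str(value) for key, value in extra.items()})
    return {"name": name, "config": config}


# Hypothetical example values, for illustration only.
payload = connector_payload(
    name="demo-file-sink",
    connector_class="org.apache.kafka.connect.file.FileStreamSinkConnector",
    tasks_max=1,
    topics=["TransactionTopic"],
    file="/tmp/transactions.txt",
)
print(json.dumps(payload, indent=2))
```

The sketch only builds the payload; sending it to a Connect worker is left out so that it runs without a cluster.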
11. Introducing a query language for the connectors
name: connector’s unique name
connector.class: connector’s Java class
tasks.max: maximum tasks to create
topics: list of topics (to source or sink data)
query: KCQL query that specifies fields/actions for the target system
12. KCQL
Kafka Connect Query Language
is a SQL-like syntax allowing streamlined configuration of Kafka Sink Connectors, and then some more.
Example:
Project fields, rename or ignore them and further customise in plain text
INSERT INTO transactions SELECT field1 AS column1, field2 AS column2, field3 FROM TransactionTopic;
INSERT INTO audits SELECT * FROM AuditsTopic;
INSERT INTO logs SELECT * FROM LogsTopic AUTOEVOLVE;
INSERT INTO invoices SELECT * FROM InvoiceTopic PK invoiceID;
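To make the field projection concrete, here is a minimal sketch that reads statements shaped like the four above and applies the field selection and renaming to one record. It is not the real KCQL parser: the regular expression covers only INSERT INTO ... SELECT ... FROM ... with optional PK and AUTOEVOLVE, and the sample record is invented.

```python
import re

# Deliberately small pattern: it covers only the statement shapes shown above.
KCQL_PATTERN = re.compile(
    r"^\s*INSERT\s+INTO\s+(?P<target>\S+)\s+"
    r"SELECT\s+(?P<fields>.+?)\s+"
    r"FROM\s+(?P<topic>\w+)"
    r"(?P<options>.*?);?\s*$",
    re.IGNORECASE,
)


def parse_kcql(statement):
    """Split a simple KCQL statement into target, topic, field mapping and options."""
    match = KCQL_PATTERN.match(statement)
    if match is None:
        raise ValueError(f"unsupported KCQL statement: {statement!r}")

    mapping = {}        # source field -> target column
    select_all = False  # True when the statement uses SELECT *
    for item in match.group("fields").split(","):
        parts = item.split()
        if parts == ["*"]:
            select_all = True
        elif len(parts) == 3 and parts[1].upper() == "AS":
            mapping[parts[0]] = parts[2]
        elif len(parts) == 1:
            mapping[parts[0]] = parts[0]
        else:
            raise ValueError(f"unsupported field expression: {item!r}")

    options = match.group("options").split()
    upper = [option.upper() for option in options]
    primary_key = None
    if "PK" in upper and upper.index("PK") + 1 < len(options):
        primary_key = options[upper.index("PK") + 1]

    return {
        "target": match.group("target"),
        "topic": match.group("topic"),
        "select_all": select_all,
        "mapping": mapping,
        "primary_key": primary_key,
        "autoevolve": "AUTOEVOLVE" in upper,
    }


def project(record, parsed):
    """Apply the parsed field selection and renaming to one record (a dict)."""
    if parsed["select_all"]:
        return dict(record)
    return {new: record[old] for old, new in parsed["mapping"].items()}


parsed = parse_kcql(
    "INSERT INTO transactions SELECT field1 AS column1, "
    "field2 AS column2, field3 FROM TransactionTopic;"
)
# Invented record: field4 is not selected, so it is dropped.
row = project({"field1": 1, "field2": "GBP", "field3": 9.5, "field4": "x"}, parsed)
print(parsed["target"], row)
```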
13. So while integrating
Kafka with in-memory
data grid, key-value,
document stores,
NoSQL, search etc
systems..
INSERT INTO $TARGET
SELECT *|columns(i.e col1,col2 | col1 AS column1,col2)
FROM $TOPIC_NAME
[ IGNORE columns ]
[ AUTOCREATE ]
[ PK columns ]
[ AUTOEVOLVE ]
[ BATCH = N ]
[ CAPITALIZE ]
[ INITIALIZE ]
[ PARTITIONBY cola[,colb] ]
[ DISTRIBUTEBY cola[,colb] ]
[ CLUSTERBY cola[,colb] ]
[ TIMESTAMP cola|sys_current ]
[ STOREAS $YOUR_TYPE([key=value, .....]) ]
[ WITHFORMAT TEXT|AVRO|JSON|BINARY|OBJECT|MAP ]
KCQL
What does it look like?
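One way to read the template above is as a fixed sequence of optional clauses. The sketch below assembles a statement from a handful of them, in the order the template lists them. The function and its parameter names are hypothetical, only a few clauses are covered, and the slides do not say whether the real grammar enforces this order.

```python
def build_kcql(target, topic, fields="*", ignore=None, autocreate=False,
               pk=None, autoevolve=False, batch=None):
    """Assemble a KCQL statement, emitting optional clauses in template order."""
    parts = [f"INSERT INTO {target}", f"SELECT {fields}", f"FROM {topic}"]
    if ignore:
        parts.append("IGNORE " + ",".join(ignore))
    if autocreate:
        parts.append("AUTOCREATE")
    if pk:
        parts.append("PK " + ",".join(pk))
    if autoevolve:
        parts.append("AUTOEVOLVE")
    if batch is not None:
        parts.append(f"BATCH = {batch}")
    return " ".join(parts)


statement = build_kcql("invoices", "InvoiceTopic", autocreate=True, pk=["invoiceID"])
print(statement)
```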
14. Topic to target mapping
Field selection
Auto creation
Auto evolution
Error policies
Multiple KCQLs / topic
- Field extraction
- Access to Key & Metadata
Why KCQL ?
Thank you very much for coming today. I will be delivering a talk about building Fast Data systems with Kafka
My name is Antonios. I’ve been involved with open-source projects on distributed systems of the Hadoop ecosystem, and currently I have Apache Kafka in my heart :)
I have authored a book on MapReduce using Scalding and co-authored another one
Landoop is a start-up focusing on DevOps, Distributed Systems and particularly Apache Kafka
Today I’d like to start the presentation with Kafka Connect. I guess most of you are already familiar with it, so I will give a quick overview
Kafka Connect was introduced almost one year ago, as a feature of Apache Kafka 0.9+ with the narrow (although very important) scope of copying streaming data from and to a Kafka cluster. I found the concept really interesting and decided to experiment with it to see what this framework introduces.
Kafka Connect is part of the Apache Kafka project, open source under the Apache license, and ships with Kafka. It’s a framework for building connectors between other data systems and Kafka, and the associated runtime to run these connectors in a distributed, fault tolerant manner at scale.
The announcement by Confluent:
https://www.confluent.io/blog/announcing-kafka-connect-building-large-scale-low-latency-data-pipelines/
And this is how Kafka Connect fits into the picture on a Kafka based system.
You would normally use a stream processing framework to transform your data streams, e.g. Spark, Kafka Streams, etc.
And what Kafka Connect offers is the separation of concerns. It can simplify the key stages of the ETL process, and using simple tools, we can build and maintain a distributed streaming data pipeline.
The E (the extraction) and the L (the load) can be taken care of for you, and then as a developer you can focus on the T (the transformations)
By combining Kafka Connect and stream processing engines we can perform streaming ETLs. Each does the job it is best at, and Kafka acts as the underlying data storage layer that supports them and allows simple integration with a variety of other applications.
By using a robust framework that delivers scalability and fault tolerance out-of-the-box we can then focus on extracting value in a transformation layer.
deployments to deliver fault tolerance and automated load balancing
As you can see here, the basic pattern is to use Kafka Connect to perform Extraction of the data and load it into Kafka as a temporary, scalable, fault-tolerant streaming data store. While you can do this with other, more generic data copy tools, you’ll commonly lose important semantics such as at-least-once delivery of data. Once the data is extracted, you use stream processing engines to perform Transformation; either this is the endpoint for the data, or you deliver it back into Kafka. Finally, you Load the data into the destination system with another Kafka connector. Obviously this is a simplified picture: your pipeline will grow much more complex and have multiple stages of transformation, especially if the intermediate data is useful to multiple applications, including anything downstream that may not be processed by stream processing engines.
Most configurations are connector dependent, but there are a few settings common to all connectors
What we are introducing to all our Kafka connectors is the KCQL query
Let’s look at some of the more advanced features of KCQL - and in particular regarding some sinks.
Hazelcast, for example, supports the Ring Buffer data structure, which is well known from the Disruptor pattern. Data is pushed into a fixed-size buffer with a particular retention period. If the buffer fills up, an eviction policy is triggered, either to evict the oldest records or to deny the addition of new records.
So to write some IoT data from a Kafka topic into a Ring Buffer, we can use the STOREAS keyword.
On the right side, you can see how we can store the same data into a RELIABLE TOPIC, another Hazelcast data structure.
*Hazelcast requires data to be serializable, and JSON and Avro are supported.
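The two overflow behaviours described for the Ring Buffer can be sketched in a few lines. This is a plain in-memory illustration, not Hazelcast's API: the class, the "overwrite"/"fail" labels and the sample IoT readings are assumptions made for the example, and the retention period is not modelled.

```python
from collections import deque


class RingBuffer:
    """Fixed-capacity buffer with two overflow behaviours:
    "overwrite" evicts the oldest record, "fail" rejects the new one."""

    def __init__(self, capacity, overflow="overwrite"):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        if overflow not in ("overwrite", "fail"):
            raise ValueError("overflow must be 'overwrite' or 'fail'")
        self.capacity = capacity
        self.overflow = overflow
        self.items = deque()

    def add(self, record):
        """Return True if the record was stored, False if it was rejected."""
        if len(self.items) == self.capacity:
            if self.overflow == "fail":
                return False
            self.items.popleft()  # evict the oldest record
        self.items.append(record)
        return True


# Invented IoT readings, standing in for records read from a Kafka topic.
readings = [{"sensor": "s1", "temp": t} for t in (20, 21, 22, 23)]

overwriting = RingBuffer(capacity=3, overflow="overwrite")
rejecting = RingBuffer(capacity=3, overflow="fail")
for reading in readings:
    overwriting.add(reading)
    rejecting.add(reading)

print([r["temp"] for r in overwriting.items])
print([r["temp"] for r in rejecting.items])
```

With four readings and a capacity of three, the overwriting buffer keeps the three newest readings while the rejecting buffer keeps the three oldest.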
Redis provides the Sorted Set data structure. This structure allows only unique elements to be added - and each element is required to be scored - to enforce ordering.
This data structure is often used to store time-series data, as Redis allows running time-range queries.
So if we have a Kafka topic with Foreign Exchange data, we can either:
- store all the messages into a SortedSet (the one in blue), OR
- create a new SortedSet for each symbol (one SortedSet for each currency rate) using the PK syntax on the right
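The two options can be illustrated with a small in-memory stand-in for a Sorted Set. It does not talk to Redis: the class, the tick field names and the choice of the timestamp as score are assumptions made for the example.

```python
import bisect
from collections import defaultdict


class SortedSet:
    """Unique members ordered by a numeric score, loosely modelled on a Redis Sorted Set."""

    def __init__(self):
        self.scores = {}   # member -> score
        self.ordered = []  # list of (score, member), kept sorted

    def add(self, member, score):
        # Re-adding an existing member updates its score instead of duplicating it.
        if member in self.scores:
            self.ordered.remove((self.scores[member], member))
        self.scores[member] = score
        bisect.insort(self.ordered, (score, member))

    def range_by_score(self, low, high):
        """Members whose score lies in [low, high], in score order."""
        start = bisect.bisect_left(self.ordered, (low,))
        return [member for score, member in self.ordered[start:] if score <= high]


# Invented Foreign Exchange ticks, standing in for messages on a Kafka topic.
ticks = [
    {"symbol": "EURUSD", "price": 1.0612, "ts": 1000},
    {"symbol": "GBPUSD", "price": 1.2301, "ts": 1001},
    {"symbol": "EURUSD", "price": 1.0615, "ts": 1002},
]

all_ticks = SortedSet()                 # option 1: one Sorted Set for every message
per_symbol = defaultdict(SortedSet)     # option 2: one Sorted Set per symbol

for tick in ticks:
    member = f'{tick["symbol"]}:{tick["price"]}:{tick["ts"]}'
    all_ticks.add(member, tick["ts"])
    per_symbol[tick["symbol"]].add(member, tick["ts"])

# Time-range queries: by timestamp across all symbols, then for one symbol only.
print(all_ticks.range_by_score(1001, 1002))
print(per_symbol["EURUSD"].range_by_score(0, 2000))
```

Keeping one set per symbol makes a time-range query for a single currency pair a lookup in one small set rather than a filter over every message.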
So this is a list of Apache 2.0 licensed Kafka Connectors that we have been working on.
Blockchain, Bloomberg, the Cassandra connector that is certified by DataStax, a Constrained Application Protocol (CoAP) connector, Elasticsearch, JMS, MQTT and others are some of the connectors already available, released against the two latest releases of Apache Kafka.
https://github.com/Landoop/fast-data-dev
So let’s see a DEMO in real-time
http://fast-data-dev.demo.landoop.com
https://coyote.landoop.com/connect/
http://schema-registry-ui.landoop.com
http://kafka-topics-ui.landoop.com
http://kafka-connect-ui.landoop.com
Connectors look simple overall, and I know a number of people in this room are already using them in production. So what does performance look like?
The image above shows that, depending on the sink system, we can sink 50K records/sec by using:
20 partitions
3 connect tasks
5 GB RAM / connector
less than 2 CPUs
In the bottom-left corner, we can see that we are using 50% of the available network bandwidth.
Depending on the number of tasks and partitions, we can easily increase sink performance to more than 100K records/sec.
The lesson regarding performance is that:
Kafka Connect can scale really well
It requires quite some memory
and quite some CPUs especially if batching writes
We have also sent Pull Requests to the Prometheus team to enable GZIP compression and minimise any impact on the running system, something that has significantly decreased network I/O.
We then provide pre-built dashboards on Grafana.
We are using Grafana version 4.0, released a few months ago, which adds alerting. This is a really revolutionary feature, as it transforms Grafana from a visualisation tool into a truly mission-critical monitoring tool.
We’ll have a demo, but before going into it ..
Before doing a live presentation, I’d like to answer a question:
How do I ship such a complex infrastructure, one that can easily grow into hundreds of running services?
We preferably use:
Deployment apps such as Ansible
Docker based technologies for state-less micro-services
CDH-based integration with Cloudera Manager for CDH Hadoop clusters
CDH docs - https://docs.landoop.com/
More connectors are added monthly
Time-Travel in Kafka topics and KCQL queries and real-time