While many companies are embracing Apache Kafka as their core event streaming platform they may still have events they want to unlock in other systems. Kafka Connect provides a common API for developers to do just that and the number of open-source connectors available is growing rapidly. The IBM MQ sink and source connectors allow you to flow messages between your Apache Kafka cluster and your IBM MQ queues. In this session I will share our lessons learned and top tips for building a Kafka Connect connector. I’ll explain how a connector is structured, how the framework calls it and some of the things to consider when providing configuration options. The more Kafka Connect connectors the community creates the better, as it will enable everyone to unlock the events in their existing systems.
Near real-time statistical modeling and anomaly detection using Flink!Flink Forward
Flink Forward San Francisco 2022.
At ThousandEyes we receive billions of events every day that allow us to monitor the internet; the most important aspect of our platform is to detect outages and anomalies that have a potential to cause serious impact to customer applications and user experience. Automatic detection of such events at lowest latency and highest accuracy is extremely important for our customers and their business. After launching several resilient and low latency data pipelines in production using Flink we decided to take it up a notch; we leveraged Flink to build statistical models in near real-time and apply them on incoming stream of events to detect anomalies! In this session we will deep dive into the design as well as discuss pitfalls and learnings while developing our real-time platform that leverages Debezium, Kafka, Flink, ElasticCache and DynamoDB to process events at scale!
by
Kunal Umrigar & Balint Kurnasz
Batch Processing at Scale with Flink & IcebergFlink Forward
Flink Forward San Francisco 2022.
Goldman Sachs's Data Lake platform serves as the firm's centralized data platform, ingesting 140K (and growing!) batches per day of Datasets of varying shape and size. Powered by Flink and using metadata configured by platform users, ingestion applications are generated dynamically at runtime to extract, transform, and load data into centralized storage where it is then exported to warehousing solutions such as Sybase IQ, Snowflake, and Amazon Redshift. Data Latency is one of many key considerations as producers and consumers have their own commitments to satisfy. Consumers range from people/systems issuing queries, to applications using engines like Spark, Hive, and Presto to transform data into refined Datasets. Apache Iceberg allows our applications to not only benefit from consistency guarantees important when running on eventually consistent storage like S3, but also allows us the opportunity to improve our batch processing patterns with its scalability-focused features.
by
Andreas Hailu
Dynamic Rule-based Real-time Market Data AlertsFlink Forward
Flink Forward San Francisco 2022.
At Bloomberg, we deal with high volumes of real-time market data. Our clients expect to be notified of any anomalies in this market data, which may indicate volatile movements in the markets, notable trades, forthcoming events, or system failures. The parameters for these alerts are always evolving and our clients can update them dynamically. In this talk, we'll cover how we utilized the open source Apache Flink and Siddhi SQL projects to build a distributed, scalable, low-latency and dynamic rule-based, real-time alerting system to solve our clients' needs. We'll also cover the lessons we learned along our journey.
by
Ajay Vyasapeetam & Madhuri Jain
Apache Pulsar Development 101 with PythonTimothy Spann
Apache Pulsar Development 101 with Python PS2022_Ecosystem_v0.0
There is always the fear a speaker cannot make it. So just in case, since I was the MC for the ecosystem track I put together a talk just in case.
Here it is. Never seen or presented.
Apache Kafka becoming the message bus to transfer huge volumes of data from various sources into Hadoop.
It's also enabling many real-time system frameworks and use cases.
Managing and building clients around Apache Kafka can be challenging. In this talk, we will go through the best practices in deploying Apache Kafka
in production. How to Secure a Kafka Cluster, How to pick topic-partitions and upgrading to newer versions. Migrating to new Kafka Producer and Consumer API.
Also talk about the best practices involved in running a producer/consumer.
In Kafka 0.9 release, we’ve added SSL wire encryption, SASL/Kerberos for user authentication, and pluggable authorization. Now Kafka allows authentication of users, access control on who can read and write to a Kafka topic. Apache Ranger also uses pluggable authorization mechanism to centralize security for Kafka and other Hadoop ecosystem projects.
We will showcase open sourced Kafka REST API and an Admin UI that will help users in creating topics, re-assign partitions, Issuing
Kafka ACLs and monitoring Consumer offsets.
Big Data Kappa | Mark Senerth, The Walt Disney Company - DMED, Data TechHostedbyConfluent
In a world where there is an ever growing shift towards event driven streaming data, Kafka is firmly embedded in the epicenter of any Data Platform’s central nervous system. In an attempt to aide in the shift of analytics towards true event time, we have implemented a pure Kappa architecture - effectively turning the database inside out. Through extending the concept of a truly idempotent stream of events, Kafka has been elevated to the source of truth. We have eliminated extra network trips for joins as well as querying state which has significantly improved processing performance while also reducing processing latency. Tune in to discuss challenges, tips and lessons learned while implementing a pure Kappa Architecture. I will address hurdles such as scaling, warm standbys, schema evolution, and batch replay strategies - highlighting issues prevalent with any streaming Kappa based architecture. Streaming big data in and of itself comes with its own set of challenges - such as serialization formats, encryption, and strategies to efficiently utilize message headers. I invite each and every one of you to embark on a journey discussing a means to an end - resulting in processing billions of records each day.
Apache Flink is a popular stream computing framework for real-time stream computing. Many stream compute algorithms require trailing data in order to compute the intended result. One example is computing the number of user logins in the last 7 days. This creates a dilemma where the results of the stream program are incomplete until the runtime of the program exceeds 7 days. The alternative is to bootstrap the program using historic data to seed the state before shifting to use real-time data.
This talk will discuss alternatives to bootstrap programs in Flink. Some alternatives rely on technologies exogenous to the stream program, such as enhancements to the pub/sub layer, that are more generally applicable to other stream compute engines. Other alternatives include enhancements to Flink source implementations. Lyft is exploring another alternative using orchestration of multiple Flink programs. The talk will cover why Lyft pursued this alternative and future directions to further enhance bootstrapping support in Flink.
Speaker
Gregory Fee, Principal Engineer, Lyft
Kappa vs Lambda Architectures and Technology ComparisonKai Wähner
Real-time data beats slow data. That’s true for almost every use case. Nevertheless, enterprise architects build new infrastructures with the Lambda architecture that includes separate batch and real-time layers.
This video explores why a single real-time pipeline, called Kappa architecture, is the better fit for many enterprise architectures. Real-world examples from companies such as Disney, Shopify, Uber, and Twitter explore the benefits of Kappa but also show how batch processing fits into this discussion positively without the need for a Lambda architecture.
The main focus of the discussion is on Apache Kafka (and its ecosystem) as the de facto standard for event streaming to process data in motion (the key concept of Kappa), but the video also compares various technologies and vendors such as Confluent, Cloudera, IBM Red Hat, Apache Flink, Apache Pulsar, AWS Kinesis, Amazon MSK, Azure Event Hubs, Google Pub Sub, and more.
Video recording of this presentation:
https://youtu.be/j7D29eyysDw
Further reading:
https://www.kai-waehner.de/blog/2021/09/23/real-time-kappa-architecture-mainstream-replacing-batch-lambda/
https://www.kai-waehner.de/blog/2021/04/20/comparison-open-source-apache-kafka-vs-confluent-cloudera-red-hat-amazon-msk-cloud/
https://www.kai-waehner.de/blog/2021/05/09/kafka-api-de-facto-standard-event-streaming-like-amazon-s3-object-storage/
Near real-time statistical modeling and anomaly detection using Flink!Flink Forward
Flink Forward San Francisco 2022.
At ThousandEyes we receive billions of events every day that allow us to monitor the internet; the most important aspect of our platform is to detect outages and anomalies that have a potential to cause serious impact to customer applications and user experience. Automatic detection of such events at lowest latency and highest accuracy is extremely important for our customers and their business. After launching several resilient and low latency data pipelines in production using Flink we decided to take it up a notch; we leveraged Flink to build statistical models in near real-time and apply them on incoming stream of events to detect anomalies! In this session we will deep dive into the design as well as discuss pitfalls and learnings while developing our real-time platform that leverages Debezium, Kafka, Flink, ElasticCache and DynamoDB to process events at scale!
by
Kunal Umrigar & Balint Kurnasz
Batch Processing at Scale with Flink & IcebergFlink Forward
Flink Forward San Francisco 2022.
Goldman Sachs's Data Lake platform serves as the firm's centralized data platform, ingesting 140K (and growing!) batches per day of Datasets of varying shape and size. Powered by Flink and using metadata configured by platform users, ingestion applications are generated dynamically at runtime to extract, transform, and load data into centralized storage where it is then exported to warehousing solutions such as Sybase IQ, Snowflake, and Amazon Redshift. Data Latency is one of many key considerations as producers and consumers have their own commitments to satisfy. Consumers range from people/systems issuing queries, to applications using engines like Spark, Hive, and Presto to transform data into refined Datasets. Apache Iceberg allows our applications to not only benefit from consistency guarantees important when running on eventually consistent storage like S3, but also allows us the opportunity to improve our batch processing patterns with its scalability-focused features.
by
Andreas Hailu
Dynamic Rule-based Real-time Market Data AlertsFlink Forward
Flink Forward San Francisco 2022.
At Bloomberg, we deal with high volumes of real-time market data. Our clients expect to be notified of any anomalies in this market data, which may indicate volatile movements in the markets, notable trades, forthcoming events, or system failures. The parameters for these alerts are always evolving and our clients can update them dynamically. In this talk, we'll cover how we utilized the open source Apache Flink and Siddhi SQL projects to build a distributed, scalable, low-latency and dynamic rule-based, real-time alerting system to solve our clients' needs. We'll also cover the lessons we learned along our journey.
by
Ajay Vyasapeetam & Madhuri Jain
Apache Pulsar Development 101 with PythonTimothy Spann
Apache Pulsar Development 101 with Python PS2022_Ecosystem_v0.0
There is always the fear a speaker cannot make it. So just in case, since I was the MC for the ecosystem track I put together a talk just in case.
Here it is. Never seen or presented.
Apache Kafka becoming the message bus to transfer huge volumes of data from various sources into Hadoop.
It's also enabling many real-time system frameworks and use cases.
Managing and building clients around Apache Kafka can be challenging. In this talk, we will go through the best practices in deploying Apache Kafka
in production. How to Secure a Kafka Cluster, How to pick topic-partitions and upgrading to newer versions. Migrating to new Kafka Producer and Consumer API.
Also talk about the best practices involved in running a producer/consumer.
In Kafka 0.9 release, we’ve added SSL wire encryption, SASL/Kerberos for user authentication, and pluggable authorization. Now Kafka allows authentication of users, access control on who can read and write to a Kafka topic. Apache Ranger also uses pluggable authorization mechanism to centralize security for Kafka and other Hadoop ecosystem projects.
We will showcase open sourced Kafka REST API and an Admin UI that will help users in creating topics, re-assign partitions, Issuing
Kafka ACLs and monitoring Consumer offsets.
Big Data Kappa | Mark Senerth, The Walt Disney Company - DMED, Data TechHostedbyConfluent
In a world where there is an ever growing shift towards event driven streaming data, Kafka is firmly embedded in the epicenter of any Data Platform’s central nervous system. In an attempt to aide in the shift of analytics towards true event time, we have implemented a pure Kappa architecture - effectively turning the database inside out. Through extending the concept of a truly idempotent stream of events, Kafka has been elevated to the source of truth. We have eliminated extra network trips for joins as well as querying state which has significantly improved processing performance while also reducing processing latency. Tune in to discuss challenges, tips and lessons learned while implementing a pure Kappa Architecture. I will address hurdles such as scaling, warm standbys, schema evolution, and batch replay strategies - highlighting issues prevalent with any streaming Kappa based architecture. Streaming big data in and of itself comes with its own set of challenges - such as serialization formats, encryption, and strategies to efficiently utilize message headers. I invite each and every one of you to embark on a journey discussing a means to an end - resulting in processing billions of records each day.
Apache Flink is a popular stream computing framework for real-time stream computing. Many stream compute algorithms require trailing data in order to compute the intended result. One example is computing the number of user logins in the last 7 days. This creates a dilemma where the results of the stream program are incomplete until the runtime of the program exceeds 7 days. The alternative is to bootstrap the program using historic data to seed the state before shifting to use real-time data.
This talk will discuss alternatives to bootstrap programs in Flink. Some alternatives rely on technologies exogenous to the stream program, such as enhancements to the pub/sub layer, that are more generally applicable to other stream compute engines. Other alternatives include enhancements to Flink source implementations. Lyft is exploring another alternative using orchestration of multiple Flink programs. The talk will cover why Lyft pursued this alternative and future directions to further enhance bootstrapping support in Flink.
Speaker
Gregory Fee, Principal Engineer, Lyft
Kappa vs Lambda Architectures and Technology ComparisonKai Wähner
Real-time data beats slow data. That’s true for almost every use case. Nevertheless, enterprise architects build new infrastructures with the Lambda architecture that includes separate batch and real-time layers.
This video explores why a single real-time pipeline, called Kappa architecture, is the better fit for many enterprise architectures. Real-world examples from companies such as Disney, Shopify, Uber, and Twitter explore the benefits of Kappa but also show how batch processing fits into this discussion positively without the need for a Lambda architecture.
The main focus of the discussion is on Apache Kafka (and its ecosystem) as the de facto standard for event streaming to process data in motion (the key concept of Kappa), but the video also compares various technologies and vendors such as Confluent, Cloudera, IBM Red Hat, Apache Flink, Apache Pulsar, AWS Kinesis, Amazon MSK, Azure Event Hubs, Google Pub Sub, and more.
Video recording of this presentation:
https://youtu.be/j7D29eyysDw
Further reading:
https://www.kai-waehner.de/blog/2021/09/23/real-time-kappa-architecture-mainstream-replacing-batch-lambda/
https://www.kai-waehner.de/blog/2021/04/20/comparison-open-source-apache-kafka-vs-confluent-cloudera-red-hat-amazon-msk-cloud/
https://www.kai-waehner.de/blog/2021/05/09/kafka-api-de-facto-standard-event-streaming-like-amazon-s3-object-storage/
The top 3 challenges running multi-tenant Flink at scaleFlink Forward
Apache Flink is the foundation for Decodable's real-time SaaS data platform. Flink runs critical data processing jobs with strong security requirements. In addition, Decodable has to scale to thousands of tenants, power various use cases, provide an intuitive user experience and maintain cost-efficiency. We've learned a lot of lessons while building and maintaining the platform. In this talk, I'll share the top 3 toughest challenges building and operating this platform with Flink, and how we solved them.
Implementing Exactly-once Delivery and Escaping Kafka Rebalance Storms with Y...HostedbyConfluent
"Even though stream processing has come a long way in the last few years, ensuring exactly-once delivery remains a difficult problem to solve.
This becomes an even bigger challenge when your consumers are distributed applications, and their Kubernetes pods can be scaled-out, scaled-in or simply restarted at any given moment, causing Apache Kafka to go into a “rebalance storm”.
In this talk, we’ll walk you through how we implemented exactly-once delivery with Kafka by managing Kafka transactions the right way, and how we escaped endless rebalance storms when running hundreds of consumers on the same Kafka topic.
We will discuss the issues we faced building Akamai’s data ingestion infrastructure on Azure, processing malicious traffic at internet scale.
The session covers:
- Kafka delivery semantics
- Kafka transactional API
- Kafka “anti-rebalance” tips and tricks"
Architecture patterns for distributed, hybrid, edge and global Apache Kafka d...Kai Wähner
Architecture patterns for distributed, hybrid, edge and global Apache Kafka deployments
Multi-cluster and cross-data center deployments of Apache Kafka have become the norm rather than an exception. This session gives an overview of several scenarios that may require multi-cluster solutions and discusses real-world examples with their specific requirements and trade-offs, including disaster recovery, aggregation for analytics, cloud migration, mission-critical stretched deployments and global Kafka.
Key takeaways:
In many scenarios, one Kafka cluster is not enough. Understand different architectures and alternatives for multi-cluster deployments.
Zero data loss and high availability are two key requirements. Understand how to realize this, including trade-offs.
Learn about features and limitations of Kafka for multi cluster deployments
Global Kafka and mission-critical multi-cluster deployments with zero data loss and high availability became the normal, not an exception.
Microservices usually communicate synchronously, often using REST APIs. There is an alternative way of using events. By adopting an event-based approach for intercommunication between microservices, the microservices applications are naturally responsive (event-driven). This approach enhances the loose coupling nature of microservices because it decouples producers and consumers. Andrew Schofield - Chief Architect, Event Streams at IBM
https://www.linkedin.com/in/andrewjschofield/
https://www.meetup.com/Wellington-API-and-Microservices/events/259074920/
Best practices and lessons learnt from Running Apache NiFi at RenaultDataWorks Summit
No real-time insight without real-time data ingestion. No real-time data ingestion without NiFi ! Apache NiFi is an integrated platform for data flow management at entreprise level, enabling companies to securely acquire, process and analyze disparate sources of information (sensors, logs, files, etc) in real-time. NiFi helps data engineers accelerate the development of data flows thanks to its UI and a large number of powerful off-the-shelf processors. However, with great power comes great responsibilities. Behind the simplicity of NiFi, best practices must absolutely be respected in order to scale data flows in production & prevent sneaky situations. In this joint presentation, Hortonworks and Renault, a French car manufacturer, will present lessons learnt from real world projects using Apache NiFi. We will present NiFi design patterns to achieve high level performance and reliability at scale as well as the process to put in place around the technology for data flow governance. We will also show how these best practices can be implemented in practical use cases and scenarios.
Speakers
Kamelia Benchekroun, Data Lake Squad Lead, Renault Group
Abdelkrim Hadjidj, Solution Engineer, Hortonworks
[NEW LAUNCH!] Introducing Amazon Managed Streaming for Kafka (Amazon MSK) (AN...Amazon Web Services
Discover the power of running Apache Kafka on a fully managed AWS service. In this session, we describe how Amazon Managed Streaming for Kafka (Amazon MSK) runs Apache Kafka clusters for you, demo Amazon MSK and a migration, show you how to get started, and walk through other important details about the new service.
by Isaiah Weiner, Sr. Manager of Solutions Architecture, AWS
Companies are using AWS to create and deploy efficient, fast, and cost-effective backup and restore capabilities to protect critical IT systems without incurring the infrastructure expense of a second physical site. In this session, we will talk about cloud-based services AWS provides to enable robust backup and rapid recovery of your IT infrastructure and data.
3 Kafka patterns to deliver Streaming Machine Learning models with Andrea Spi...HostedbyConfluent
The presentation highlights the main technical challenges Radicalbit faced while building a real-time serving engine for streaming Machine Learning algorithms. The speech describes how Kafka has been used to fasten two ML technologies together: River, an open-source suite of streaming machine learning algorithms, and Seldon-core, a DevOps-driven MLOps platform.
In particular, the talk focuses on how Kafka has been used to (1) build a dynamic model serving framework thanks to Kafka Streams joins and the broadcasting pattern (2) implement a Kafka user-given feedback topic by which online models can learn while they generate predictions, and (3) design a models' prediction bus, a particular Kafka bidirectional topic whereby predictions flow at tremendous scale; the prediction bus enabled seldon-core Kubernetes deployment to communicate with Kafka Streams, and as a conclusive subject this speech explains how this unleashed unprecedented performance.
Enterprise messaging and IBM MQ is a critical part of any system, this session shows you how MQ is rapidly evolving to meet your needs. Irrespective of your platform or environment, this session introduces many of the updates to MQ in 2019 and 2020, whether that's in administration, building fault tolerant, scalable messaging solutions, or securing your systems.
Real-Life Use Cases & Architectures for Event Streaming with Apache KafkaKai Wähner
Streaming all over the World: Real-Life Use Cases & Architectures for Event Streaming with Apache Kafka.
Learn about various case studies for event streaming with Apache Kafka across industries. The talk explores architectures for real-world deployments from Audi, BMW, Disney, Generali, Paypal, Tesla, Unity, Walmart, William Hill, and more. Use cases include fraud detection, mainframe offloading, predictive maintenance, cybersecurity, edge computing, track&trace, live betting, and much more.
Flink Forward San Francisco 2022.
This talk will take you on the long journey of Apache Flink into the cloud-native era. It started all the way from where Hadoop and YARN were the standard way of deploying and operating data applications.
We're going to deep dive into the cloud-native set of principles and how they map to the Apache Flink internals and recent improvements. We'll cover fast checkpointing, fault tolerance, resource elasticity, minimal infrastructure dependencies, industry-standard tooling, ease of deployment and declarative APIs.
After this talk you'll get a broader understanding of the operational requirements for a modern streaming application and where the current limits are.
by
David Moravek
Deploying Flink on Kubernetes - David AndersonVerverica
Kubernetes has rapidly established itself as the de facto standard for orchestrating containerized infrastructures. And with the recent completion of the refactoring of Flink's deployment and process model known as FLIP-6, Kubernetes has become a natural choice for Flink deployments. In this talk we will walk through how to get Flink running on Kubernetes
In this slide deck we show how to implement custom Kafka Serializer for Producer. We then show how failover works configuring when broker/topic config min.insync.replicas, and Producer config acks (0, 1, -1, none, leader, all).
Then tutorial show how to implement Kafka producer batching and compression. Then use Producer metrics API to see how batching and compression improves throughput. Then this tutorial covers using retires and timeouts, and tested that it works. It explains how the setup of max inflight messages and retry back off work and when to use and not use inflight messaging.
It goes on to who how to implement a ProducerInterceptor. Then lastly, it shows how to implement a custom Kafka partitioner to implement a priority queue for important records. Through many of the step by step examples, this tutorial shows how to use some of the Kafka tools to do replication verification, and inspect the topic partition leadership status.
Lessons Learned Building a Connector Using Kafka Connect (Katherine Stanley &...confluent
While many companies are embracing Apache Kafka as their core event streaming platform they may still have events they want to unlock in other systems. Kafka Connect provides a common API for developers to do just that and the number of open-source connectors available is growing rapidly. The IBM MQ sink and source connectors allow you to flow messages between your Apache Kafka cluster and your IBM MQ queues. In this session we will share our lessons learned and top tips for building a Kafka Connect connector. We'll explain how a connector is structured, how the framework calls it and some of the things to consider when providing configuration options. The more Kafka Connect connectors the community creates the better, as it will enable everyone to unlock the events in their existing systems.
The top 3 challenges running multi-tenant Flink at scaleFlink Forward
Apache Flink is the foundation for Decodable's real-time SaaS data platform. Flink runs critical data processing jobs with strong security requirements. In addition, Decodable has to scale to thousands of tenants, power various use cases, provide an intuitive user experience and maintain cost-efficiency. We've learned a lot of lessons while building and maintaining the platform. In this talk, I'll share the top 3 toughest challenges building and operating this platform with Flink, and how we solved them.
Implementing Exactly-once Delivery and Escaping Kafka Rebalance Storms with Y...HostedbyConfluent
"Even though stream processing has come a long way in the last few years, ensuring exactly-once delivery remains a difficult problem to solve.
This becomes an even bigger challenge when your consumers are distributed applications, and their Kubernetes pods can be scaled-out, scaled-in or simply restarted at any given moment, causing Apache Kafka to go into a “rebalance storm”.
In this talk, we’ll walk you through how we implemented exactly-once delivery with Kafka by managing Kafka transactions the right way, and how we escaped endless rebalance storms when running hundreds of consumers on the same Kafka topic.
We will discuss the issues we faced building Akamai’s data ingestion infrastructure on Azure, processing malicious traffic at internet scale.
The session covers:
- Kafka delivery semantics
- Kafka transactional API
- Kafka “anti-rebalance” tips and tricks"
Architecture patterns for distributed, hybrid, edge and global Apache Kafka d...Kai Wähner
Architecture patterns for distributed, hybrid, edge and global Apache Kafka deployments
Multi-cluster and cross-data center deployments of Apache Kafka have become the norm rather than an exception. This session gives an overview of several scenarios that may require multi-cluster solutions and discusses real-world examples with their specific requirements and trade-offs, including disaster recovery, aggregation for analytics, cloud migration, mission-critical stretched deployments and global Kafka.
Key takeaways:
In many scenarios, one Kafka cluster is not enough. Understand different architectures and alternatives for multi-cluster deployments.
Zero data loss and high availability are two key requirements. Understand how to realize this, including trade-offs.
Learn about features and limitations of Kafka for multi cluster deployments
Global Kafka and mission-critical multi-cluster deployments with zero data loss and high availability became the normal, not an exception.
Microservices usually communicate synchronously, often using REST APIs. There is an alternative way of using events. By adopting an event-based approach for intercommunication between microservices, the microservices applications are naturally responsive (event-driven). This approach enhances the loose coupling nature of microservices because it decouples producers and consumers. Andrew Schofield - Chief Architect, Event Streams at IBM
https://www.linkedin.com/in/andrewjschofield/
https://www.meetup.com/Wellington-API-and-Microservices/events/259074920/
Best practices and lessons learnt from Running Apache NiFi at RenaultDataWorks Summit
No real-time insight without real-time data ingestion. No real-time data ingestion without NiFi ! Apache NiFi is an integrated platform for data flow management at entreprise level, enabling companies to securely acquire, process and analyze disparate sources of information (sensors, logs, files, etc) in real-time. NiFi helps data engineers accelerate the development of data flows thanks to its UI and a large number of powerful off-the-shelf processors. However, with great power comes great responsibilities. Behind the simplicity of NiFi, best practices must absolutely be respected in order to scale data flows in production & prevent sneaky situations. In this joint presentation, Hortonworks and Renault, a French car manufacturer, will present lessons learnt from real world projects using Apache NiFi. We will present NiFi design patterns to achieve high level performance and reliability at scale as well as the process to put in place around the technology for data flow governance. We will also show how these best practices can be implemented in practical use cases and scenarios.
Speakers
Kamelia Benchekroun, Data Lake Squad Lead, Renault Group
Abdelkrim Hadjidj, Solution Engineer, Hortonworks
[NEW LAUNCH!] Introducing Amazon Managed Streaming for Kafka (Amazon MSK) (AN...Amazon Web Services
Discover the power of running Apache Kafka on a fully managed AWS service. In this session, we describe how Amazon Managed Streaming for Kafka (Amazon MSK) runs Apache Kafka clusters for you, demo Amazon MSK and a migration, show you how to get started, and walk through other important details about the new service.
by Isaiah Weiner, Sr. Manager of Solutions Architecture, AWS
Companies are using AWS to create and deploy efficient, fast, and cost-effective backup and restore capabilities to protect critical IT systems without incurring the infrastructure expense of a second physical site. In this session, we will talk about cloud-based services AWS provides to enable robust backup and rapid recovery of your IT infrastructure and data.
3 Kafka patterns to deliver Streaming Machine Learning models with Andrea Spi...HostedbyConfluent
The presentation highlights the main technical challenges Radicalbit faced while building a real-time serving engine for streaming Machine Learning algorithms. The speech describes how Kafka has been used to fasten two ML technologies together: River, an open-source suite of streaming machine learning algorithms, and Seldon-core, a DevOps-driven MLOps platform.
In particular, the talk focuses on how Kafka has been used to (1) build a dynamic model serving framework thanks to Kafka Streams joins and the broadcasting pattern (2) implement a Kafka user-given feedback topic by which online models can learn while they generate predictions, and (3) design a models' prediction bus, a particular Kafka bidirectional topic whereby predictions flow at tremendous scale; the prediction bus enabled seldon-core Kubernetes deployment to communicate with Kafka Streams, and as a conclusive subject this speech explains how this unleashed unprecedented performance.
Enterprise messaging and IBM MQ is a critical part of any system, this session shows you how MQ is rapidly evolving to meet your needs. Irrespective of your platform or environment, this session introduces many of the updates to MQ in 2019 and 2020, whether that's in administration, building fault tolerant, scalable messaging solutions, or securing your systems.
Real-Life Use Cases & Architectures for Event Streaming with Apache KafkaKai Wähner
Streaming all over the World: Real-Life Use Cases & Architectures for Event Streaming with Apache Kafka.
Learn about various case studies for event streaming with Apache Kafka across industries. The talk explores architectures for real-world deployments from Audi, BMW, Disney, Generali, Paypal, Tesla, Unity, Walmart, William Hill, and more. Use cases include fraud detection, mainframe offloading, predictive maintenance, cybersecurity, edge computing, track&trace, live betting, and much more.
Flink Forward San Francisco 2022.
This talk will take you on the long journey of Apache Flink into the cloud-native era. It started all the way from where Hadoop and YARN were the standard way of deploying and operating data applications.
We're going to deep dive into the cloud-native set of principles and how they map to the Apache Flink internals and recent improvements. We'll cover fast checkpointing, fault tolerance, resource elasticity, minimal infrastructure dependencies, industry-standard tooling, ease of deployment and declarative APIs.
After this talk you'll get a broader understanding of the operational requirements for a modern streaming application and where the current limits are.
by
David Moravek
Deploying Flink on Kubernetes - David AndersonVerverica
Kubernetes has rapidly established itself as the de facto standard for orchestrating containerized infrastructures. And with the recent completion of the refactoring of Flink's deployment and process model known as FLIP-6, Kubernetes has become a natural choice for Flink deployments. In this talk we will walk through how to get Flink running on Kubernetes
In this slide deck we show how to implement custom Kafka Serializer for Producer. We then show how failover works configuring when broker/topic config min.insync.replicas, and Producer config acks (0, 1, -1, none, leader, all).
Then tutorial show how to implement Kafka producer batching and compression. Then use Producer metrics API to see how batching and compression improves throughput. Then this tutorial covers using retires and timeouts, and tested that it works. It explains how the setup of max inflight messages and retry back off work and when to use and not use inflight messaging.
It goes on to who how to implement a ProducerInterceptor. Then lastly, it shows how to implement a custom Kafka partitioner to implement a priority queue for important records. Through many of the step by step examples, this tutorial shows how to use some of the Kafka tools to do replication verification, and inspect the topic partition leadership status.
Lessons Learned Building a Connector Using Kafka Connect (Katherine Stanley &...confluent
While many companies are embracing Apache Kafka as their core event streaming platform they may still have events they want to unlock in other systems. Kafka Connect provides a common API for developers to do just that and the number of open-source connectors available is growing rapidly. The IBM MQ sink and source connectors allow you to flow messages between your Apache Kafka cluster and your IBM MQ queues. In this session we will share our lessons learned and top tips for building a Kafka Connect connector. We'll explain how a connector is structured, how the framework calls it and some of the things to consider when providing configuration options. The more Kafka Connect connectors the community creates the better, as it will enable everyone to unlock the events in their existing systems.
Virtual Meetup Sweden - Reacting to an event driven worldGrace Jansen
We now live in a world with data at its heart. The amount of data being produced every day is growing exponentially and a large amount of this data is in the form of events. Whether it be updates from sensors, clicks on a website or even tweets, applications are bombarded with a never-ending stream of new events. So, how can we architect our applications to be more reactive and resilient to these fluctuating loads and better manage our thirst for data? In this session explore how Kafka and Reactive application architecture can be combined in applications to better handle our modern data needs.
Fast Kafka Apps! (Edoardo Comar and Mickael Maison, IBM) Kafka Summit London ...confluent
We all know Kafka is designed to allow applications to produce and consume data with high throughput and low latency, right? In practice, achieving this goal requires some tuning. Starting with a consideration of design principles and best practices for distributed applications, we’ll explore various practical tips to improve your client application’s performance. We’ll look first at the most important producer and consumer configuration options because these apply even when you have limited control of the brokers’ cluster setup. Secondly, we’ll look at how the brokers’ cluster deployment, topology and configuration could help further. Finally a word on some gotchas of performance analysis, that apply to Kafka too!
JSpring Virtual 2020 - Reacting to an event-driven worldGrace Jansen
We now live in a world with data at its heart. The amount of data being produced every day is growing exponentially and a large amount of this data is in the form of events. Whether it be updates from sensors, clicks on a website or even tweets, applications are bombarded with a never-ending stream of new events. So, how can we architect our applications to be more reactive and resilient to these fluctuating loads and better manage our thirst for data? In this session explore how Kafka and Reactive application architecture can be combined in applications to better handle our modern data needs.
DevNexus - Reacting to an event driven worldGrace Jansen
We now live in a world with data at its heart. The amount of data being produced every day is growing exponentially and a large amount of this data is in the form of events. Whether it be updates from sensors, clicks on a website or even tweets, applications are bombarded with a never-ending stream of new events. So, how can we architect our applications to be more reactive and resilient to these fluctuating loads and better manage our thirst for data? In this session explore how Kafka and Reactive application architecture can be combined in applications to better handle our modern data needs.
Running Kafka in Kubernetes: A Practical Guide (Katherine Stanley, IBM United...confluent
The rise of Apache Kafka as the de facto standard for event streaming has coincided with the rise of Kubernetes for cloud-native applications. While Kubernetes is a great choice for any distributed system, that doesn’t mean it is easy to deploy and maintain a Kafka cluster running on it. At IBM we have hands-on experience with running Kafka in Kubernetes and in this session I will share our top tips for a smooth ride. I will show an example deployment of Kafka on Kubernetes and step through the system to explain the common pitfalls and how to avoid them. This will include the Kubernetes objects to use, resource considerations and connecting applications to the cluster. Finally, I will discuss useful Kafka metrics to include in Kubernetes liveness and readiness probes.
Jfokus - Reacting to an event-driven worldGrace Jansen
We now live in a world with data at its heart. The amount of data being produced every day is growing exponentially and a large amount of this data is in the form of events. Whether it be updates from sensors, clicks on a website or even tweets, applications are bombarded with a never-ending stream of new events. So, how can we architect our applications to be more reactive and resilient to these fluctuating loads and better manage our thirst for data? In this session explore how Kafka and Reactive application architecture can be combined in applications to better handle our modern data needs.
Speaker: Frank Pientka, Principal Software Architect, Materna Information & Communications SE
Title of Talk:
The need for speed – Data streaming in the Cloud with Kafka®
Abstract:
As Kubernetes is quickly becoming the de facto standard for the cloud operating system is Apache Kafka becoming the data streaming.
Enterprise need more speed to get insights from fast growing data.
Kafka and Kubernetes are a perfect team for these use cases.
There are different options to run an Apache Kafka Cluster.
Besides managed a Kafka cluster by the different cloud providers, running Kafka on Kubernetes is becoming more and more popular.
We will introduce a setup, used components and recommendations from an own project with Kafka on Kubernetes.
Finally we will share our lessons learned from this still evolving field
Flink currently features different APIs for bounded/batch (DataSet) and streaming (DataStream) programs. And while the DataStream API can handle batch use cases, it is much less efficient in that compared to the DataSet API. The Table API was built as a unified API on top of both, to cover batch and streaming with the same API, and under the hood delegate to either DataSet or DataStream.
In this talk, we present the latest on the Flink community's efforts to rework the APIs and the stack for better unified batch & streaming experience. We will discuss:
- The future roles and interplay of DataSet, DataStream, and Table API
- The new Flink stack and the abstractions on which these APIs will build
- The new unified batch/streaming sources
- How batch and streaming optimizations differ in the runtime, and what the future interplay of batch and streaming execution could look like.
Flink currently features different APIs for bounded/batch (DataSet) and streaming (DataStream) programs. And while the DataStream API can handle batch use cases, it is much less efficient in that compared to the DataSet API. The Table API was built as a unified API on top of both, to cover batch and streaming with the same API, and under the hood delegate to either DataSet or DataStream.
In this talk, we present the latest on the Flink community's efforts to rework the APIs and the stack for better unified batch & streaming experience. We will discuss:
- The future roles and interplay of DataSet, DataStream, and Table API
- The new Flink stack and the abstractions on which these APIs will build
- The new unified batch/streaming sources
- How batch and streaming optimizations differ in the runtime, and what the future interplay of batch and streaming execution could look like
Cloud native Kafka | Sascha Holtbruegge and Margaretha Erber, HiveMQHostedbyConfluent
Joins in Kafka Streams and ksqlDB are a killer-feature for data processing and basic join semantics are well understood. However, in a streaming world records are associated with timestamps that impact the semantics of joins: welcome to the fabulous world of _temporal_ join semantics. For joins, timestamps are as important as the actual data and it is important to understand how they impact the join result.
In this talk we want to deep dive on the different types of joins, with a focus of their temporal aspect. Furthermore, we relate the individual join operators to the overall ""time engine"" of the Kafka Streams query runtime and explain its relationship to operator semantics. To allow developers to apply their knowledge on temporal join semantics, we provide best practices, tip and tricks to ""bend"" time, and configuration advice to get the desired join results. Last, we give an overview of recent, and an outlook to future, development that improves joins even further.
Event Streaming with Kafka Streams and Spring Cloud Stream | Soby Chacko, VMwareHostedbyConfluent
Spring Cloud Stream is a framework built on top of the foundations of Spring Boot, the foremost JVM framework for developing microservice applications. It brings the familiar patterns and philosophies that Spring has championed for years through its programming model by allowing developers to focus primarily on the business logic of their applications. Kafka Streams is a powerful stream processing library built on top of Apache Kafka and attracts many developers because of its simplicity and deployment models as microservice applications. By developing Kafka Streams applications using Spring Cloud Stream, application developers get the best of both worlds - simpler stream processing execution models of Kafka Streams and battle-tested microservices foundations of Spring Boot via Spring Cloud Stream. This talk will explore: The integration points and various capabilities of Spring Cloud Stream touchpoints with Kafka Streams How to build event streaming applications using Spring’s programming model built on top of Kafka Streams, including a demo of a stateful application using Kafka Streams and Spring Cloud Stream’s functional support How to use interactive queries to expose materialized views from the state stores in the application How this Kafka Streams application can run as part of a data pipeline using Spring Cloud Data Flow in Kubernetes
App modernization on AWS with Apache Kafka and Confluent CloudKai Wähner
Presentation from AWS ReInvent 2020.
Learn how you can accelerate application modernization and benefit from the open-source Apache Kafka ecosystem by connecting your legacy, on-premises systems to the cloud. In this session, hear real customer stories about timely insights gained from event-driven applications built on an event streaming platform from Confluent Cloud running on AWS, which stores and processes historical data and real-time data streams. Confluent makes Apache Kafka enterprise-ready using infinite Kafka storage with Amazon S3 and multiple private networking options including AWS PrivateLink, along with self-managed encryption keys for storage volume encryption with AWS Key Management Service (AWS KMS).
Kafka Connect & Kafka Streams/KSQL - the ecosystem around KafkaGuido Schmutz
After a quick overview and introduction of Apache Kafka, this session cover two components which extend the core of Apache Kafka: Kafka Connect and Kafka Streams/KSQL.
Kafka Connects role is to access data from the out-side-world and make it available inside Kafka by publishing it into a Kafka topic. On the other hand, Kafka Connect is also responsible to transport information from inside Kafka to the outside world, which could be a database or a file system. There are many existing connectors for different source and target systems available out-of-the-box, either provided by the community or by Confluent or other vendors. You simply configure these connectors and off you go.
Kafka Streams is a light-weight component which extends Kafka with stream processing functionality. By that, Kafka can now not only reliably and scalable transport events and messages through the Kafka broker but also analyse and process these event in real-time. Interestingly Kafka Streams does not provide its own cluster infrastructure and it is also not meant to run on a Kafka cluster. The idea is to run Kafka Streams where it makes sense, which can be inside a “normal” Java application, inside a Web container or on a more modern containerized (cloud) infrastructure, such as Mesos, Kubernetes or Docker. Kafka Streams has a lot of interesting features, such as reliable state handling, queryable state and much more. KSQL is a streaming engine for Apache Kafka, providing a simple and completely interactive SQL interface for processing data in Kafka.
Fast Insight from Fast Data: Integrating ClickHouse and Apache KafkaAltinity Ltd
Slides from webinar, January 21, 2020. Rober Hodges and Mikhail Filimonov, Altinity
Similar to Lessons Learned Building a Connector Using Kafka Connect (Katherine Stanley & Andrew Schofield, IBM United Kingdom) Kafka Summit NYC 2019 (20)
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...confluent
In our exclusive webinar, you'll learn why event-driven architecture is the key to unlocking cost efficiency, operational effectiveness, and profitability. Gain insights on how this approach differs from API-driven methods and why it's essential for your organization's success.
Unlocking the Power of IoT: A comprehensive approach to real-time insightsconfluent
In today's data-driven world, the Internet of Things (IoT) is revolutionizing industries and unlocking new possibilities. Join Data Reply, Confluent, and Imply as we unveil a comprehensive solution for IoT that harnesses the power of real-time insights.
Workshop híbrido: Stream Processing con Flinkconfluent
El Stream processing es un requisito previo de la pila de data streaming, que impulsa aplicaciones y pipelines en tiempo real.
Permite una mayor portabilidad de datos, una utilización optimizada de recursos y una mejor experiencia del cliente al procesar flujos de datos en tiempo real.
En nuestro taller práctico híbrido, aprenderás cómo filtrar, unir y enriquecer fácilmente datos en tiempo real dentro de Confluent Cloud utilizando nuestro servicio Flink sin servidor.
Industry 4.0: Building the Unified Namespace with Confluent, HiveMQ and Spark...confluent
Our talk will explore the transformative impact of integrating Confluent, HiveMQ, and SparkPlug in Industry 4.0, emphasizing the creation of a Unified Namespace.
In addition to the creation of a Unified Namespace, our webinar will also delve into Stream Governance and Scaling, highlighting how these aspects are crucial for managing complex data flows and ensuring robust, scalable IIoT-Platforms.
You will learn how to ensure data accuracy and reliability, expand your data processing capabilities, and optimize your data management processes.
Don't miss out on this opportunity to learn from industry experts and take your business to the next level.
La arquitectura impulsada por eventos (EDA) será el corazón del ecosistema de MAPFRE. Para seguir siendo competitivas, las empresas de hoy dependen cada vez más del análisis de datos en tiempo real, lo que les permite obtener información y tiempos de respuesta más rápidos. Los negocios con datos en tiempo real consisten en tomar conciencia de la situación, detectar y responder a lo que está sucediendo en el mundo ahora.
Eventos y Microservicios - Santander TechTalkconfluent
Durante esta sesión examinaremos cómo el mundo de los eventos y los microservicios se complementan y mejoran explorando cómo los patrones basados en eventos nos permiten descomponer monolitos de manera escalable, resiliente y desacoplada.
Purpose of the session is to have a dive into Apache, Kafka, Data Streaming and Kafka in the cloud
- Dive into Apache Kafka
- Data Streaming
- Kafka in the cloud
Build real-time streaming data pipelines to AWS with Confluentconfluent
Traditional data pipelines often face scalability issues and challenges related to cost, their monolithic design, and reliance on batch data processing. They also typically operate under the premise that all data needs to be stored in a single centralized data source before it's put to practical use. Confluent Cloud on Amazon Web Services (AWS) provides a fully managed cloud-native platform that helps you simplify the way you build real-time data flows using streaming data pipelines and Apache Kafka.
Q&A with Confluent Professional Services: Confluent Service Meshconfluent
No matter whether you are migrating your Kafka cluster to Confluent Cloud, running a cloud-hybrid environment or are in a different situation where data protection and encryption of sensitive information is required, Confluent Service Mesh allows you to transparently encrypt your data without the need to make code changes to you existing applications.
Citi Tech Talk: Event Driven Kafka Microservicesconfluent
Microservices have become a dominant architectural paradigm for building systems in the enterprise, but they are not without their tradeoffs. Learn how to build event-driven microservices with Apache Kafka
Confluent & GSI Webinars series - Session 3confluent
An in depth look at how Confluent is being used in the financial services industry. Gain an understanding of how organisations are utilising data in motion to solve common problems and gain benefits from their real time data capabilities.
It will look more deeply into some specific use cases and show how Confluent technology is used to manage costs and mitigate risks.
This session is aimed at Solutions Architects, Sales Engineers and Pre Sales, and also the more technically minded business aligned people. Whilst this is not a deeply technical session, a level of knowledge around Kafka would be helpful.
Transforming applications built with traditional messaging solutions such as TIBCO, MQ and Solace to be scalable, reliable and ready for the move to cloud
How can applications built with traditional messaging technologies like TIBCO, Solace and IBM MQ be modernised and be made cloud ready? What are the advantages to Event Streaming approaches to pub/sub vs traditional message queues? What are the strengeths and weaknesses of both approaches, and what use cases and requirements are actually a better fit for messaging than Kafka?
This session will show why the old paradigm does not work and that a new approach to the data strategy needs to be taken. It aims to show how a Data Streaming Platform is integral to the evolution of a company’s data strategy and how Confluent is not just an integration layer but the central nervous system for an organisation
Vous apprendrez également à :
• Créer plus rapidement des produits et fonctionnalités à l’aide d’une suite complète de connecteurs et d’outils de gestion des flux, et à connecter vos environnements à des pipelines de données
• Protéger vos données et charges de travail les plus critiques grâce à des garanties intégrées en matière de sécurité, de gouvernance et de résilience
• Déployer Kafka à grande échelle en quelques minutes tout en réduisant les coûts et la charge opérationnelle associés
Confluent Partner Tech Talk with Synthesisconfluent
A discussion on the arduous planning process, and deep dive into the design/architectural decisions.
Learn more about the networking, RBAC strategies, the automation, and the deployment plan.
Accelerate your Kubernetes clusters with Varnish CachingThijs Feryn
A presentation about the usage and availability of Varnish on Kubernetes. This talk explores the capabilities of Varnish caching and shows how to use the Varnish Helm chart to deploy it to Kubernetes.
This presentation was delivered at K8SUG Singapore. See https://feryn.eu/presentations/accelerate-your-kubernetes-clusters-with-varnish-caching-k8sug-singapore-28-2024 for more details.
Neuro-symbolic is not enough, we need neuro-*semantic*Frank van Harmelen
Neuro-symbolic (NeSy) AI is on the rise. However, simply machine learning on just any symbolic structure is not sufficient to really harvest the gains of NeSy. These will only be gained when the symbolic structures have an actual semantics. I give an operational definition of semantics as “predictable inference”.
All of this illustrated with link prediction over knowledge graphs, but the argument is general.
UiPath Test Automation using UiPath Test Suite series, part 4DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 4. In this session, we will cover Test Manager overview along with SAP heatmap.
The UiPath Test Manager overview with SAP heatmap webinar offers a concise yet comprehensive exploration of the role of a Test Manager within SAP environments, coupled with the utilization of heatmaps for effective testing strategies.
Participants will gain insights into the responsibilities, challenges, and best practices associated with test management in SAP projects. Additionally, the webinar delves into the significance of heatmaps as a visual aid for identifying testing priorities, areas of risk, and resource allocation within SAP landscapes. Through this session, attendees can expect to enhance their understanding of test management principles while learning practical approaches to optimize testing processes in SAP environments using heatmap visualization techniques
What will you get from this session?
1. Insights into SAP testing best practices
2. Heatmap utilization for testing
3. Optimization of testing processes
4. Demo
Topics covered:
Execution from the test manager
Orchestrator execution result
Defect reporting
SAP heatmap example with demo
Speaker:
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
State of ICS and IoT Cyber Threat Landscape Report 2024 previewPrayukth K V
The IoT and OT threat landscape report has been prepared by the Threat Research Team at Sectrio using data from Sectrio, cyber threat intelligence farming facilities spread across over 85 cities around the world. In addition, Sectrio also runs AI-based advanced threat and payload engagement facilities that serve as sinks to attract and engage sophisticated threat actors, and newer malware including new variants and latent threats that are at an earlier stage of development.
The latest edition of the OT/ICS and IoT security Threat Landscape Report 2024 also covers:
State of global ICS asset and network exposure
Sectoral targets and attacks as well as the cost of ransom
Global APT activity, AI usage, actor and tactic profiles, and implications
Rise in volumes of AI-powered cyberattacks
Major cyber events in 2024
Malware and malicious payload trends
Cyberattack types and targets
Vulnerability exploit attempts on CVEs
Attacks on counties – USA
Expansion of bot farms – how, where, and why
In-depth analysis of the cyber threat landscape across North America, South America, Europe, APAC, and the Middle East
Why are attacks on smart factories rising?
Cyber risk predictions
Axis of attacks – Europe
Systemic attacks in the Middle East
Download the full report from here:
https://sectrio.com/resources/ot-threat-landscape-reports/sectrio-releases-ot-ics-and-iot-security-threat-landscape-report-2024/
Transcript: Selling digital books in 2024: Insights from industry leaders - T...BookNet Canada
The publishing industry has been selling digital audiobooks and ebooks for over a decade and has found its groove. What’s changed? What has stayed the same? Where do we go from here? Join a group of leading sales peers from across the industry for a conversation about the lessons learned since the popularization of digital books, best practices, digital book supply chain management, and more.
Link to video recording: https://bnctechforum.ca/sessions/selling-digital-books-in-2024-insights-from-industry-leaders/
Presented by BookNet Canada on May 28, 2024, with support from the Department of Canadian Heritage.
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualityInflectra
In this insightful webinar, Inflectra explores how artificial intelligence (AI) is transforming software development and testing. Discover how AI-powered tools are revolutionizing every stage of the software development lifecycle (SDLC), from design and prototyping to testing, deployment, and monitoring.
Learn about:
• The Future of Testing: How AI is shifting testing towards verification, analysis, and higher-level skills, while reducing repetitive tasks.
• Test Automation: How AI-powered test case generation, optimization, and self-healing tests are making testing more efficient and effective.
• Visual Testing: Explore the emerging capabilities of AI in visual testing and how it's set to revolutionize UI verification.
• Inflectra's AI Solutions: See demonstrations of Inflectra's cutting-edge AI tools like the ChatGPT plugin and Azure Open AI platform, designed to streamline your testing process.
Whether you're a developer, tester, or QA professional, this webinar will give you valuable insights into how AI is shaping the future of software delivery.
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...DanBrown980551
Do you want to learn how to model and simulate an electrical network from scratch in under an hour?
Then welcome to this PowSyBl workshop, hosted by Rte, the French Transmission System Operator (TSO)!
During the webinar, you will discover the PowSyBl ecosystem as well as handle and study an electrical network through an interactive Python notebook.
PowSyBl is an open source project hosted by LF Energy, which offers a comprehensive set of features for electrical grid modelling and simulation. Among other advanced features, PowSyBl provides:
- A fully editable and extendable library for grid component modelling;
- Visualization tools to display your network;
- Grid simulation tools, such as power flows, security analyses (with or without remedial actions) and sensitivity analyses;
The framework is mostly written in Java, with a Python binding so that Python developers can access PowSyBl functionalities as well.
What you will learn during the webinar:
- For beginners: discover PowSyBl's functionalities through a quick general presentation and the notebook, without needing any expert coding skills;
- For advanced developers: master the skills to efficiently apply PowSyBl functionalities to your real-world scenarios.
GraphRAG is All You need? LLM & Knowledge GraphGuy Korland
Guy Korland, CEO and Co-founder of FalkorDB, will review two articles on the integration of language models with knowledge graphs.
1. Unifying Large Language Models and Knowledge Graphs: A Roadmap.
https://arxiv.org/abs/2306.08302
2. Microsoft Research's GraphRAG paper and a review paper on various uses of knowledge graphs:
https://www.microsoft.com/en-us/research/blog/graphrag-unlocking-llm-discovery-on-narrative-private-data/
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Albert Hoitingh
In this session I delve into the encryption technology used in Microsoft 365 and Microsoft Purview. Including the concepts of Customer Key and Double Key Encryption.
Generating a custom Ruby SDK for your web service or Rails API using Smithyg2nightmarescribd
Have you ever wanted a Ruby client API to communicate with your web service? Smithy is a protocol-agnostic language for defining services and SDKs. Smithy Ruby is an implementation of Smithy that generates a Ruby SDK using a Smithy model. In this talk, we will explore Smithy and Smithy Ruby to learn how to generate custom feature-rich SDKs that can communicate with any web service, such as a Rails JSON API.
Key Trends Shaping the Future of Infrastructure.pdfCheryl Hung
Keynote at DIGIT West Expo, Glasgow on 29 May 2024.
Cheryl Hung, ochery.com
Sr Director, Infrastructure Ecosystem, Arm.
The key trends across hardware, cloud and open-source; exploring how these areas are likely to mature and develop over the short and long-term, and then considering how organisations can position themselves to adapt and thrive.
Slide to show difference between Kafka and MQ: stream history vs reliable delivery
SourceConnector – import from other system
SinkConnector – export to other system
Run a cluster of worker process -> start them using the CLI
Then when you start a worker give it an idea and connectors that run will run on any worker and put tasks on any worker -> parallelism
# The converters control conversion of data between the internal Kafka Connect representation and the messages in Kafka.
key.converter=org.apache.kafka.connect.converters.ByteArrayConverter
key.converter=org.apache.kafka.connect.storage.StringConverter
key.converter=org.apache.kafka.connect.json.JsonConverter
value.converter=org.apache.kafka.connect.converters.ByteArrayConverter
value.converter=org.apache.kafka.connect.storage.StringConverter
value.converter=org.apache.kafka.connect.json.JsonConverter
# The converters control conversion of data between the internal Kafka Connect representation and the messages in Kafka.
key.converter=org.apache.kafka.connect.converters.ByteArrayConverter
key.converter=org.apache.kafka.connect.storage.StringConverter
key.converter=org.apache.kafka.connect.json.JsonConverter
value.converter=org.apache.kafka.connect.converters.ByteArrayConverter
value.converter=org.apache.kafka.connect.storage.StringConverter
value.converter=org.apache.kafka.connect.json.JsonConverter
# The converters control conversion of data between the internal Kafka Connect representation and the messages in Kafka.
key.converter=org.apache.kafka.connect.converters.ByteArrayConverter
key.converter=org.apache.kafka.connect.storage.StringConverter
key.converter=org.apache.kafka.connect.json.JsonConverter
value.converter=org.apache.kafka.connect.converters.ByteArrayConverter
value.converter=org.apache.kafka.connect.storage.StringConverter
value.converter=org.apache.kafka.connect.json.JsonConverter
# The converters control conversion of data between the internal Kafka Connect representation and the messages in Kafka.
key.converter=org.apache.kafka.connect.converters.ByteArrayConverter
key.converter=org.apache.kafka.connect.storage.StringConverter
key.converter=org.apache.kafka.connect.json.JsonConverter
value.converter=org.apache.kafka.connect.converters.ByteArrayConverter
value.converter=org.apache.kafka.connect.storage.StringConverter
value.converter=org.apache.kafka.connect.json.JsonConverter
# The converters control conversion of data between the internal Kafka Connect representation and the messages in Kafka.
key.converter=org.apache.kafka.connect.converters.ByteArrayConverter
key.converter=org.apache.kafka.connect.storage.StringConverter
key.converter=org.apache.kafka.connect.json.JsonConverter
value.converter=org.apache.kafka.connect.converters.ByteArrayConverter
value.converter=org.apache.kafka.connect.storage.StringConverter
value.converter=org.apache.kafka.connect.json.JsonConverter
# The converters control conversion of data between the internal Kafka Connect representation and the messages in Kafka.
key.converter=org.apache.kafka.connect.converters.ByteArrayConverter
key.converter=org.apache.kafka.connect.storage.StringConverter
key.converter=org.apache.kafka.connect.json.JsonConverter
value.converter=org.apache.kafka.connect.converters.ByteArrayConverter
value.converter=org.apache.kafka.connect.storage.StringConverter
value.converter=org.apache.kafka.connect.json.JsonConverter
# The converters control conversion of data between the internal Kafka Connect representation and the messages in Kafka.
key.converter=org.apache.kafka.connect.converters.ByteArrayConverter
key.converter=org.apache.kafka.connect.storage.StringConverter
key.converter=org.apache.kafka.connect.json.JsonConverter
value.converter=org.apache.kafka.connect.converters.ByteArrayConverter
value.converter=org.apache.kafka.connect.storage.StringConverter
value.converter=org.apache.kafka.connect.json.JsonConverter
You have the connector class which is used to connect to Kafka
The task class which does the processing of data into a format for Kafka
And then optional transformations
Start
- Parse config
- Only called on “clean” Connector
Start – initialize and one-time setup
Poll - Get new records from the third-party system, block if no data
Commit and CommitRecord – Optional methods to keep track of offsets internally
CommitRecord - Commit an individual SourceRecord when the callback from the producer client is received, or if a record is filtered by a transformation.
Put - Write records to third-party system
Flush - Optional method to prompt flushing all records that have been ‘put’
Provides scalability and reliability when connecting systems
Look out for existing connectors
Writing your own has some subtleties
Your external system’s API and Kafka Connect might not align