Whether you are migrating your Kafka cluster to Confluent Cloud, running a hybrid cloud environment, or facing any other situation that requires data protection and encryption of sensitive information, Confluent Service Mesh allows you to transparently encrypt your data without making code changes to your existing applications.
Introduction to Amazon Kinesis Firehose - AWS August Webinar Series | Amazon Web Services
Streaming data applications can deliver compelling, near real-time user experiences, but building the back-end infrastructure to collect and process streaming data is difficult. Amazon Kinesis Firehose makes it easy for you to load streaming data into AWS without having to build custom stream processing applications. In this webinar, we will introduce Amazon Kinesis Firehose and discuss how to ingest streaming data into Amazon S3, Amazon Redshift, and Amazon Elasticsearch Service using Amazon Kinesis Firehose. We will also highlight key use cases based on real-world examples from IoT, AdTech, E-Commerce, and Gaming.
Join us to:
- Get an introduction to streaming data and an overview of Amazon Kinesis Firehose
- Learn about common streaming data use cases from IoT, Ad Tech, E-Commerce, and Gaming
- Understand how to use Amazon Kinesis Firehose to load streaming data into Amazon S3, Amazon Redshift, and Amazon Elasticsearch Service
Who should attend: Developers, data analysts, data engineers, architects
Introducing Apache Kafka - a visual overview. Presented at the Canberra Big Data Meetup 7 February 2019. We build a Kafka "postal service" to explain the main Kafka concepts, and explain how consumers receive different messages depending on whether there's a key or not.
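The role of the key mentioned in this overview can be illustrated with a minimal sketch. It is not taken from the deck and is not Kafka client code: the class name `SketchPartitioner`, the partition count, and the use of CRC32 are assumptions chosen for illustration (Kafka's Java client hashes keys with murmur2, and newer clients use sticky batching rather than strict round-robin for keyless records). The point it shows is only that records with the same key always land on the same partition, while keyless records are spread across partitions.

```python
import itertools
import zlib
from typing import Optional


class SketchPartitioner:
    """Toy stand-in for a producer-side partitioner (hypothetical, not Kafka's code)."""

    def __init__(self, num_partitions: int) -> None:
        self.num_partitions = num_partitions
        # Round-robin cursor used only for records that carry no key.
        self._next_keyless = itertools.count()

    def partition(self, key: Optional[str]) -> int:
        if key is None:
            # No key: spread records evenly across all partitions.
            return next(self._next_keyless) % self.num_partitions
        # Key present: a stable hash maps equal keys to the same partition.
        # CRC32 is an assumption for this sketch; Kafka's Java client uses murmur2.
        return zlib.crc32(key.encode("utf-8")) % self.num_partitions


if __name__ == "__main__":
    p = SketchPartitioner(num_partitions=3)
    # The same key yields the same partition on every call.
    print([p.partition("customer-42") for _ in range(3)])
    # Keyless records cycle through partitions 0, 1, 2, 0, 1, 2.
    print([p.partition(None) for _ in range(6)])
```

Because a consumer in a group reads whole partitions, this mapping is why keyed messages for one entity arrive at a single consumer in order, whereas keyless messages can be seen by any consumer in the group.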
(BDT318) How Netflix Handles Up To 8 Million Events Per Second | Amazon Web Services
In this session, Netflix provides an overview of Keystone, their new data pipeline. The session covers how Netflix migrated from Suro to Keystone, including the reasons behind the transition and the challenges of zero loss while processing over 400 billion events daily. The session covers in detail how they deploy, operate, and scale Kafka, Samza, Docker, and Apache Mesos in AWS to manage 8 million events & 17 GB per second during peak.
Apache Kafka vs RabbitMQ: Fit For Purpose / Decision Tree | Slim Baltagi
Kafka as a streaming data platform is becoming the successor to traditional messaging systems such as RabbitMQ. Nevertheless, there are still some use cases where those systems could be a good fit. This single slide tries to answer, in a concise and unbiased way, where to use Apache Kafka and where to use RabbitMQ. Your comments and feedback are much appreciated.
Kafka and Confluent are nice, but what about integration with public clouds like Azure? Or, even better, integrating Kafka and Confluent with a managed API management service like Azure API Gateway?
In this talk I will show you how to integrate an event streaming platform like Confluent with enterprise API management and various other services to build a lambda-based data platform architecture.
Real-Life Use Cases & Architectures for Event Streaming with Apache Kafka | Kai Wähner
Streaming all over the World: Real-Life Use Cases & Architectures for Event Streaming with Apache Kafka.
Learn about various case studies for event streaming with Apache Kafka across industries. The talk explores architectures for real-world deployments from Audi, BMW, Disney, Generali, Paypal, Tesla, Unity, Walmart, William Hill, and more. Use cases include fraud detection, mainframe offloading, predictive maintenance, cybersecurity, edge computing, track&trace, live betting, and much more.
The Top 5 Apache Kafka Use Cases and Architectures in 2022 | Kai Wähner
I see the following topics coming up more regularly in conversations with customers, prospects, and the broader Kafka community across the globe:
Kappa Architecture: Kappa goes mainstream to replace Lambda and Batch pipelines (that does not mean that there is no batch processing anymore). Examples: Kafka-powered Kappa architectures from Uber, Disney, Shopify, and Twitter.
Hyper-personalized Omnichannel: Retail and customer communication across online and offline channels becomes the new black, including context-specific upselling, recommendations, and location-based services. Examples: Omnichannel Retail and Customer 360 in Real-Time with Apache Kafka.
Multi-Cloud Deployments: Business units and IT infrastructures span across regions, continents, and cloud providers. Linking clusters for bi-directional replication of data in real-time becomes crucial for many business models. Examples: Global Kafka deployments.
Edge Analytics: Low latency requirements, cost efficiency, or security requirements enforce the deployment of (some) event streaming use cases at the far edge (i.e., outside a data center), for instance, for predictive maintenance and quality assurance on the shop floor level in smart factories. Examples: Edge analytics with Kafka.
Real-time Cybersecurity: Situational awareness and threat intelligence need to process massive data in real-time to defend against cyberattacks successfully. The many successful ransomware attacks across the globe in 2021 were a warning for most CIOs. Examples: Cybersecurity for situational awareness and threat intelligence in real-time.
LDM Slides: Data Modeling for XML and JSON | DATAVERSITY
Data modeling has traditionally focused on relational database systems. But in the age of the internet, technologies such as XML and JSON have evolved to provide structure and definition to “data in motion”. Have data modeling technologies evolved to support these technologies? Can we use traditional approaches to model data in XML and JSON? Or are new tools and methodologies required? Join this webinar to discuss:
- XML & JSON vs. Relational Database Modeling
- Techniques & Tools for Data Modeling for XML
- Techniques & Tools for Data Modeling for JSON
- Use Cases & Opportunities for XML and JSON Data Modeling
Confluent Cloud Networking | Rajan Sundaram, Confluent | HostedbyConfluent
An introduction to the networking options available for self-serve provisioning of Confluent Kafka clusters in Confluent Cloud: VPC Peering, VNet Peering, Transit Gateway, and Private Link options across the AWS, GCP, and Azure networking offerings. The talk covers caveats of Confluent's cloud networking solutions that customers should be aware of, and details the two major pieces of the Confluent Cloud architecture: the data plane network and the control plane.
Azure and Kubernetes go together like peanut butter and jelly with Azure offering many options to host Kubernetes. In this session, we'll show you how to mix the Open Source tools you already use with the powerful Kubernetes hosting options on Azure. Take your deployment and orchestration to the next level!
Ranger’s pluggable architecture allows resource access policy administration and enforcement for standard and custom services from a “single pane of glass”. Apache Ranger has a rich authorization model, which provides the mechanism to author policy in a Ranger Admin Server and serves as the policy decision and audit point for authorizing users’ resource access within various components of the Hadoop ecosystem.
This session will provide a deep dive into the Ranger framework and a cookbook for extending Ranger to do authorization and auditing on resource access to external applications, including technical details of REST APIs, the Ranger policy engine, and enriching authorization requests, with a demo of a sample application. We will then demonstrate a real-world example of how Ranger has simplified security enforcement for a Hadoop-native MPP SQL engine like Apache HAWQ (incubating), which previously used its built-in Postgres-like authorization mechanisms. The integration design includes a Ranger Plugin Service that allows transparent authorization API calls between C-based Apache HAWQ and Java-based Apache Ranger.
Architecture patterns for distributed, hybrid, edge and global Apache Kafka d... | Kai Wähner
Architecture patterns for distributed, hybrid, edge and global Apache Kafka deployments
Multi-cluster and cross-data center deployments of Apache Kafka have become the norm rather than an exception. This session gives an overview of several scenarios that may require multi-cluster solutions and discusses real-world examples with their specific requirements and trade-offs, including disaster recovery, aggregation for analytics, cloud migration, mission-critical stretched deployments and global Kafka.
Key takeaways:
In many scenarios, one Kafka cluster is not enough. Understand different architectures and alternatives for multi-cluster deployments.
Zero data loss and high availability are two key requirements. Understand how to realize this, including trade-offs.
Learn about the features and limitations of Kafka for multi-cluster deployments.
Global Kafka and mission-critical multi-cluster deployments with zero data loss and high availability have become the norm, not the exception.
Building Reliable Data Lakes at Scale with Delta Lake | Databricks
Most data practitioners grapple with data reliability issues—it’s the bane of their existence. Data engineers, in particular, strive to design, deploy, and serve reliable data in a performant manner so that their organizations can make the most of their valuable corporate data assets.
Delta Lake is an open-source storage layer that brings ACID transactions to Apache Spark™ and big data workloads. Built on open standards, Delta Lake employs co-designed compute and storage and is compatible with Spark APIs. It powers high data reliability and query performance to support big data use cases, from batch and streaming ingests and fast interactive queries to machine learning. In this tutorial we will discuss the requirements of modern data engineering, the challenges data engineers face when it comes to data reliability and performance, and how Delta Lake can help. Through presentation, code examples, and notebooks, we will explain these challenges and the use of Delta Lake to address them. You will walk away with an understanding of how you can apply this innovation to your data architecture and the benefits you can gain.
This tutorial will be both an instructor-led and a hands-on interactive session. Instructions on how to get tutorial materials will be covered in class.
What you’ll learn:
Understand the key data reliability challenges
How Delta Lake brings reliability to data lakes at scale
Understand how Delta Lake fits within an Apache Spark™ environment
How to use Delta Lake to realize data reliability improvements
Prerequisites
A fully-charged laptop (8-16GB memory) with Chrome or Firefox
Pre-register for Databricks Community Edition
A stream processing platform is not an island unto itself; it must be connected to all of your existing data systems, applications, and sources. In this talk we will provide different options for integrating systems and applications with Apache Kafka, with a focus on the Kafka Connect framework and the ecosystem of Kafka connectors. We will discuss the intended use cases for Kafka Connect and share our experience and best practices for building large-scale data pipelines using Apache Kafka.
Autoscaling of workloads in the Kubernetes environment. A slide deck about Pod and Node autoscaling and the machinery that makes it happen, with a few recommendations for implementing Pod and Node autoscaling.
Day 5 - Real-time Data Processing/Internet of Things (IoT) with Amazon Kinesis | Amazon Web Services
Amazon Kinesis is a fully managed service for real-time processing of streaming data at massive scale. Amazon Kinesis can collect and process hundreds of terabytes of data per hour from hundreds of thousands of sources, allowing you to easily write applications that process information in real-time, from sources such as web site click-streams, marketing and financial information, manufacturing instrumentation and social media, and operational logs and metering data.
Reasons to attend:
- This session will provide you with an overview of Amazon Kinesis.
- Learn about sample use cases and real life case studies.
- Learn how Amazon Kinesis can be integrated into your own applications.
How we eased out security journey with OAuth (Goodbye Kerberos!) | Paul Makka... | HostedbyConfluent
Saxo Bank is on a growth journey and Kafka is a critical component to that success. Securing our financial event streams is a top priority for us and initially we started with an on-prem Kafka cluster secured with (the de-facto) Kerberos. However, as we modernize and scale, the demands of hybrid cloud, multiple domains, polyglot computing and Data Mesh require us to also modernize our approach to security. In this talk, we will describe how we took the default (non-production ready) Kafka OAuth implementation and productionized it to work with Kafka in Azure Cloud, including the Kafka stack and clients. By enabling both Kerberos and OAuth running on-prem and in the cloud, we now plan to gracefully retire Kerberos from our estate.
Hello, kafka! (an introduction to apache kafka) | Timothy Spann
Hello ApacheKafka
An Introduction to Apache Kafka with Timothy Spann and Carolyn Duby Cloudera Principal engineers.
We also demo Flink SQL, SMM, SSB, Schema Registry, Apache Kafka, Apache NiFi and Public Cloud - AWS.
An Introduction to Confluent Cloud: Apache Kafka as a Service | confluent
Business breakout during Confluent’s streaming event in Munich, presented by Hans Jespersen, VP WW Systems Engineering at Confluent. This three-day hands-on course focused on how to build, manage, and monitor clusters using industry best-practices developed by the world’s foremost Apache Kafka™ experts. The sessions focused on how Kafka and the Confluent Platform work, how their main subsystems interact, and how to set up, manage, monitor, and tune your cluster.
Presentation on Data Mesh: the paradigm shift is a new type of ecosystem architecture, a shift left toward a modern distributed architecture that allows for domain-specific data and views “data-as-a-product,” enabling each domain to handle its own data pipelines.
What is Apache Kafka and What is an Event Streaming Platform? | confluent
Speaker: Gabriel Schenker, Lead Curriculum Developer, Confluent
Streaming platforms have emerged as a popular, new trend, but what exactly is a streaming platform? Part messaging system, part Hadoop made fast, part fast ETL and scalable data integration. With Apache Kafka® at the core, event streaming platforms offer an entirely new perspective on managing the flow of data. This talk will explain what an event streaming platform such as Apache Kafka is and some of the use cases and design patterns around its use—including several examples of where it is solving real business problems. New developments in this area such as KSQL will also be discussed.
Scaling up uber's real time data analytics | Xiang Fu
Realtime infrastructure powers critical pieces of Uber. This talk will discuss the architecture, technical challenges, learnings and how a blend of open source infrastructure (Apache Kafka/Flink/Pinot) and in-house technologies have helped Uber scale and enabled SQL to power realtime decision making for city ops, data scientists, data analysts and engineers.
Ditching the overhead - Moving Apache Kafka workloads into Amazon MSK - ADB30... | Amazon Web Services
Apache Kafka is a popular stream-processing platform, but it’s no secret that it can be tough to set up, manage, and scale. Amazon Managed Streaming for Kafka (Amazon MSK) can help remove some of that toil for you. In this session, you learn about new Amazon MSK features and capabilities. You also get a glimpse under the hood, giving you a better understanding of how Amazon MSK operationalizes Apache Kafka so you don't have to. We compare and contrast Amazon Kinesis Data Streams and Apache Kafka (with/without MSK) and show how to lift-and-shift your workload into Amazon MSK with minimal downtime.
Vladimir Rodionov (Hortonworks)
Time-series applications (sensor data, application/system logging events, user interactions, etc.) present a new set of data storage challenges: very high velocity and very high volume of data. This talk will present the recent developments in Apache HBase that make it a good fit for time-series applications.
In this presentation, we show how Data Reply helped an Austrian fintech customer to overcome previous performance limitations in their data analytics landscape, leverage real-time pipelines, break down monoliths, and foster a self-service data culture to enable new event-driven and business-critical use cases.
(NET303) Optimizing Your Cloud Architecture With Network Strategy | Amazon Web Services
In this session, explore three benefits of private, dedicated network connections to AWS. Learn how you can transport business-critical data directly from your data center, office, or colocation environment into and from AWS over dedicated network connections. Discover how to dynamically scale your bandwidth up to 300 percent, only paying for what you use, and how to use dynamic scaling to speed up backups, temporary or scheduled workloads, moving from test to live production, and new product launches. Also, learn how to use private network connectivity to help build hybrid environments in situations where security and compliance are critical. Hybrid environments let you extend your private on-premises infrastructure with the elasticity and economic benefits of AWS. Session sponsored by Level 3.
In this session, learn how you evaluate, design, build, and manage distributed applications over hybrid infrastructures using Amazon Web Services. This session follows the evolution of a simple legacy data center expansion with basic connectivity into managing complex hybrid applications. Along the way, we investigate best practice designs in use by AWS customers. Topics covered include interconnectivity, availability, security, and hybrid networks with Amazon VPC and AWS Direct Connect, as well as automated provisioning with AWS CloudFormation and configuration management with AWS OpsWorks.
Kaleido Platform Overview and Full-stack Blockchain Services | Peter Broadhurst
Overview of the Kaleido Platform, and one-slide summaries of the Kaleido services.
Learn more about our full-stack services at:
https://marketplace.kaleido.io
Get started today at:
https://console.kaleido.io
Access our docs at:
https://docs.kaleido.io
(NET208) Enable & Secure Your Business Apps via the Hybrid Cloud on AWS | Amazon Web Services
Learn how to enable and support data migrations in AWS and keep your business applications highly secure, whether you are migrating your IT infrastructure to the cloud, migrating your business applications to the cloud, or simply moving traffic on AWS between different Availability Zones. Our real-world use cases include securing your critical business applications in AWS by deploying vSRX as a perimeter firewall for VPC instances, and enabling secure transport and routing for hybrid cloud deployments using IPSec VPNs on vMX. Session sponsored by Juniper Networks.
Bridge to Cloud: Using Apache Kafka to Migrate to AWS | confluent
Watch this talk here: https://www.confluent.io/online-talks/bridge-to-cloud-apache-kafka-migrate-aws
Speakers: Priya Shivakumar, Director of Product, Confluent + Konstantine Karantasis, Software Engineer, Confluent + Rohit Pujari, Partner Solutions Architect, AWS
Most companies start their cloud journey with a new use case, or a new application. Sometimes these applications can run independently in the cloud, but often times they need data from the on premises datacenter. Existing applications will slowly migrate, but will need a strategy and the technology to enable a multi-year migration.
In this session, we will share how companies around the world are using Confluent Cloud, a fully managed Apache Kafka service, to migrate to AWS. By implementing a central-pipeline architecture using Apache Kafka to sync on-prem and cloud deployments, companies can accelerate migration times and reduce costs.
In this online talk we will cover:
• How to take the first step in migrating to AWS
• How to reliably sync your on premises applications using a persistent bridge to cloud
• Learn how Confluent Cloud can make this daunting task simple, reliable and performant
• See a demo of the hybrid-cloud and multi-region deployment of Apache Kafka
How a National Transportation Software Provider Migrated a Mission-Critical T... | Amazon Web Services
In this webinar, Cascadeo will show you how they helped a national transportation software provider build an AWS architecture that enables them to effectively support more than 3,300 complex integration tests against nightly builds of their Interoperable Train Control Messaging (ITCM) application. You’ll also learn about how this software provider can scale on-demand, has improved governance and cost management, and rapidly supports new projects without increasing IT overhead using AWS.
As cloud computing continues to gain popularity, companies that are natively Windows question if they too can leverage AWS. Learn about the benefits the cloud provides, best practices of cloud computing services, and solutions available on AWS for Windows workloads. Learn how Covanta is delivering services to its users 90% faster and saving more than 60% in IT infrastructure costs after migrating its Windows workloads to the cloud.
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente... | confluent
In our exclusive webinar, you'll learn why event-driven architecture is the key to unlocking cost efficiency, operational effectiveness, and profitability. Gain insights on how this approach differs from API-driven methods and why it's essential for your organization's success.
Unlocking the Power of IoT: A comprehensive approach to real-time insights | confluent
In today's data-driven world, the Internet of Things (IoT) is revolutionizing industries and unlocking new possibilities. Join Data Reply, Confluent, and Imply as we unveil a comprehensive solution for IoT that harnesses the power of real-time insights.
Hybrid workshop: Stream Processing with Flink | confluent
Stream processing is a prerequisite of the data streaming stack, powering real-time applications and pipelines.
It enables greater data portability, optimized resource utilization, and a better customer experience by processing data streams in real time.
In our hands-on hybrid workshop, you will learn how to easily filter, join, and enrich real-time data within Confluent Cloud using our serverless Flink service.
Industry 4.0: Building the Unified Namespace with Confluent, HiveMQ and Spark... | confluent
Our talk will explore the transformative impact of integrating Confluent, HiveMQ, and SparkPlug in Industry 4.0, emphasizing the creation of a Unified Namespace.
In addition to the creation of a Unified Namespace, our webinar will also delve into Stream Governance and Scaling, highlighting how these aspects are crucial for managing complex data flows and ensuring robust, scalable IIoT-Platforms.
You will learn how to ensure data accuracy and reliability, expand your data processing capabilities, and optimize your data management processes.
Don't miss out on this opportunity to learn from industry experts and take your business to the next level.
Event-driven architecture (EDA) will be the heart of the MAPFRE ecosystem. To remain competitive, today's companies increasingly depend on real-time data analysis, which gives them faster insights and response times. Doing business with real-time data means gaining situational awareness, and detecting and responding to what is happening in the world right now.
Events and Microservices - Santander TechTalk | confluent
In this session we will examine how the worlds of events and microservices complement and improve each other, exploring how event-based patterns let us decompose monoliths in a scalable, resilient, and decoupled way.
The purpose of the session is to dive into Apache Kafka, data streaming, and Kafka in the cloud:
- Dive into Apache Kafka
- Data Streaming
- Kafka in the cloud
Build real-time streaming data pipelines to AWS with Confluent | confluent
Traditional data pipelines often face scalability issues and challenges related to cost, their monolithic design, and reliance on batch data processing. They also typically operate under the premise that all data needs to be stored in a single centralized data source before it's put to practical use. Confluent Cloud on Amazon Web Services (AWS) provides a fully managed cloud-native platform that helps you simplify the way you build real-time data flows using streaming data pipelines and Apache Kafka.
Citi Tech Talk: Event Driven Kafka Microservices | confluent
Microservices have become a dominant architectural paradigm for building systems in the enterprise, but they are not without their tradeoffs. Learn how to build event-driven microservices with Apache Kafka.
Confluent & GSI Webinars series - Session 3 | confluent
An in-depth look at how Confluent is being used in the financial services industry. Gain an understanding of how organisations are utilising data in motion to solve common problems and gain benefits from their real-time data capabilities.
It will look more deeply into some specific use cases and show how Confluent technology is used to manage costs and mitigate risks.
This session is aimed at Solutions Architects, Sales Engineers and Pre Sales, and also the more technically minded business aligned people. Whilst this is not a deeply technical session, a level of knowledge around Kafka would be helpful.
Transforming applications built with traditional messaging solutions such as TIBCO, MQ and Solace to be scalable, reliable and ready for the move to cloud
How can applications built with traditional messaging technologies like TIBCO, Solace and IBM MQ be modernised and made cloud ready? What are the advantages of event streaming approaches to pub/sub over traditional message queues? What are the strengths and weaknesses of both approaches, and what use cases and requirements are actually a better fit for messaging than Kafka?
This session will show why the old paradigm does not work and that a new approach to the data strategy needs to be taken. It aims to show how a Data Streaming Platform is integral to the evolution of a company’s data strategy and how Confluent is not just an integration layer but the central nervous system for an organisation.
You will also learn how to:
• Build products and features faster using a complete suite of connectors and stream management tools, and connect your environments to data pipelines
• Protect your most critical data and workloads with built-in security, governance, and resilience guarantees
• Deploy Kafka at scale in minutes while reducing the associated costs and operational burden
Confluent Partner Tech Talk with Synthesis | confluent
A discussion on the arduous planning process, and deep dive into the design/architectural decisions.
Learn more about the networking, RBAC strategies, the automation, and the deployment plan.
Large Language Models and the End of Programming | Matt Welsh
Talk by Matt Welsh at Craft Conference 2024 on the impact that Large Language Models will have on the future of software development. In this talk, I discuss the ways in which LLMs will impact the software industry, from replacing human software developers with AI, to replacing conventional software with models that perform reasoning, computation, and problem-solving.
Check out the webinar slides to learn more about how XfilesPro transforms Salesforce document management by leveraging its world-class applications. For more details, please connect with sales@xfilespro.com
If you want to watch the on-demand webinar, please click here: https://www.xfilespro.com/webinars/salesforce-document-management-2-0-smarter-faster-better/
Quarkus Hidden and Forbidden Extensions | Max Andersen
Quarkus has a vast extension ecosystem and is known for its subsonic and subatomic feature set. Some of these features are not as well known, and some extensions are less talked about, but that does not make them less interesting - quite the opposite.
Come join this talk to see some tips and tricks for using Quarkus and some of the lesser known features, extensions and development techniques.
Modern design is crucial in today's digital environment, and this is especially true for SharePoint intranets. The design of these digital hubs is critical to user engagement and productivity enhancement. They are the cornerstone of internal collaboration and interaction within enterprises.
Advanced Flow Concepts Every Developer Should Know | Peter Caitens
Tim Combridge from Sensible Giraffe and Salesforce Ben presents some important tips that all developers should know when dealing with Flows in Salesforce.
Cyaniclab : Software Development Agency Portfolio.pdf | Cyanic lab
CyanicLab, an offshore custom software development company based in Sweden, India, and Finland, is your go-to partner for startup development and innovative web design solutions. Our expert team specializes in crafting cutting-edge software tailored to meet the unique needs of startups and established enterprises alike. From conceptualization to execution, we offer comprehensive services including web and mobile app development, UI/UX design, and ongoing software maintenance. Ready to elevate your business? Contact CyanicLab today and let us propel your vision to success with our top-notch IT solutions.
Field Employee Tracking System| MiTrack App| Best Employee Tracking Solution|... | informapgpstrackings
Keep tabs on your field staff effortlessly with Informap Technology Centre LLC. Real-time tracking, task assignment, and smart features for efficient management. Request a live demo today!
For more details, visit us : https://informapuae.com/field-staff-tracking/
SOCRadar Research Team: Latest Activities of IntelBroker | SOCRadar
The European Union Agency for Law Enforcement Cooperation (Europol) has suffered an alleged data breach after a notorious threat actor claimed to have exfiltrated data from its systems. Infamous data leaker IntelBroker posted on the even more infamous BreachForums hacking forum, saying that Europol suffered a data breach this month.
The alleged breach affected Europol agencies CCSE, EC3, Europol Platform for Experts, Law Enforcement Forum, and SIRIUS. Infiltration of these entities can disrupt ongoing investigations and compromise sensitive intelligence shared among international law enforcement agencies.
However, this is neither the first nor the last activity of IntelBroker. We have compiled for you what happened in the last few days. To track such hacker activities on dark web sources like hacker forums, private Telegram channels, and other hidden platforms where cyber threats often originate, you can check SOCRadar’s Dark Web News.
Stay Informed on Threat Actors’ Activity on the Dark Web with SOCRadar!
Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart... | Globus
The Earth System Grid Federation (ESGF) is a global network of data servers that archives and distributes the planet’s largest collection of Earth system model output for thousands of climate and environmental scientists worldwide. Many of these petabyte-scale data archives are located in proximity to large high-performance computing (HPC) or cloud computing resources, but the primary workflow for data users consists of transferring data, and applying computations on a different system. As a part of the ESGF 2.0 US project (funded by the United States Department of Energy Office of Science), we developed pre-defined data workflows, which can be run on-demand, capable of applying many data reduction and data analysis to the large ESGF data archives, transferring only the resultant analysis (ex. visualizations, smaller data files). In this talk, we will showcase a few of these workflows, highlighting how Globus Flows can be used for petabyte-scale climate analysis.
Listen to the keynote address and hear about the latest developments from Rachana Ananthakrishnan and Ian Foster who review the updates to the Globus Platform and Service, and the relevance of Globus to the scientific community as an automation platform to accelerate scientific discovery.
Multiple Your Crypto Portfolio with the Innovative Features of Advanced Crypt... | Hivelance Technology
Cryptocurrency trading bots are computer programs designed to automate buying, selling, and managing cryptocurrency transactions. These bots utilize advanced algorithms and machine learning techniques to analyze market data, identify trading opportunities, and execute trades on behalf of their users. By automating the decision-making process, crypto trading bots can react to market changes faster than human traders.
Hivelance, a leading provider of cryptocurrency trading bot development services, stands out as the premier choice for crypto traders and developers. Hivelance boasts a team of seasoned cryptocurrency experts and software engineers who deeply understand the crypto market and the latest trends in automated trading. Hivelance leverages the latest technologies and tools in the industry, including advanced AI and machine learning algorithms, to create highly efficient and adaptable crypto trading bots.
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ... | Juraj Vysvader
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc. I didn't get rich from it, but the extensions did have 63K downloads (possibly powering tens of thousands of websites).
How Recreation Management Software Can Streamline Your Operations.pptx | wottaspaceseo
Recreation management software streamlines operations by automating key tasks such as scheduling, registration, and payment processing, reducing manual workload and errors. It provides centralized management of facilities, classes, and events, ensuring efficient resource allocation and facility usage. The software offers user-friendly online portals for easy access to bookings and program information, enhancing customer experience. Real-time reporting and data analytics deliver insights into attendance and preferences, aiding in strategic decision-making. Additionally, effective communication tools keep participants and staff informed with timely updates. Overall, recreation management software enhances efficiency, improves service delivery, and boosts customer satisfaction.
Why React Native as a Strategic Advantage for Startup Innovation.pdf | ayushiqss
Do you know that React Native is being increasingly adopted by startups as well as big companies in the mobile app development industry? Big names like Facebook, Instagram, and Pinterest have already integrated this robust open-source framework.
In fact, according to a report by Statista, the number of React Native developers has been steadily increasing over the years, reaching an estimated 1.9 million by the end of 2024. This means that the demand for this framework in the job market has been growing making it a valuable skill.
But what makes React Native so popular for mobile application development? It offers excellent cross-platform capabilities among other benefits. This way, with React Native, developers can write code once and run it on both iOS and Android devices thus saving time and resources leading to shorter development cycles hence faster time-to-market for your app.
Let’s take the example of a startup, which wanted to release their app on both iOS and Android at once. Through the use of React Native they managed to create an app and bring it into the market within a very short period. This helped them gain an advantage over their competitors because they had access to a large user base who were able to generate revenue quickly for them.
Your Digital Assistant.
Making a complex approach simple. A straightforward process saves time. No more waiting to connect with the people who matter to you. Safety first is not a cliché: information is securely protected in cloud storage to prevent any third party from accessing data.
Would you rather burden your visitors by making them wait, or choose VizMan for a stress-free experience? VizMan is an automated visitor management system that works for any industry, including but not limited to factories, societies, government institutes, and warehouses. It is a new-age, contactless way of logging information about visitors, employees, packages, and vehicles. VizMan is a digital logbook, so it avoids unnecessary use of paper or space: there is no need for bundles of registers left to collect dust in a corner of a room. It records visitors’ essential details, helps schedule meetings for visitors and employees, and assists in supervising employee attendance. With VizMan, visitors don’t need to wait for hours in long queues. VizMan handles visitors with the value they deserve because we know time is important to you.
Feasible Features
One Subscription, Four Modules – Admin, Employee, Receptionist, and Gatekeeper ensures confidentiality and prevents data from being manipulated
User Friendly – can be easily used on Android, iOS, and Web Interface
Multiple Accessibility – Log in through any device from any place at any time
One app for all industries – a Visitor Management System that works for any organisation.
Stress-free Sign-up
Visitor is registered and checked-in by the Receptionist
Host gets a notification, where they opt to Approve the meeting
Host notifies the Receptionist of the end of the meeting
Visitor is checked-out by the Receptionist
Host enters notes and remarks of the meeting
Customizable Components
Scheduling Meetings – Host can invite visitors for meetings and also approve, reject and reschedule meetings
Single/Bulk invites – Invitations can be sent individually to a visitor or collectively to many visitors
VIP Visitors – Additional security of data for VIP visitors to avoid misuse of information
Courier Management – Keeps a check on deliveries like commodities being delivered in and out of establishments
Alerts & Notifications – Get notified on SMS, email, and application
Parking Management – Manage availability of parking space
Individual log-in – Every user has their own log-in id
Visitor/Meeting Analytics – Evaluate notes and remarks of the meeting stored in the system
Visitor Management System is a secure and user-friendly database manager that records, filters, and tracks the visitors to your organization.
"Secure Your Premises with VizMan (VMS) – Get It Now"
Exploring Innovations in Data Repository Solutions - Insights from the U.S. G... | Globus
The U.S. Geological Survey (USGS) has made substantial investments in meeting evolving scientific, technical, and policy driven demands on storing, managing, and delivering data. As these demands continue to grow in complexity and scale, the USGS must continue to explore innovative solutions to improve its management, curation, sharing, delivering, and preservation approaches for large-scale research data. Supporting these needs, the USGS has partnered with the University of Chicago-Globus to research and develop advanced repository components and workflows leveraging its current investment in Globus. The primary outcome of this partnership includes the development of a prototype enterprise repository, driven by USGS Data Release requirements, through exploration and implementation of the entire suite of the Globus platform offerings, including Globus Flow, Globus Auth, Globus Transfer, and Globus Search. This presentation will provide insights into this research partnership, introduce the unique requirements and challenges being addressed and provide relevant project progress.
Experience our free, in-depth three-part Tendenci Platform Corporate Membership Management workshop series! In Session 1 on May 14th, 2024, we began with an Introduction and Setup, mastering the configuration of your Corporate Membership Module settings to establish membership types, applications, and more. Then, on May 16th, 2024, in Session 2, we focused on binding individual members to a Corporate Membership and Corporate Reps, teaching you how to add individual members and assign Corporate Representatives to manage dues, renewals, and associated members. Finally, on May 28th, 2024, in Session 3, we covered questions and concerns, addressing any queries or issues you may have.
For more Tendenci AMS events, check out www.tendenci.com/events
Understanding Globus Data Transfers with NetSage | Globus
NetSage is an open privacy-aware network measurement, analysis, and visualization service designed to help end-users visualize and reason about large data transfers. NetSage traditionally has used a combination of passive measurements, including SNMP and flow data, as well as active measurements, mainly perfSONAR, to provide longitudinal network performance data visualization. It has been deployed by dozens of networks world wide, and is supported domestically by the Engagement and Performance Operations Center (EPOC), NSF #2328479. We have recently expanded the NetSage data sources to include logs for Globus data transfers, following the same privacy-preserving approach as for Flow data. Using the logs for the Texas Advanced Computing Center (TACC) as an example, this talk will walk through several different example use cases that NetSage can answer, including: Who is using Globus to share data with my institution, and what kind of performance are they able to achieve? How many transfers has Globus supported for us? Which sites are we sharing the most data with, and how is that changing over time? How is my site using Globus to move data internally, and what kind of performance do we see for those transfers? What percentage of data transfers at my institution used Globus, and how did the overall data transfer performance compare to the Globus users?
4. @yourtwitterhandle | developer.confluent.io
Our Partner Technical Enablement offering
Scheduled sessions
Join us for these live sessions, where our experts will guide you through sessions of different levels and will be available to answer your questions. Some examples of sessions are below:
• Confluent 101: for new starters
• Hybrid Cloud Workshop: learn by doing
• Path to Production series, Confluent Cloud workshops series
• Product Updates
On-demand
Learn the basics with a guided experience, at your own pace, with our on-demand learning paths. You will also find an ever-growing repository of more advanced presentations to dig deeper. Some examples are below:
• Aware/Novice/Competent Learning paths
• Confluent Use Cases
• Positioning Confluent Value
• Confluent Cloud Networking
• … and many more
AskTheExpert
We’ll offer a channel dedicated to streaming questions
• Build CoE inside partners by getting people with similar interest together
• Connect with opportunities and discover trends at focus partners
• Build a Technical Community
• Q&A
• Tech Talk
10. The Confluent Q3 ‘23 Launch
Deliver Intelligent, Secure, and Cost-effective Data Pipelines
Cloud-Native | Complete | Everywhere
Storage Price Reduction: Cost-effectively store data at any scale without growing compute at 20% lower prices
CC for Apache Flink® (Open Preview): Easily build high-quality, reusable data streams with the industry’s only cloud-native, serverless Flink service
Enterprise Clusters: Secure, cost-effective, and serverless Kafka powered by the Kora Engine
Confluent Terraform Provider updates: Enhance security and compliance while continuing to reduce operational burden through automated infrastructure management
• HashiCorp Sentinel Integration
• Resource Importer
• Data Catalog Support
Cloud Audit Logs for Kafka Produce & Consume: Experience full visibility and control of sensitive data access in Confluent Cloud with detailed audit events enabling swift response to unauthorized access.
Cluster Linking updates
• Cluster Linking with AWS Private Link: Easily stream data between regions, teams or environments within AWS private networks
• Bi-directional Cluster Linking: Optimize disaster recovery and increase reliability with bi-directional cluster linking
Data Portal in Stream Governance (coming soon): Safely unlock data and increase developer productivity with a self-service, data-centric portal for discovering, accessing, and enriching real-time data streams flowing across your organization
11. Data Portal in Stream Governance
Seamlessly and securely request access to data streams and trigger an approval workflow that connects the user with the data owner, all within the Confluent Cloud UI
Easily build and manage data products to power streaming pipelines and applications by understanding, accessing, and enriching existing data streams
Complete
Safely unlock data and increase developer productivity with a self-service, data-centric portal for discovering, accessing, and enriching real-time data streams flowing across your organization
Search, discover, and explore existing topics, tags, and metadata across the organization with end-to-end visibility to choose the data most relevant for your projects
Coming Soon
12. Introducing Data Portal in Stream Governance
Access your data streams through a developer-friendly, self-service UI
• Search, discover, and explore existing topics, tags, and metadata across the organization
• Seamlessly request access to data streams and trigger an approval workflow
• Understand, access, & enrich data streams to power real-time data streaming pipelines and applications
13. Bidirectional Cluster Linking
Optimize disaster recovery and increase reliability with bi-directional cluster linking
Facilitate seamless consumer migration with retained offsets for consistent data processing with bi-directional cluster links
Increase efficiency and reduce data recovery time by eliminating the need for custom code
Streamline security configuration with support for DR and active/active architectures, using bi-directional links that provide outbound and inbound connections
Everywhere
**Note - bi-directional cluster linking is available for new cluster links only; existing cluster links need to be deleted and re-activated to obtain this functionality.
14. Enhanced Disaster Recovery Capabilities with Bidirectional Cluster Linking
[Diagram: Cluster A and Cluster B joined by a bidirectional cluster link. Labels: Connection and Authentication; Data & Metadata (both directions); Topics on Cluster A and Mirror Topics on Cluster B; Topics on Cluster B and Mirror Topics on Cluster A; API Key or OAuth for Cluster A; API Key or OAuth for Cluster B; ACLs / RBAC for Cluster A; ACLs / RBAC for Cluster B; Applications in region A; Applications in region B.]
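The bidirectional mirroring pictured on this slide can be illustrated with a small in-memory model. This is only a conceptual sketch, not the Confluent Cluster Linking API: the `Cluster` class, the `sync` function, and the topic names are all hypothetical, and it assumes that a mirror topic keeps records in the same order as its source so that offsets line up on both sides.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Toy stand-in for a Kafka cluster (hypothetical, for illustration only)."""
    name: str
    # Writable topics owned by this cluster: topic name -> ordered records.
    topics: dict[str, list[str]] = field(default_factory=dict)
    # Read-only copies of the other cluster's topics.
    mirrors: dict[str, list[str]] = field(default_factory=dict)

    def produce(self, topic: str, value: str) -> int:
        """Append a record to a local topic and return its offset."""
        log = self.topics.setdefault(topic, [])
        log.append(value)
        return len(log) - 1


def sync(a: Cluster, b: Cluster) -> None:
    """One pass of an assumed bidirectional link: each side's writable topics
    are copied to the other side as mirror topics, preserving record order."""
    for source, dest in ((a, b), (b, a)):
        for topic, log in source.topics.items():
            dest.mirrors[topic] = list(log)  # same order, so same offsets


cluster_a = Cluster("A")
cluster_b = Cluster("B")
cluster_a.produce("orders", "o-1")
cluster_a.produce("orders", "o-2")
cluster_b.produce("payments", "p-1")
sync(cluster_a, cluster_b)

# A consumer whose next offset to read on Cluster A was 1 can resume at the
# same offset on Cluster B's mirror of the topic.
next_offset = 1
resumed_record = cluster_b.mirrors["orders"][next_offset]
```

In this model each cluster stays the only writer for its own topics, while the mirrors give applications in the other region a copy to fail over to without translating offsets.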
15. Cluster Linking with AWS Private Link
Simplified setup: Utilize Network Link Service and Endpoint for a reliable connection between clusters
Enhanced network-level security: AWS PrivateLink isolates Confluent Cloud clusters, preventing external resources and Cluster Linking access
Seamless cluster linking: Establish a secure networking path between separate Confluent Cloud networks for efficient data exchange
Everywhere
Easily stream data between regions, teams or environments within AWS private networks
16. The Confluent Q3 ‘23 Launch
Deliver Intelligent, Secure, and Cost-effective Data Pipelines
Cloud-Native | Complete | Everywhere
Storage Price Reduction: Cost-effectively store data at any scale without growing compute at 20% lower prices
Apache Flink® on CC (Open Preview): Easily build high-quality, reusable data streams with the industry’s only cloud-native, serverless Flink service
Enterprise Clusters: Secure, cost-effective, and serverless Kafka powered by the Kora Engine
Confluent Terraform Provider updates: Enhance security and compliance while continuing to reduce operational burden through automated infrastructure management
• HashiCorp Sentinel Integration
• Resource Importer
• Data Catalog Support
Cloud Audit Logs for Kafka Produce & Consume: Experience full visibility and control of sensitive data access in Confluent Cloud with detailed audit events enabling swift response to unauthorized access.
Cluster Linking updates
• Cluster Linking with AWS Private Link: Easily stream data between regions, teams or environments within AWS private networks
• Bi-directional Cluster Linking: Optimize disaster recovery and increase reliability with bi-directional cluster linking
Data Portal in Stream Governance (coming soon): Safely unlock data and increase developer productivity with a self-service, data-centric portal for discovering, accessing, and enriching real-time data streams flowing across your organization
20. “A service mesh is a tool for adding observability, security, and reliability features to “cloud native” applications by transparently inserting this functionality at the platform layer rather than the application layer. The service mesh is rapidly becoming a standard part of the cloud native stack, especially for Kubernetes adopters.”
-linkerd.io
31. End-to-end Encryption Features
• Local key management and JKS support
• Gemalto, HashiCorp, many security appliances
• Cloud provider key management service support
• AES, RSA encryption, SHA256 hashing
• AVRO, JSON, Protobuf, XML, String, Byte arrays, Byte buffer level encryption and tokenization
• Field access control
• Format preserving encryption (NIST SP 800-38G)
• Support for metadata and data classification
• Support for master keys (encryption of a data key with a wrapping key)
• Support for key rotation
• Support for digital signatures on events to validate producers
[Diagram: Producer and Consumer exchanging protected data, with KMS/Tokenizer and Schema Registry]
35. Key Exchange Process
Secured Serializer (encryption):
- Get Master Key from Key Store/KMS
- Get Data Key
- Encrypt Event (use data key for encryption)
- Encrypt Data Key (use master key for encryption)
- Send encrypted event and encrypted data key to the Kafka Broker
Secured Deserializer (decryption):
- Fetch Events from the Kafka Broker
- Get Master Key from Key Store/KMS
- Decrypt Data Key (use master key for decryption)
- Decrypt Event (use decrypted data key for decryption)
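The key exchange on this slide follows the envelope-encryption pattern, which can be sketched in a few lines. This is a minimal illustration under loud assumptions: `ToyKms`, `secured_serialize`, `secured_deserialize` and the record layout are hypothetical names, the data key is generated locally, and the SHA-256 keystream cipher is a toy stand-in for AES so that the example runs with the Python standard library only. It is not secure and is not how the Confluent libraries implement the flow.

```python
import hashlib
import secrets


def _keystream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Toy cipher: XOR data with SHA-256(key || nonce || counter) blocks.

    NOT secure. It only stands in for a real cipher such as AES.
    """
    out = bytearray()
    for offset in range(0, len(data), 32):
        counter = (offset // 32).to_bytes(8, "big")
        pad = hashlib.sha256(key + nonce + counter).digest()
        chunk = data[offset:offset + 32]
        out.extend(b ^ p for b, p in zip(chunk, pad))
    return bytes(out)


def toy_encrypt(key: bytes, plaintext: bytes) -> bytes:
    nonce = secrets.token_bytes(16)
    return nonce + _keystream_xor(key, nonce, plaintext)


def toy_decrypt(key: bytes, blob: bytes) -> bytes:
    nonce, body = blob[:16], blob[16:]
    return _keystream_xor(key, nonce, body)


class ToyKms:
    """Hypothetical in-memory key store that hands out master keys by id."""

    def __init__(self) -> None:
        self._master_keys = {}

    def get_master_key(self, key_id: str) -> bytes:
        return self._master_keys.setdefault(key_id, secrets.token_bytes(32))


def secured_serialize(kms: ToyKms, key_id: str, event: bytes) -> dict:
    master_key = kms.get_master_key(key_id)   # Get Master Key
    data_key = secrets.token_bytes(32)        # Get Data Key (assumed local)
    return {
        "key_id": key_id,
        "encrypted_event": toy_encrypt(data_key, event),          # Encrypt Event
        "encrypted_data_key": toy_encrypt(master_key, data_key),  # Encrypt Data Key
    }


def secured_deserialize(kms: ToyKms, record: dict) -> bytes:
    master_key = kms.get_master_key(record["key_id"])                 # Get Master Key
    data_key = toy_decrypt(master_key, record["encrypted_data_key"])  # Decrypt Data Key
    return toy_decrypt(data_key, record["encrypted_event"])           # Decrypt Event
```

The broker only ever sees the encrypted event and the wrapped data key; only a party that can obtain the master key from the KMS can unwrap the data key and read the event.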
36. Data Protection with Confluent Service Mesh and Encryption Accelerator
The CSM producer sidecar is responsible for data protection independently of the client type.
The CSM consumer sidecar is responsible for safely exposing data in clear and can also handle field access control.
[Diagram: Producer and Consumer, each behind a CSM sidecar; data between the sidecars is protected; KMS/Tokenizer]
39. Data Protection with Access Control via CSM
Original message
{
"name": "Joe Example",
"address": "123 Main St",
"ssn_id": "123-45-6789",
"account": "678900000234",
"Order_time": 1560070133853,
"current_balance": 67,
"itemid": "Item_9"
}
Protected
{
"name": "Hyt Piqdfggr",
"address": "852 Jdrf Wd",
"ssn_id": "dKI4gflV6r339Q==",
"account": "PrM1vyf/CxwoqQ==",
"Order_time": 1560070133853,
"current_balance": 67,
"itemid": "Item_9"
}
Original message
{
"name": "Joe Example",
"address": "123 Main St",
"ssn_id": "123-45-6789",
"account": "678900000234",
"Order_time": 1560070133853,
"current_balance": 67,
"itemid": "Item_9"
}
Original message with Access Control
{
"name": "Joe Example",
"address": "123 Main St",
"ssn_id": "dKI4gflV6r339Q==",
"account": "PrM1vyf/CxwoqQ==",
"Order_time": 1560070133853,
"current_balance": 67,
"itemid": "Item_9"
}
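The difference between the fully revealed message and the message with access control can be illustrated with a small sketch. It assumes a vault-based tokenizer rather than real encryption, and `TokenVault`, `protect` and `reveal` are hypothetical names that do not come from CSM; the point is only that the consumer side restores exactly the fields it is allowed to see and leaves the rest protected.

```python
import base64
import secrets


class TokenVault:
    """Hypothetical in-memory tokenizer: swaps values for random tokens."""

    def __init__(self) -> None:
        self._by_token = {}

    def tokenize(self, value: str) -> str:
        # 10 random bytes encode to a 16-character base64 token.
        token = base64.b64encode(secrets.token_bytes(10)).decode("ascii")
        self._by_token[token] = value
        return token

    def detokenize(self, token: str) -> str:
        return self._by_token[token]


def protect(message: dict, sensitive_fields: set, vault: TokenVault) -> dict:
    """Producer side: replace every sensitive field with a token."""
    return {
        field: vault.tokenize(value) if field in sensitive_fields else value
        for field, value in message.items()
    }


def reveal(protected: dict, sensitive_fields: set,
           allowed_fields: set, vault: TokenVault) -> dict:
    """Consumer side: restore only the sensitive fields this consumer may see."""
    return {
        field: vault.detokenize(value)
        if field in sensitive_fields and field in allowed_fields
        else value
        for field, value in protected.items()
    }
```

With `sensitive_fields = {"name", "address", "ssn_id", "account"}` and `allowed_fields = {"name", "address"}`, the result has the same shape as the last message on the slide: name and address in clear, `ssn_id` and `account` still protected.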
40. OPA - Open Policy Agent
https://www.openpolicyagent.org/
OPA testing and examples: The Rego Playground
41. Policy Based Field Level Access Control
Which fields should be hidden or redacted?
[Diagram: Producer and Consumer each connect through Confluent Service Mesh with pluggable code; Open Policy Agent sits between the two and answers the question above]
42. Policy Based Field Level Access Control
Original message
{
"name": "Joe Example",
"address": "123 Main St",
"ssn_id": "123-45-6789",
"account": "678900000234",
"Order_time": 1560070133853,
"current_balance": 67,
"itemid": "Item_9",
"country": "usa"
}
USA, financial
{
"account": "678900000234",
"Order_time": 1560070133853,
"itemid": "Item_9"
}
USA, financial, pii
{
"name": "Joe Example",
"address": "123 Main St",
"ssn_id": "123-45-6789",
"account": "678900000234",
"Order_time": 1560070133853,
"current_balance": 67,
"itemid": "Item_9"
}
Brazil, financial, pii
nothing sent
[Diagram: Confluent Service Mesh with pluggable code consulting Open Policy Agent]
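The kind of decision the policy engine makes here can be sketched in plain Python. This is not Rego and not the CSM plugin API: `FIELD_TAGS`, `apply_policy` and the two rules are assumptions chosen so that the output resembles the slide (a country match is required, and a field is visible only when every one of its classification tags has been granted).

```python
from typing import Optional

# Hypothetical field classifications; untagged fields are visible to everyone.
FIELD_TAGS = {
    "name": {"pii"},
    "address": {"pii"},
    "ssn_id": {"pii"},
    "current_balance": {"financial", "pii"},
    "account": {"financial"},
}


def apply_policy(message: dict, consumer_country: str,
                 grants: set) -> Optional[dict]:
    """Return the consumer's view of the message, or None if nothing is sent."""
    # Assumed rule 1: consumers outside the message's country receive nothing.
    if message.get("country") != consumer_country:
        return None
    # Assumed rule 2: keep a field only if all of its tags are granted.
    visible = {}
    for field, value in message.items():
        if field == "country":
            continue  # treated as a routing attribute and dropped from the view
        if FIELD_TAGS.get(field, set()) <= grants:
            visible[field] = value
    return visible
```

A consumer with `grants={"financial"}` in `"usa"` would see only `account`, `Order_time` and `itemid`; adding `"pii"` reveals the remaining fields, and a consumer in `"brazil"` gets `None`. In a real deployment the same rules would live in a Rego policy evaluated by OPA.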
44. OPA Configuration and Integration
Link OPA Policies in Classifications
Add OPA Policies (rego)
Local OPA module (Session Authorizer)
local path to rego file
rego path (decision, package)
46. Mutual TLS (mTLS) or Kerberos
[Diagram: Producer and Consumer each authenticate with mTLS / Kerberos; the setup is labelled ON PREM ONLY and marked 🤬 FAIL]
47. With CSM in the Mix
[Diagram: Client connects to CSM (with pluggable code) using mTLS; CSM connects onward using SASL (key/secret)]
principal
User1 => key/secret
User2 => key/secret
Lookup Auth from Principal during SSL Handshake
48. Example CSM MTLS Flow
- Client starts the SSL Handshake with CSM
- CSM: Extract Principal from Cert
- CSM: Lookup key/secret from DB (some database) with Principal as key
- Database: Return key/secret
- CSM: Authenticate SASL with key/secret against Confluent Cloud
- Finish Handshake
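The lookup step in this flow can be sketched as follows. Everything here is illustrative: `CREDENTIAL_DB` stands in for "some database", `extract_principal` uses a naive split of the certificate subject (it ignores escaped commas), and the returned dictionary uses librdkafka-style client property names. How CSM's pluggable code performs the lookup is not shown on the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SaslCredentials:
    key: str
    secret: str


# Hypothetical stand-in for the database that maps principals to API keys.
CREDENTIAL_DB = {
    "User1": SaslCredentials(key="key-1", secret="secret-1"),
    "User2": SaslCredentials(key="key-2", secret="secret-2"),
}


def extract_principal(cert_subject: str) -> str:
    """Return the CN of a subject such as 'CN=User1,OU=Apps,O=Example'."""
    for part in cert_subject.split(","):
        name, _, value = part.strip().partition("=")
        if name.upper() == "CN" and value:
            return value
    raise ValueError(f"no CN in certificate subject: {cert_subject!r}")


def sasl_config_for(cert_subject: str) -> dict:
    """Map an mTLS client certificate to SASL/PLAIN settings for the upstream."""
    principal = extract_principal(cert_subject)
    credentials = CREDENTIAL_DB.get(principal)
    if credentials is None:
        # Unknown principal: the handshake should fail rather than continue.
        raise PermissionError(f"no credentials for principal {principal!r}")
    return {
        "security.protocol": "SASL_SSL",
        "sasl.mechanism": "PLAIN",
        "sasl.username": credentials.key,
        "sasl.password": credentials.secret,
    }
```

The client keeps using its existing certificate, while the key/secret pair never leaves the proxy side.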
51. Typical Hybrid CSM-Setup
- hybrid setup
- self-managed connect
- local CSM and clients
- ksqlDB and CP in Confluent Cloud
- ksqlDB on field-level-encrypted topics
- AWS KMS for keys (AWS, Azure, Vault, …)
52. CSM in a sidecar
- external service writing to plain-text topic
- kstreams app filtering data and writing to encrypted topic
- local client connecting to CCloud via CSM/directly
53. CSM as (Gateway) Service on VMs
- CSM deployed on containers/VMs
- HA achieved with multiple CSM-replicas and LB
- reminder: CSM is stateless (!)
- Scaling horizontally/vertically
- load-balancers for external CSM-access
58. CSM as a Gateway to Confluent Cloud
Transparent end-to-end encryption
Field-level authorization and access-control with policy-based field-level encryption
Use existing authentication mechanisms in cloud migrations
61. CSM Ingress on k8s / SNI: Formatter for Listener Overrides
62. Use case: Kubernetes Ingress
Ingress Scenario:
● CSM maps each broker to one port that is exposed as a k8s service
● Ingress does not allow opening ports dynamically (or more than a few specific ports at all: 80, 8080, 443)
64. Solution: SNI Routing
SNI: Server Name Indication - Wikipedia
(https://github.com/Schm1tz1/sni-routing-examples)
● Hosting of multiple (virtual) services with the same (physical) frontend and different backends
● Used in Ingress for (de)multiplexing TCP traffic
● Routing to backend services using information from the TLS handshake (hello)
● A similar pattern based on HTTP headers is very common for web servers
65. Formatter for Listener Overrides and SNI
Changes to "CSM standard setup":
● CSM configured to return virtual hostnames that can be mapped back to internal ports (example: host.name.formatter=b$p.$h:9092)
● Matching Certificates (wildcard)
● Ingress with SNI rules / mapping for these hostnames
● External DNS entries (wildcard) pointing to ingress IPs
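To make the mapping concrete, here is a small sketch of how a formatter such as `b$p.$h:9092` could be expanded and then reversed from the SNI name. It assumes that `$p` stands for the internal per-broker port and `$h` for the base hostname; the slide does not define the placeholders, and the function names and `csm.example.com` are hypothetical.

```python
import re
from string import Template
from typing import Tuple


def format_advertised_listener(formatter: str, port: int, host: str) -> str:
    """Expand $p (assumed: internal broker port) and $h (assumed: base host)."""
    return Template(formatter).substitute(p=port, h=host)


_SNI_PATTERN = re.compile(r"^b(\d+)\.(.+)$")


def port_from_sni(server_name: str) -> Tuple[int, str]:
    """Map a virtual hostname from the TLS hello back to (internal port, base host)."""
    match = _SNI_PATTERN.match(server_name)
    if match is None:
        raise ValueError(f"unexpected SNI hostname: {server_name!r}")
    return int(match.group(1)), match.group(2)


advertised = format_advertised_listener("b$p.$h:9092", 9093, "csm.example.com")
# advertised is "b9093.csm.example.com:9092"
```

Every broker is then reachable on the same external port (9092), a wildcard certificate and wildcard DNS entry for `*.csm.example.com` cover all virtual hostnames, and the ingress uses the SNI name to pick the backend port.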
67. Features Comparison
Feature | Client-side Encryption | CSM-based Encryption
Field-level encryption | ✅ (Java, .NET only) | ✅
Payload-level encryption | ✅ | ✅
Tokenization/Masking | ✅ (Java, .NET only) | ✅
Format-Preserving Encryption | ✅ (Java, .NET only) | ✅
Supports Kafka Streams | ✅ | ✅
Supports Kafka Connect | JSON, AVRO only | ✅
Supports ksqlDB | ✅ | ✅
Supports REST Proxy | ❌ | ✅
Popular KMS integrations | ✅ (Java, .NET only) | ✅
Supports access control | ✅ | ✅
Node.js, python, C++ support | limited features | ✅
Other (Go, Ruby) lang support | ❌ | ✅
Component-based install | ✅ | Not required
68. E2EE Libraries
Features and integrations
✅ Feature included
❌ Feature prioritized but not complete
❌ Feature not included or prioritized
na Not Applicable