Video recording of this presentation:
https://youtu.be/upWzamacOVQ
Blog post with more details:
https://www.kai-waehner.de/blog/2020/04/24/mainframe-offloading-replacement-apache-kafka-connect-ibm-db2-mq-cdc-cobol/
Mainframes are still hard at work, processing over 70 percent of the world’s most essential computing transactions every day. Very high cost, monolithic architectures, and missing experts are the key challenges for mainframe applications. Time to get more innovative, even with the mainframe!
Mainframe offloading with Apache Kafka and its ecosystem can be used to keep a more modern data store in real-time sync with the mainframe. At the same time, it persists the event data on the bus to enable microservices and delivers the data to other systems such as data warehouses and search indexes.
But the final goal and ultimate vision is to replace the mainframe with new applications using modern and less costly technologies. Stand up to the dinosaur, but keep in mind that legacy migration is a journey! Kai will guide you to the next step of your company’s evolution!
You will learn:
- how to not only reduce operational expenses but provide a path for architecture modernization, agility and eventually mainframe replacement
- what steps some of Confluent’s customers already took, leveraging technologies like Change Data Capture (CDC) or MQ for mainframe offloading
- how an event streaming platform enables cost reduction, architecture modernization, and a combination of a mainframe with new technologies
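The offloading pattern described above (change events captured from the mainframe, retained on the bus, and replayed into a modern data store) can be sketched in a few lines. This is a minimal illustration in plain Python, not the talk's implementation: the `ChangeEvent` shape, the account keys, and the in-memory list standing in for a Kafka topic are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class ChangeEvent:
    """One hypothetical CDC record captured from a mainframe table."""
    op: str                     # "insert", "update", or "delete"
    key: str                    # primary key of the changed row
    row: Optional[dict] = None  # new row image; None for deletes


def apply_change(replica: Dict[str, dict], event: ChangeEvent) -> None:
    """Apply a single change event to the modern data store (a dict here)."""
    if event.op in ("insert", "update"):
        replica[event.key] = dict(event.row or {})
    elif event.op == "delete":
        replica.pop(event.key, None)
    else:
        raise ValueError(f"unknown operation: {event.op}")


def offload(log: List[ChangeEvent]) -> Dict[str, dict]:
    """Replay the retained change events in order to build a replica."""
    replica: Dict[str, dict] = {}
    for event in log:
        apply_change(replica, event)
    return replica


# Assumption: this list stands in for a topic that retains the change events.
event_log = [
    ChangeEvent("insert", "ACC-1", {"balance": 100}),
    ChangeEvent("insert", "ACC-2", {"balance": 50}),
    ChangeEvent("update", "ACC-1", {"balance": 75}),
    ChangeEvent("delete", "ACC-2"),
]

replica = offload(event_log)
```

Because the events stay in the log, a second consumer (a data warehouse loader or a search indexer, for instance) could call `offload(event_log)` independently and arrive at the same state without touching the mainframe again.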
Apache Kafka in the Airline, Aviation and Travel Industry - Kai Wähner
Aviation and travel are notoriously vulnerable to social, economic, and political events, as well as the ever-changing expectations of consumers. Coronavirus is just a piece of the challenge.
This presentation explores use cases, architectures, and references for Apache Kafka as event streaming technology in the aviation industry, including airlines, airports, global distribution systems (GDS), aircraft manufacturers, and more.
Examples include Lufthansa, Singapore Airlines, Air France Hop, Amadeus, and more. Technologies include Kafka, Kafka Connect, Kafka Streams, ksqlDB, Machine Learning, Cloud, and more.
Serverless Kafka on AWS as Part of a Cloud-native Data Lake Architecture - Kai Wähner
AWS Data Lake / Lake House + Confluent Cloud for Serverless Apache Kafka. Learn about use cases, architectures, and features.
Data must be continuously collected, processed, and reactively used in applications across the entire enterprise - some in real time, some in batch mode. In other words: As an enterprise becomes increasingly software-defined, it needs a data platform designed primarily for "data in motion" rather than "data at rest."
Apache Kafka is now mainstream when it comes to data in motion! The Kafka API has become the de facto standard for event-driven architectures and event streaming. Unfortunately, running it yourself is very often too expensive once you factor in scaling, administration, support, security, creating connectors, and everything else that goes with it. Resources in enterprises are scarce: this applies to both the best team members and the budget.
The cloud - as we all know - offers the perfect solution to such challenges.
Most likely, fully-managed cloud services such as AWS S3, DynamoDB or Redshift are already in use. Now it is time to implement "fully-managed" for Kafka as well - with Confluent Cloud on AWS.
- Building a central integration layer that doesn't care where or how much data is coming from
- Implementing scalable data stream processing to gain real-time insights
- Leveraging fully managed connectors (like S3, Redshift, Kinesis, MongoDB Atlas & more) to quickly access data
Confluent Cloud in action? Let's show how ao.com made it happen!
Kafka and Machine Learning in Banking and Insurance Industry - Kai Wähner
Streaming Machine Learning and Apache Kafka for real-time analytics: The Next Generation of Intelligent Software for Financial Services and Insurance Industries.
The slides cover use cases, architectures, and examples from various companies. Learn about Kafka + Machine Learning / Deep Learning for fraud detection and other use cases.
Real-Life Use Cases & Architectures for Event Streaming with Apache Kafka - Kai Wähner
Streaming all over the World: Real-Life Use Cases & Architectures for Event Streaming with Apache Kafka.
Learn about various case studies for event streaming with Apache Kafka across industries. The talk explores architectures for real-world deployments from Audi, BMW, Disney, Generali, PayPal, Tesla, Unity, Walmart, William Hill, and more. Use cases include fraud detection, mainframe offloading, predictive maintenance, cybersecurity, edge computing, track & trace, live betting, and much more.
The rise of data in motion in the insurance industry is visible across all lines of business including life, healthcare, travel, vehicle, and others. Apache Kafka changes how enterprises rethink data. This blog post explores use cases and architectures for event streaming. Real-world examples from Generali, Centene, Humana, and Tesla show innovative insurance-related data integration and stream processing in real-time.
The Rise of Data in Motion in the Healthcare Industry - Use Cases, Architectures and Examples powered by Apache Kafka.
Use Cases for Data in Motion in the Healthcare Industry:
- Know Your Patient (= “Customer 360”)
- Operations (Healthcare 4.0 including Drug R&D, Patient Care, etc.)
- IT Perspective (Cybersecurity, Mainframe Offload, Hybrid Cloud, Streaming ETL, etc.)
Real-world examples include Covid-19 Electronic Lab Reporting, Cerner, Optum, Centene, Humana, Invitae, Bayer, Celmatix, Care.com.
Apache Kafka® Use Cases for Financial Services - confluent
Traditional systems were designed in an era that predates large-scale distributed systems. These systems often lack the ability to scale to meet the needs of the modern data-driven organisation. Adding to this is the accumulation of technologies and the explosion of data which can result in complex point-to-point integrations where data becomes siloed or separated across the enterprise.
The demand for fast results and decision making has generated the need for real-time event streaming and data processing in financial institutions that want to keep their competitive edge. Apache Kafka and the Confluent Platform are designed to solve the problems associated with traditional systems and provide a modern, distributed architecture and real-time data streaming capability. In addition, these technologies open up a range of use cases for financial services organisations, many of which will be explored in this talk.
Apache Kafka in the Automotive Industry (Connected Vehicles, Manufacturing 4.... - Kai Wähner
Connect all the things: An intro to event streaming for the automotive industry including connected cars, mobility services, and manufacturing / industrial IoT.
Video recording of this talk: https://www.youtube.com/watch?v=rBfBFrcO-WU
The Fourth Industrial Revolution (also known as Industry 4.0) is the ongoing automation of traditional manufacturing and industrial practices, using modern smart technology. Event streaming with Apache Kafka plays a major role in processing massive volumes of data in real-time in a reliable, scalable, and flexible way, integrating with various legacy and modern data sources and sinks.
Other industries—retail, healthcare, government, financial services, energy, and more—also lean into Industry 4.0 technology to take advantage of IoT devices, sensors, smart machines, robotics, and connected data. The variety of these deployments goes from disconnected edge use cases across hybrid architectures to global multi-cloud deployments.
In this presentation, I want to give you an overview of existing use cases for event streaming technology in a connected world across supply chains, industries and customer experiences that come along with these interdisciplinary data intersections:
- The Automotive Industry (and it’s not only Connected Cars)
- Mobility Services across verticals (transportation, logistics, travel industry, retailing, …)
- Smart Cities (including citizen health services, communication infrastructure, …)
Real-world examples include use cases from car makers such as Audi, BMW, Porsche, Tesla, plus many examples from mobility services such as Uber, Lyft, Here Technologies, and more.
Kappa vs Lambda Architectures and Technology Comparison - Kai Wähner
Real-time data beats slow data. That’s true for almost every use case. Nevertheless, enterprise architects build new infrastructures with the Lambda architecture that includes separate batch and real-time layers.
This video explores why a single real-time pipeline, called Kappa architecture, is the better fit for many enterprise architectures. Real-world examples from companies such as Disney, Shopify, Uber, and Twitter explore the benefits of Kappa but also show how batch processing fits into this discussion positively without the need for a Lambda architecture.
The main focus of the discussion is on Apache Kafka (and its ecosystem) as the de facto standard for event streaming to process data in motion (the key concept of Kappa), but the video also compares various technologies and vendors such as Confluent, Cloudera, IBM Red Hat, Apache Flink, Apache Pulsar, Amazon Kinesis, Amazon MSK, Azure Event Hubs, Google Pub/Sub, and more.
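The core idea of Kappa, one processing path that serves both live data and reprocessing, can be illustrated with a small sketch. It is plain Python under stated assumptions: the `Log` class is a stand-in for a retained, replayable topic, and the `(user, amount)` event shape and the `Totals` processor are hypothetical names invented for this example.

```python
from typing import Dict, List, Tuple

Event = Tuple[str, int]  # (user, amount): a hypothetical event shape


class Log:
    """Append-only log standing in for a retained, replayable topic."""

    def __init__(self) -> None:
        self._events: List[Event] = []

    def append(self, event: Event) -> int:
        self._events.append(event)
        return len(self._events) - 1  # offset of the appended event

    def read_from(self, offset: int) -> List[Event]:
        return self._events[offset:]


class Totals:
    """The single piece of processing logic, used for live and replayed data."""

    def __init__(self) -> None:
        self.totals: Dict[str, int] = {}

    def process(self, event: Event) -> None:
        user, amount = event
        self.totals[user] = self.totals.get(user, 0) + amount


log = Log()
live = Totals()

# Live path: each event is processed as soon as it is appended.
for event in [("alice", 10), ("bob", 5), ("alice", 7), ("bob", 3)]:
    log.append(event)
    live.process(event)

# "Batch" path: no second code base, just a replay from offset 0.
reprocessed = Totals()
for event in log.read_from(0):
    reprocessed.process(event)
```

In a Lambda architecture the replay step would typically be a separate batch job with its own code; here the same `Totals.process` handles both, which is the property the Kappa argument relies on.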
Video recording of this presentation:
https://youtu.be/j7D29eyysDw
Further reading:
https://www.kai-waehner.de/blog/2021/09/23/real-time-kappa-architecture-mainstream-replacing-batch-lambda/
https://www.kai-waehner.de/blog/2021/04/20/comparison-open-source-apache-kafka-vs-confluent-cloudera-red-hat-amazon-msk-cloud/
https://www.kai-waehner.de/blog/2021/05/09/kafka-api-de-facto-standard-event-streaming-like-amazon-s3-object-storage/
You will also learn how to:
• Build products and features faster using a complete suite of connectors and stream management tools, and connect your environments to data pipelines
• Protect your most critical data and workloads with built-in security, governance, and resilience guarantees
• Deploy Kafka at scale in minutes while reducing the associated costs and operational burden
The Top 5 Apache Kafka Use Cases and Architectures in 2022 - Kai Wähner
I see the following topics coming up more regularly in conversations with customers, prospects, and the broader Kafka community across the globe:
Kappa Architecture: Kappa goes mainstream to replace Lambda and Batch pipelines (that does not mean that there is no batch processing anymore). Examples: Kafka-powered Kappa architectures from Uber, Disney, Shopify, and Twitter.
Hyper-personalized Omnichannel: Retail and customer communication across online and offline channels becomes the new black, including context-specific upselling, recommendations, and location-based services. Examples: Omnichannel Retail and Customer 360 in Real-Time with Apache Kafka.
Multi-Cloud Deployments: Business units and IT infrastructures span across regions, continents, and cloud providers. Linking clusters for bi-directional replication of data in real-time becomes crucial for many business models. Examples: Global Kafka deployments.
Edge Analytics: Low latency requirements, cost efficiency, or security requirements enforce the deployment of (some) event streaming use cases at the far edge (i.e., outside a data center), for instance, for predictive maintenance and quality assurance on the shop floor level in smart factories. Examples: Edge analytics with Kafka.
Real-time Cybersecurity: Situational awareness and threat intelligence need to process massive data in real-time to defend against cyberattacks successfully. The many successful ransomware attacks across the globe in 2021 were a warning for most CIOs. Examples: Cybersecurity for situational awareness and threat intelligence in real-time.
Basics of Kafka and IBM Cloud Event Streams. Covers the major Kafka topics: brokers, clusters, topics, partitions, producers, consumers, streams, and connectors. Also covers what Event Streams offers beyond plain Kafka and some differences between Kafka and IBM MQ.
IBM Cloud Pak for Integration with Confluent Platform powered by Apache Kafka - Kai Wähner
The Rise of Data in Motion powered by Event Streaming - Use Cases and Architecture for IBM Cloud Pak with Confluent Platform. Including screenshots of the live demo (integration between IBM and Kafka via Confluent Platform and Kafka Connect connectors).
Learn about the integration capabilities of IBM Cloud Pak for Integration, now with the industry’s leading event streaming platform: Confluent Platform, powered by Apache Kafka.
Moving from an on-premises environment into AWS is just the start of the journey towards cost optimisation. In this session we’ll look at a range of ways in which our customers can understand their costs and increase their return-on-investment: building the business case; selecting the right models for the right workloads; benefiting from tiered pricing aggregation; using data to drive the choice of AWS services; implementation of intelligent auto-scaling; and, where appropriate, re-platforming to make use of new architectural patterns such as Serverless.
Apache Kafka in the Public Sector (Government, National Security, Citizen Ser... - Kai Wähner
The Rise of Data in Motion in the Public Sector powered by event streaming with Apache Kafka.
Citizen Services:
- Health services, e.g. hospital modernization, track & trace, Covid distance control
- Public administration: reduce bureaucracy, data democratization across government departments
- eGovernment: efficient and digital citizen engagement, e.g. personal ID application process
Smart City:
- Smart driving, parking, buildings, environment
- Waste management
- Open exchange, e.g. mobility services (1st and 3rd party)
Energy:
- Smart grid and utilities infrastructure (energy distribution, smart home, smart meters, smart water, etc.)
National Security:
- Law enforcement, surveillance, police/interior security data exchange
- Defense and military (border control, intelligent soldier)
- Cybersecurity for situational awareness and threat intelligence
Apache Kafka vs. Cloud-native iPaaS Integration Platform Middleware - Kai Wähner
Enterprise integration is more challenging than ever before. The IT evolution requires the integration of more and more technologies. Applications are deployed across the edge, hybrid, and multi-cloud architectures. Traditional middleware such as MQ, ETL, ESB does not scale well enough or only processes data in batch instead of real-time.
This presentation explores why Apache Kafka is the new black for integration projects, how Kafka fits into the discussion around cloud-native iPaaS (Integration Platform as a Service) solutions, and why event streaming is a new software category.
A concrete real-world example shows the difference between event streaming and traditional integration platforms as well as cloud-native iPaaS.
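One difference between a classic message queue and an event log can be made concrete with a toy model. The sketch below is plain Python and deliberately simplified: `PointToPointQueue` and `EventLog` are hypothetical classes written for this illustration, not the API of any MQ product or of Kafka. It assumes point-to-point queue semantics (a received message is removed) on one side and a retained log with one offset per consumer group on the other.

```python
from collections import deque
from typing import Deque, Dict, List, Optional


class PointToPointQueue:
    """Simplified queue: once a message is received, it is gone."""

    def __init__(self) -> None:
        self._messages: Deque[str] = deque()

    def send(self, message: str) -> None:
        self._messages.append(message)

    def receive(self) -> Optional[str]:
        return self._messages.popleft() if self._messages else None


class EventLog:
    """Simplified log: messages are retained; each group tracks its offset."""

    def __init__(self) -> None:
        self._messages: List[str] = []
        self._offsets: Dict[str, int] = {}

    def append(self, message: str) -> None:
        self._messages.append(message)

    def poll(self, group: str) -> List[str]:
        start = self._offsets.get(group, 0)
        batch = self._messages[start:]
        self._offsets[group] = len(self._messages)  # commit the new position
        return batch

    def seek_to_beginning(self, group: str) -> None:
        self._offsets[group] = 0  # allows the group to replay history


queue = PointToPointQueue()
log = EventLog()
for order in ["order-1", "order-2"]:
    queue.send(order)
    log.append(order)

# Queue: two services compete for messages, so each sees only a part.
billing_from_queue = queue.receive()
shipping_from_queue = queue.receive()

# Log: each consumer group reads the full stream independently.
billing_from_log = log.poll("billing")
shipping_from_log = log.poll("shipping")

# A group can also rewind and read the retained events again.
log.seek_to_beginning("billing")
billing_replay = log.poll("billing")
```

The retained log is what lets new consumers be added later without changing the producer, which is one reason the decoupling argument favors event streaming for integration.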
Video Recording of this presentation:
https://www.youtube.com/watch?v=I8yZwKg_IJc&t=2842s
Blog post about this topic:
https://www.kai-waehner.de/blog/2021/11/03/apache-kafka-cloud-native-ipaas-versus-mq-etl-esb-middleware/
Amazon Web Services (AWS) provides many options for running hyperscale SAP solutions in the cloud. Additionally, the majority of SAP on-premise products can be run on AWS, including all major RDBMS platforms and Windows/Linux OS platforms. Join this webinar to discover more about these capabilities and services, and how you can use them to deploy your SAP estate on AWS. You can also learn the benefits of running your SAP workloads on AWS and how our customers have leveraged this to achieve real-world business benefits.
Learning Objectives:
- Overview of SAP solutions supported on AWS
- Overview of core AWS services for SAP workloads
- Benefits of deploying SAP on AWS
- Examples of how customers are running SAP workloads on AWS
The Fourth Industrial Revolution (also known as Industry 4.0) is the ongoing automation of traditional manufacturing and industrial practices, using modern smart technology.
Event Streaming with Apache Kafka plays a massive role in processing massive volumes of data in real-time in a reliable, scalable, and flexible way integrating with various legacy and modern data sources and sinks.
In this presentation, I want to give you an overview of existing use cases for event streaming technology in a connected world across supply chains, industries and customer experiences that come along with these interdisciplinary data intersections:
• The Automotive Industry (and it’s not only Connected Cars)
• Mobility Services across verticals (transportation, logistics, travel industry, retailing, …)
• Smart Cities (including citizen health services, communication infrastructure, …)
None of these industries and sectors has fundamentally new characteristics and requirements. They require data integration, data correlation, or real decoupling, just to name a few, but they are now facing massively increased volumes of data.
Real-time messaging solutions have existed for many years. Hundreds of platforms exist for data integration (including ETL and ESB tooling or specific IIoT platforms). Proprietary monoliths have monitored plants, telco networks, and other infrastructures in real-time for decades. But now, Kafka combines all the above characteristics in an open, scalable, and flexible infrastructure to operate mission-critical workloads at scale in real-time, and it is taking over the world of connecting data.
IoT Architectures for Apache Kafka and Event Streaming - Industry 4.0, Digita... - Kai Wähner
The Internet of Things (IoT) is getting more and more traction as valuable use cases come to light. Whether you are in Healthcare, Telecommunications, Manufacturing, Banking or Retail to name a few industries, there is one key challenge and that's the integration of backend IoT data logs and applications, business services and cloud services to process the data in real time and at scale.
In this talk, we will share how Kafka has become the leading technology used throughout the business to provide real-time event streaming. Explore real-life use cases of Kafka Connect, Kafka Streams, and KSQL, independent of where they are deployed, be it in a private or public cloud, on premises, or at the edge:
- Audi - Connected car infrastructure
- Robert Bosch Power Tools - Track and trace of devices and people at construction areas
- Deutsche Bahn - Customer 360 for train timetable updates
- E.ON - IoT streaming platform to integrate and build smart home, smart building, and smart grid infrastructures
Kafka Streams vs. KSQL for Stream Processing on top of Apache Kafka - Kai Wähner
Spoilt for Choice – Kafka Streams vs. KSQL for Stream Processing on top of Apache Kafka:
Apache Kafka is a de facto standard for streaming data processing and is widely deployed as an event streaming platform. Part of Kafka is its stream processing API, “Kafka Streams”. In addition, the Kafka ecosystem now offers KSQL, a declarative, SQL-like stream processing language that lets you define powerful stream-processing applications easily. What once took some moderately sophisticated Java code can now be done at the command line with a familiar and eminently approachable syntax.
This session discusses and demos the pros and cons of Kafka Streams and KSQL to understand when to use which stream processing alternative for continuous stream processing natively on Apache Kafka infrastructures. The end of the session compares the trade-offs of Kafka Streams and KSQL to separate stream processing frameworks such as Apache Flink or Spark Streaming.
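To make "continuous stream processing" more tangible, here is the kind of aggregation both tools are commonly used for: counting events per key in fixed, non-overlapping (tumbling) time windows. The sketch is plain Python and uses neither the Kafka Streams API nor KSQL syntax; the function name, the `(timestamp_ms, key)` event shape, and the sample data are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

Event = Tuple[int, str]  # (timestamp in milliseconds, key): hypothetical shape


def tumbling_window_counts(
    events: Iterable[Event], window_ms: int
) -> Dict[Tuple[int, str], int]:
    """Count events per (window start, key) for fixed-size windows."""
    counts: Dict[Tuple[int, str], int] = defaultdict(int)
    for timestamp_ms, key in events:
        # Align the timestamp down to the start of its window.
        window_start = (timestamp_ms // window_ms) * window_ms
        counts[(window_start, key)] += 1
    return dict(counts)


events = [
    (1_000, "alice"),
    (20_000, "alice"),
    (59_999, "bob"),
    (60_000, "alice"),  # first event of the second one-minute window
    (61_500, "bob"),
]

counts = tumbling_window_counts(events, window_ms=60_000)
```

A stream processing engine additionally handles what this sketch ignores, such as partitioned state, late-arriving events, and fault tolerance, which is where the trade-offs between the approaches discussed in the session come in.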
Organizations need to perform increasingly complex analysis on data — streaming analytics, ad-hoc querying, and predictive analytics — in order to get better customer insights and actionable business intelligence. Apache Spark has recently emerged as the framework of choice to address many of these challenges. In this session, we show you how to use Apache Spark on AWS to implement and scale common big data use cases such as real-time data processing, interactive data science, predictive analytics, and more. We will talk about common architectures, best practices to quickly create Spark clusters using Amazon EMR, and ways to integrate Spark with other big data services in AWS.
Learning Objectives:
• Learn why Spark is great for ad-hoc interactive analysis and real-time stream processing.
• How to deploy and tune scalable clusters running Spark on Amazon EMR.
• How to use EMR File System (EMRFS) with Spark to query data directly in Amazon S3.
• Common architectures to leverage Spark with Amazon DynamoDB, Amazon Redshift, Amazon Kinesis, and more.
OpenText Archive Center 16.2 Single File Vendor Interface (VI) using a Microsoft Azure Storage Account as a storage device is now supported on Linux. Check out this brief overview of its usage on one of our current projects. Thanks to Manish Shah (Microsoft) for his contribution and for working with OpenText to achieve support on Linux, to Supriya Pande for her article on the Microsoft Azure Storage Explorer, to Oleh Khrypko (SAP) for his input on handling disaster recovery on OpenText Archive Center, and to Gary Jackson (Aliter Consulting) for the article.
Data Streaming with Apache Kafka in the Defence and Cybersecurity Industry - Kai Wähner
Agenda:
1) Defence, Modern Warfare, and Cybersecurity in 202X
2) Data in Motion with Apache Kafka as Defence Backbone
3) Situational Awareness
4) Threat Intelligence
5) Forensics and AI / Machine Learning
6) Air-Gapped and Zero Trust Environments
7) SIEM / SOAR Modernization
Technologies discussed in the presentation include Apache Kafka, Kafka Streams, ksqlDB, Kafka Connect, Elasticsearch, Splunk, IBM QRadar, Zeek, NetFlow, PCAP, TensorFlow, AWS, Azure, GCP, Sigma, and Confluent Cloud.
Mainframe Integration, Offloading and Replacement with Apache Kafka | Kai Wae... - HostedbyConfluent
Legacy migration is a journey. Mainframes cannot be replaced in a single project. A big bang will fail. This has to be planned long-term.
Mainframe offloading and replacement with Apache Kafka and its ecosystem can be used to keep a more modern data store in real-time sync with the mainframe, while at the same time persisting the event data on the bus to enable microservices, and deliver the data to other systems such as data warehouses and search indexes.
This session walks through the different steps some companies have already gone through. Technical options like Change Data Capture (CDC), MQ, and third-party tools for mainframe integration, offloading, and replacement are explored.
Apache Kafka in Financial Services - Use Cases and Architectures - Kai Wähner
The Rise of Event Streaming in Financial Services - Use Cases, Architectures and Examples powered by Apache Kafka.
The New FinServ Enterprise Reality: Every company is a software company. Innovate OR be Disrupted. Learn how Event Streaming with Apache Kafka and its ecosystem help...
More details:
https://www.kai-waehner.de/apache-kafka-financial-services-industry-banking-finserv-payment-fraud-middleware-messaging-transactions
https://www.kai-waehner.de/blog/2020/04/15/apache-kafka-machine-learning-banking-finance-industry/
https://www.kai-waehner.de/blog/2020/04/24/mainframe-offloading-replacement-apache-kafka-connect-ibm-db2-mq-cdc-cobol/
Kappa vs Lambda Architectures and Technology ComparisonKai Wähner
Real-time data beats slow data. That’s true for almost every use case. Nevertheless, enterprise architects build new infrastructures with the Lambda architecture that includes separate batch and real-time layers.
This video explores why a single real-time pipeline, called Kappa architecture, is the better fit for many enterprise architectures. Real-world examples from companies such as Disney, Shopify, Uber, and Twitter explore the benefits of Kappa but also show how batch processing fits into this discussion positively without the need for a Lambda architecture.
The main focus of the discussion is on Apache Kafka (and its ecosystem) as the de facto standard for event streaming to process data in motion (the key concept of Kappa), but the video also compares various technologies and vendors such as Confluent, Cloudera, IBM Red Hat, Apache Flink, Apache Pulsar, AWS Kinesis, Amazon MSK, Azure Event Hubs, Google Pub Sub, and more.
Video recording of this presentation:
https://youtu.be/j7D29eyysDw
Further reading:
https://www.kai-waehner.de/blog/2021/09/23/real-time-kappa-architecture-mainstream-replacing-batch-lambda/
https://www.kai-waehner.de/blog/2021/04/20/comparison-open-source-apache-kafka-vs-confluent-cloudera-red-hat-amazon-msk-cloud/
https://www.kai-waehner.de/blog/2021/05/09/kafka-api-de-facto-standard-event-streaming-like-amazon-s3-object-storage/
You will also learn how to:
• Build products and features faster with a complete suite of connectors and stream management tools, and connect your environments to data pipelines
• Protect your most critical data and workloads with built-in security, governance, and resilience guarantees
• Deploy Kafka at scale in minutes while reducing the associated costs and operational burden
The Top 5 Apache Kafka Use Cases and Architectures in 2022 - Kai Wähner
I see the following topics coming up more regularly in conversations with customers, prospects, and the broader Kafka community across the globe:
Kappa Architecture: Kappa goes mainstream to replace Lambda and Batch pipelines (that does not mean that there is no batch processing anymore). Examples: Kafka-powered Kappa architectures from Uber, Disney, Shopify, and Twitter.
Hyper-personalized Omnichannel: Retail and customer communication across online and offline channels becomes the new black, including context-specific upselling, recommendations, and location-based services. Examples: Omnichannel Retail and Customer 360 in Real-Time with Apache Kafka.
Multi-Cloud Deployments: Business units and IT infrastructures span across regions, continents, and cloud providers. Linking clusters for bi-directional replication of data in real-time becomes crucial for many business models. Examples: Global Kafka deployments.
Edge Analytics: Low latency requirements, cost efficiency, or security requirements enforce the deployment of (some) event streaming use cases at the far edge (i.e., outside a data center), for instance, for predictive maintenance and quality assurance on the shop floor level in smart factories. Examples: Edge analytics with Kafka.
Real-time Cybersecurity: Situational awareness and threat intelligence need to process massive data in real-time to defend against cyberattacks successfully. The many successful ransomware attacks across the globe in 2021 were a warning for most CIOs. Examples: Cybersecurity for situational awareness and threat intelligence in real-time.
Basics of Kafka and IBM Cloud Event Streams. Covers all the major Kafka topics, such as Brokers, Clusters, Topics, Partitions, Producers, Consumers, Streams, and Connectors, as well as what Event Streams offers beyond plain Kafka and some differences between Kafka and IBM MQ.
IBM Cloud Pak for Integration with Confluent Platform powered by Apache Kafka - Kai Wähner
The Rise of Data in Motion powered by Event Streaming - Use Cases and Architecture for IBM Cloud Pak with Confluent Platform. Including screenshots of the live demo (integration between IBM and Kafka via Confluent Platform and Kafka Connect connectors).
Learn about the integration capabilities of IBM Cloud Pak for Integration, now with Confluent Platform, the industry’s leading event streaming platform powered by Apache Kafka.
Moving from an on-premises environment into AWS is just the start of the journey towards cost optimisation. In this session we’ll look at a range of ways in which our customers can understand their costs and increase their return-on-investment: building the business case; selecting the right models for the right workloads; benefiting from tiered pricing aggregation; using data to drive the choice of AWS services; implementation of intelligent auto-scaling; and, where appropriate, re-platforming to make use of new architectural patterns such as Serverless.
Apache Kafka in the Public Sector (Government, National Security, Citizen Ser... - Kai Wähner
The Rise of Data in Motion in the Public Sector powered by event streaming with Apache Kafka.
Citizen Services:
- Health services, e.g. hospital modernization, track & trace, Covid distance control
- Public administration: reduce bureaucracy, data democratization across government departments
- eGovernment: efficient and digital citizen engagement, e.g. personal ID application process
Smart City:
- Smart driving, parking, buildings, environment
- Waste management
- Open exchange, e.g. mobility services (1st and 3rd party)
Energy:
- Smart grid and utilities infrastructure (energy distribution, smart home, smart meters, smart water, etc.)
National Security:
- Law enforcement, surveillance, police/interior security data exchange
- Defense and military (border control, intelligent soldier)
- Cybersecurity for situational awareness and threat intelligence
Apache Kafka vs. Cloud-native iPaaS Integration Platform Middleware - Kai Wähner
Enterprise integration is more challenging than ever before. The IT evolution requires the integration of more and more technologies. Applications are deployed across the edge, hybrid, and multi-cloud architectures. Traditional middleware such as MQ, ETL, ESB does not scale well enough or only processes data in batch instead of real-time.
This presentation explores why Apache Kafka is the new black for integration projects, how Kafka fits into the discussion around cloud-native iPaaS (Integration Platform as a Service) solutions, and why event streaming is a new software category.
A concrete real-world example shows the difference between event streaming and traditional integration platforms or cloud-native iPaaS.
Video Recording of this presentation:
https://www.youtube.com/watch?v=I8yZwKg_IJc&t=2842s
Blog post about this topic:
https://www.kai-waehner.de/blog/2021/11/03/apache-kafka-cloud-native-ipaas-versus-mq-etl-esb-middleware/
Amazon Web Services (AWS) provides many options for running hyperscale SAP solutions in the cloud. Additionally, the majority of SAP on-premise products can be run on AWS, including all major RDBMS platforms and Windows/Linux OS platforms. Join this webinar to discover more about these capabilities and services, and how you can use them to deploy your SAP estate on AWS. You can also learn the benefits of running your SAP workloads on AWS and how our customers have leveraged this to achieve real-world business benefits.
Learning Objectives:
- Overview of SAP solutions supported on AWS
- Overview of core AWS services for SAP workloads
- Benefits of deploying SAP on AWS
- Examples of how customers are running SAP workloads on AWS
The Fourth Industrial Revolution (also known as Industry 4.0) is the ongoing automation of traditional manufacturing and industrial practices, using modern smart technology.
Event Streaming with Apache Kafka plays a key role in processing massive volumes of data in real-time in a reliable, scalable, and flexible way, integrating with various legacy and modern data sources and sinks.
In this presentation, I want to give you an overview of existing use cases for event streaming technology in a connected world across supply chains, industries and customer experiences that come along with these interdisciplinary data intersections:
• The Automotive Industry (and it’s not only Connected Cars)
• Mobility Services across verticals (transportation, logistics, travel industry, retailing, …)
• Smart Cities (including citizen health services, communication infrastructure, …)
None of these industries and sectors has fundamentally new characteristics or requirements. They still require data integration, data correlation, and real decoupling, to name a few, but now face massively increased volumes of data.
Real-time messaging solutions have existed for many years. Hundreds of platforms exist for data integration (including ETL and ESB tooling or specific IIoT platforms). Proprietary monoliths have monitored plants, telco networks, and other infrastructures in real-time for decades. But now, Kafka combines all the above characteristics in an open, scalable, and flexible infrastructure to operate mission-critical workloads at scale in real-time, and it is taking over the world of connecting data.
IoT Architectures for Apache Kafka and Event Streaming - Industry 4.0, Digita... - Kai Wähner
The Internet of Things (IoT) is getting more and more traction as valuable use cases come to light. Whether you are in Healthcare, Telecommunications, Manufacturing, Banking or Retail to name a few industries, there is one key challenge and that's the integration of backend IoT data logs and applications, business services and cloud services to process the data in real time and at scale.
In this talk, we will be sharing how Kafka has become the leading technology used throughout the business to provide Real Time Event Streaming. Explore real life use cases of Kafka Connect, Kafka Streams and KSQL independent of the data deployment be it on a private or public Cloud, On Premise or at the Edge.
Audi - Connected car infrastructure
Robert Bosch Power Tools - Track and Trace of devices and people at construction areas
Deutsche Bahn - Customer 360 for train timetable updates
E.ON - IoT Streaming Platform to integrate and build smart home, smart building and smart grid infrastructures
Kafka Streams vs. KSQL for Stream Processing on top of Apache Kafka - Kai Wähner
Spoilt for Choice – Kafka Streams vs. KSQL for Stream Processing on top of Apache Kafka:
Apache Kafka is the de facto standard platform for processing streaming data and is widely deployed as an event streaming platform. Part of Kafka is its stream processing API “Kafka Streams”. In addition, the Kafka ecosystem now offers KSQL, a declarative, SQL-like stream processing language that lets you define powerful stream-processing applications easily. What once took some moderately sophisticated Java code can now be done at the command line with a familiar and eminently approachable syntax.
This session discusses and demos the pros and cons of Kafka Streams and KSQL to understand when to use which stream processing alternative for continuous stream processing natively on Apache Kafka infrastructures. The end of the session compares the trade-offs of Kafka Streams and KSQL to separate stream processing frameworks such as Apache Flink or Spark Streaming.
Organizations need to perform increasingly complex analysis on data — streaming analytics, ad-hoc querying, and predictive analytics — in order to get better customer insights and actionable business intelligence. Apache Spark has recently emerged as the framework of choice to address many of these challenges. In this session, we show you how to use Apache Spark on AWS to implement and scale common big data use cases such as real-time data processing, interactive data science, predictive analytics, and more. We will talk about common architectures, best practices to quickly create Spark clusters using Amazon EMR, and ways to integrate Spark with other big data services in AWS.
Learning Objectives:
• Learn why Spark is great for ad-hoc interactive analysis and real-time stream processing.
• How to deploy and tune scalable clusters running Spark on Amazon EMR.
• How to use EMR File System (EMRFS) with Spark to query data directly in Amazon S3.
• Common architectures to leverage Spark with Amazon DynamoDB, Amazon Redshift, Amazon Kinesis, and more.
OpenText Archive Center 16.2 Single File Vendor Interface (VI) using Microsoft Azure Storage Account as a storage device is now supported on Linux. Check out this brief overview of its usage on one of our current projects. Thanks to Manish Shah (Microsoft) for his contribution and working with OpenText to achieve support on Linux, to Supriya Pande for her article on the Microsoft Azure Storage Explorer, to Oleh Khrypko (SAP) for his input on handling disaster recovery on OpenText Archive Center, and to Gary Jackson (Aliter Consulting) for the article.
Data Streaming with Apache Kafka in the Defence and Cybersecurity Industry - Kai Wähner
Agenda:
1) Defence, Modern Warfare, and Cybersecurity in 202X
2) Data in Motion with Apache Kafka as Defence Backbone
3) Situational Awareness
4) Threat Intelligence
5) Forensics and AI / Machine Learning
6) Air-Gapped and Zero Trust Environments
7) SIEM / SOAR Modernization
Technologies discussed in the presentation include Apache Kafka, Kafka Streams, ksqlDB, Kafka Connect, Elasticsearch, Splunk, IBM QRadar, Zeek, Netflow, PCAP, TensorFlow, AWS, Azure, GCP, Sigma, Confluent Cloud,
Mainframe Integration, Offloading and Replacement with Apache Kafka | Kai Wae... - HostedbyConfluent
Legacy migration is a journey. Mainframes cannot be replaced in a single project. A big bang will fail. This has to be planned long-term.
Mainframe offloading and replacement with Apache Kafka and its ecosystem can be used to keep a more modern data store in real-time sync with the mainframe, while at the same time persisting the event data on the bus to enable microservices, and deliver the data to other systems such as data warehouses and search indexes.
This session walks through the different steps some companies have already gone through. Technical options like Change Data Capture (CDC), MQ, and third-party tools for mainframe integration, offloading and replacement are explored.
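To make the offloading pattern more concrete, here is a minimal sketch of how a stream of change events could keep a replica data store in sync while the events themselves stay available for other consumers. It is an illustration under stated assumptions, not the output format of any particular CDC tool: the event fields (`op`, `key`, `row`) and the functions `apply_change` and `offload` are hypothetical, and plain Python containers stand in for Kafka and the target database.

```python
# Hypothetical change events; the field names are assumptions for this sketch
# and do not follow the format of any specific CDC product.
change_events = [
    {"op": "insert", "key": 1, "row": {"name": "Alice", "balance": 100}},
    {"op": "insert", "key": 2, "row": {"name": "Bob", "balance": 50}},
    {"op": "update", "key": 1, "row": {"name": "Alice", "balance": 80}},
    {"op": "delete", "key": 2, "row": None},
]

def apply_change(store, event):
    """Apply one change event to the replica store (a plain dict here)."""
    if event["op"] in ("insert", "update"):
        store[event["key"]] = event["row"]
    elif event["op"] == "delete":
        store.pop(event["key"], None)
    else:
        raise ValueError(f"unknown operation: {event['op']}")
    return store

def offload(events):
    """Apply events in order to a replica and retain them for other consumers."""
    replica = {}
    retained_log = []
    for event in events:
        retained_log.append(event)  # the event stays available on the bus
        apply_change(replica, event)
    return replica, retained_log

replica, retained_log = offload(change_events)
```

After the run, the replica holds only the current state, while the retained log still contains every change, which is what lets additional systems such as a data warehouse or search index consume the same events later.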
Unleashing Apache Kafka and TensorFlow in Hybrid Cloud Architectures - Kai Wähner
Talk at Strata Conference in London: Unleashing Apache Kafka and TensorFlow in Hybrid Cloud Architectures with Confluent:
How do you leverage the flexibility and extreme scale of the public cloud and the Apache Kafka ecosystem to build scalable, mission-critical machine learning infrastructures that span multiple public clouds—or bridge your on-premises data centre to the cloud?
Join Kai Wähner to learn how to use technologies such as TensorFlow with Kafka’s open source ecosystem for machine learning infrastructures. You’ll learn how to build a scalable, mission-critical machine learning infrastructure for data ingestion and processing, model training, deployment, and monitoring.
The discussed architecture includes capabilities like scalable data preprocessing for training and predictions, a combination of different deep learning frameworks, data replication between data centers, intelligent real-time microservices running on Kubernetes, and local deployment of analytic models for offline predictions.
Learn how the public cloud allows extreme scale for building analytic models and how the Apache Kafka open source ecosystem enables building a cloud-independent infrastructure for preprocessing and ingestion of data and inference and monitoring of analytic models in real time
Understand why hybrid architectures and local model deployment are key for success in many scenarios and why you need a flexible machine learning architecture that supports different technologies and frameworks
Supply Chain Optimization with Apache Kafka - Kai Wähner
Supply Chain optimization leveraging Event Streaming with Apache Kafka. See real-world use cases and architectures from Walmart, BMW, Porsche, and other enterprises to improve the Supply Chain Management (SCM) processes. Automation, robustness, flexibility, real-time, decoupling, data integration, and hybrid deployments...
Video recording: https://youtu.be/dUkgungBmPs
Blog post: https://www.kai-waehner.de/apache-kafka-supply-chain-management-scm-optimization-scor-six-sigma-real-time
Can Apache Kafka Replace a Database? – The 2021 Update | Kai Waehner, Confluent - HostedbyConfluent
Can and should Apache Kafka replace a database? How long can and should I store data in Kafka? How can I query and process data in Kafka? These are common questions that come up more and more. This session explains the idea behind databases and different features like storage, queries, transactions, and processing to evaluate when Kafka is a good fit, and when it is not. The discussion includes different Kafka-native add-ons like Tiered Storage for long-term, cost-efficient storage, and ksqlDB as an event streaming database. The relation and trade-offs between Kafka and other databases are explored to complement each other instead of thinking about a replacement. This includes different options for pull and push-based bi-directional integration.
Apache Kafka for Real-time Supply Chain in the Food and Retail Industry - Kai Wähner
Use Cases, Architectures, and Real-World Examples for data in motion and real-time event streaming powered by Apache Kafka across the supply chain and logistics. Case studies and deployments include Baader, Walmart, Migros, Albertsons, Domino's Pizza, Instacart, Grab, Royal Caribbean, and more.
Architecture patterns for distributed, hybrid, edge and global Apache Kafka d... - Kai Wähner
Architecture patterns for distributed, hybrid, edge and global Apache Kafka deployments
Multi-cluster and cross-data center deployments of Apache Kafka have become the norm rather than an exception. This session gives an overview of several scenarios that may require multi-cluster solutions and discusses real-world examples with their specific requirements and trade-offs, including disaster recovery, aggregation for analytics, cloud migration, mission-critical stretched deployments and global Kafka.
Key takeaways:
In many scenarios, one Kafka cluster is not enough. Understand different architectures and alternatives for multi-cluster deployments.
Zero data loss and high availability are two key requirements. Understand how to realize this, including trade-offs.
Learn about features and limitations of Kafka for multi-cluster deployments.
Global Kafka and mission-critical multi-cluster deployments with zero data loss and high availability have become the norm, not the exception.
Service Mesh with Apache Kafka, Kubernetes, Envoy, Istio and Linkerd - Kai Wähner
Microservice architectures are not a free lunch! Microservices need to be decoupled, flexible, operationally transparent, data aware and elastic. Most material from recent years only discusses point-to-point architectures with inflexible and non-scalable technologies like REST / HTTP. This video takes a look at cutting edge technologies like Apache Kafka, Kubernetes, Envoy, Linkerd and Istio to implement a cloud-native service mesh to solve these challenges and bring microservices to the next level of scale, speed and efficiency.
Key takeaways:
- Apache Kafka decouples services, including event streams and request-response
- Kubernetes provides a cloud-native infrastructure for the Kafka ecosystem
- Service Mesh helps with security and observability at ecosystem / organization scale
- Envoy and Istio sit in the layer above Kafka and are orthogonal to the goals Kafka addresses
Blog post: http://www.kai-waehner.de/blog/2019/09/24/cloud-native-apache-kafka-kubernetes-envoy-istio-linkerd-service-mesh
Video recording of this slide deck: https://youtu.be/Us_C4RFOUrA
Apache Kafka in the Transportation and Logistics - Kai Wähner
Event Streaming with Apache Kafka in the transportation and logistics industry.
Track & Trace, Real-time Locating System, Customer 360, Open API, and more…
Examples include Swiss Post, SBB, Deutsche Bahn, Hermes, Migros, Here Technologies, Otonomo, Lyft, Uber, Free Now, Lufthansa, Air France, Singapore Airlines, Amadeus Group, and more.
Can and should Apache Kafka replace a database? How long can and should I store data in Kafka? How can I query and process data in Kafka? These are common questions that come up more and more. This session explains the idea behind databases and different features like storage, queries, transactions, and processing to evaluate when Kafka is a good fit and when it is not.
The discussion includes different Kafka-native add-ons like Tiered Storage for long-term, cost-efficient storage and ksqlDB as event streaming database. The relation and trade-offs between Kafka and other databases are explored to complement each other instead of thinking about a replacement. This includes different options for pull and push-based bi-directional integration.
Key takeaways:
- Kafka can store data forever in a durable and highly available manner
- Kafka has different options to query historical data
- Kafka-native add-ons like ksqlDB or Tiered Storage make Kafka more powerful than ever before to store and process data
- Kafka does not provide database-style ACID transactions, but it does offer exactly-once semantics
- Kafka is not a replacement for existing databases like MySQL, MongoDB or Elasticsearch
- Kafka and other databases complement each other; the right solution has to be selected for a problem
- Different options are available for bi-directional pull and push-based integration between Kafka and databases to complement each other
Video Recording:
https://youtu.be/7KEkWbwefqQ
Blog post:
https://www.kai-waehner.de/blog/2020/03/12/can-apache-kafka-replace-database-acid-storage-transactions-sql-nosql-data-lake/
Apache Kafka and API Management / API Gateway – Friends, Enemies or Frenemies... - HostedbyConfluent
Microservices became the new black in enterprise architectures. APIs provide functions to other applications or end users. Even if your architecture uses another pattern than microservices, like SOA (Service-Oriented Architecture) or Client-Server communication, APIs are used between the different applications and end users.
Apache Kafka plays a key role in modern microservice architectures to build open, scalable, flexible and decoupled real time applications. API Management complements Kafka by providing a way to implement and govern the full life cycle of the APIs.
This session explores how event streaming with Apache Kafka and API Management (including API Gateway and Service Mesh technologies) complement and compete with each other depending on the use case and point of view of the project team. The session concludes exploring the vision of event streaming APIs instead of RPC calls.
Apache Kafka and API Management / API Gateway – Friends, Enemies or Frenemies? - Kai Wähner
Understand how event streaming with Kafka and Confluent complements tools and frameworks such as Kong, Mulesoft, Apigee, Envoy, Istio, Linkerd, Software AG, TIBCO Mashery, IBM, Axway, etc.
A Streaming API Data Exchange provides streaming replication between business units and companies. API Management with REST/HTTP is not appropriate for streaming data.
Resilient Real-time Data Streaming across the Edge and Hybrid Cloud with Apac... - Kai Wähner
Hybrid cloud architectures are the new black for most companies. A cloud-first strategy is evident for many new enterprise architectures, but some use cases require resiliency across edge sites and multiple cloud regions. Data streaming with the Apache Kafka ecosystem is a perfect technology for building resilient and hybrid real-time applications at any scale. This talk explores different architectures and their trade-offs for transactional and analytical workloads. Real-world examples include financial services, retail, and the automotive industry.
Video recording:
https://qconlondon.com/london2022/presentation/resilient-real-time-data-streaming-across-the-edge-and-hybrid-cloud
No Fear of the Dinosaur: Mainframe Integration and Offloading with Confl... - Precisely
Mainframes are still in widespread use and process more than 70 percent of the world's most important computing transactions every day. Very high costs, monolithic architectures, and a shortage of experts are the biggest challenges for mainframe applications. It is time to become more innovative, even with the mainframe! Let's stand up to the dinosaur together!
Mainframe offloading with Confluent, Apache Kafka, and the surrounding ecosystem can be used to keep modern data infrastructures in real-time sync with the mainframe. Kafka enables both data processing and integration with systems such as data warehouses and analytics platforms. Change Data Capture (CDC) can continuously push high-volume mainframe changes to Kafka.
In this on-demand presentation, Confluent and Precisely show how companies take this step toward legacy migration, save costs, create a scalable and open architecture, and thereby enable new services and applications.
Event streaming: A paradigm shift in enterprise software architecture - Sina Sojoodi
This talk helps developers and architects understand the benefits, opportunities and challenges in moving from traditional point-to-point integration in application architecture to one with event streaming. Apache Kafka and Spring provide a solid foundation for enterprise and large organizations to implement event streaming solutions. Examples and common patterns are covered towards the end.
Many thanks to James Watters and all the original content authors, editors and aggregators referenced in the slides.
Building a Secure, Tamper-Proof & Scalable Blockchain on Top of Apache Kafka ... - confluent
Apache Kafka is an open source event streaming platform. It is often used to complement or even replace existing middleware to integrate applications and build microservice architectures. Apache Kafka is already used in various projects in almost every bigger company today. It is well understood, battle-tested, highly scalable, reliable, and real-time.
Blockchain is a different story. This technology is in the news a lot, especially in relation to cryptocurrencies like Bitcoin. But what is the added value for software architectures? Is blockchain just hype that adds complexity? Or will it be used by everybody in the future, like a web browser or mobile app today? And how is it related to an integration architecture and event streaming platform?
This session explores use cases for blockchains and discusses different alternatives such as Hyperledger, Ethereum and a Kafka-native tamper-proof blockchain implementation. Different architectures are discussed to understand when blockchain really adds value and how it can be combined with the Apache Kafka ecosystem to integrate blockchain with the rest of the enterprise architecture to build a highly scalable and reliable event streaming infrastructure.
Speakers:
Kai Waehner, Technology Evangelist, Confluent
Stephen Reed, CTO, Co-Founder, AiB
App modernization on AWS with Apache Kafka and Confluent Cloud - Kai Wähner
Presentation from AWS re:Invent 2020.
Learn how you can accelerate application modernization and benefit from the open-source Apache Kafka ecosystem by connecting your legacy, on-premises systems to the cloud. In this session, hear real customer stories about timely insights gained from event-driven applications built on an event streaming platform from Confluent Cloud running on AWS, which stores and processes historical data and real-time data streams. Confluent makes Apache Kafka enterprise-ready using infinite Kafka storage with Amazon S3 and multiple private networking options including AWS PrivateLink, along with self-managed encryption keys for storage volume encryption with AWS Key Management Service (AWS KMS).
Apache Kafka, Tiered Storage and TensorFlow for Streaming Machine Learning wi... - Kai Wähner
Don’t underestimate the Hidden Technical Debt in Machine Learning Systems.
Leverage Apache Kafka’s open ecosystem as a scalable and flexible Event Streaming Platform to build one pipeline for real-time and batch use cases.
Use Streaming Machine Learning with Apache Kafka, Tiered Storage, and TensorFlow IO to simplify your big data architecture.
Tiered Storage for Kafka provides:
- one platform for all data processing
- an event-based source of truth for materialized views
- no need for a pipeline between Kafka and a Data Lake like Hadoop
Benefits:
- cost reduction
- long-term backup
- performance isolation (real-time and historical analysis in the same cluster)
Use Cases for Reprocessing Historical Events:
- New consumer application
- Error-handling
- Compliance / regulatory processing
- Query and analyze existing events
- Model training
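The reprocessing use cases listed above rely on one property: a consumer can read from an old offset without caring where the data physically lives. The following toy sketch illustrates that idea only. It is not how Kafka's Tiered Storage is implemented; the class `TieredLog`, its `hot_capacity` parameter, and the two Python lists standing in for local disk and object storage are all assumptions made for illustration.

```python
class TieredLog:
    """Toy log that keeps recent records 'hot' and moves older ones to a 'cold' tier."""

    def __init__(self, hot_capacity):
        self.hot_capacity = hot_capacity
        self.cold = []  # stand-in for inexpensive long-term storage
        self.hot = []   # stand-in for fast broker-local storage

    def append(self, record):
        """Append a record and return its offset in the overall log."""
        self.hot.append(record)
        # Move the oldest hot records to the cold tier once capacity is exceeded.
        while len(self.hot) > self.hot_capacity:
            self.cold.append(self.hot.pop(0))
        return len(self.cold) + len(self.hot) - 1

    def read(self, from_offset=0):
        """Read across both tiers; the caller does not see where records live."""
        return (self.cold + self.hot)[from_offset:]


log = TieredLog(hot_capacity=2)
for i in range(5):
    log.append(f"event-{i}")

# A new consumer application or a model-training job replays from offset 0,
# while a real-time consumer reads only the most recent records.
all_events = log.read()
recent_events = log.read(from_offset=3)
```

The same `read` call serves both the historical and the real-time consumer, which is the property the list above depends on.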
Similar to Mainframe Integration, Offloading and Replacement with Apache Kafka (20)
Apache Kafka as Data Hub for Crypto, NFT, Metaverse (Beyond the Buzz!) - Kai Wähner
Decentralized finance with crypto and NFTs is a huge topic these days. It becomes a powerful combination with the coming metaverse platforms across industries. This session explores the relationship between crypto technologies and modern enterprise architecture.
I discuss how data streaming and Apache Kafka help build innovative and scalable real-time applications for a future metaverse. Let's skip the buzz (and NFT bubble) and instead review existing real-world deployments in the crypto and blockchain world powered by Kafka and its ecosystem.
Apache Kafka is the de facto standard for data streaming to process data in motion. With its significant adoption growth across all industries, I get a very valid question every week: When NOT to use Apache Kafka? What limitations does the event streaming platform have? When does Kafka simply not provide the needed capabilities? How do you qualify Kafka out when it is not the right tool for the job?
This session explores the DOs and DONTs. Separate sections explain when to use Kafka, when NOT to use Kafka, and when to MAYBE use Kafka.
No matter if you think about open source Apache Kafka, a cloud service like Confluent Cloud, or another technology using the Kafka protocol like Redpanda or Pulsar, check out this slide deck.
A detailed article about this topic:
https://www.kai-waehner.de/blog/2022/01/04/when-not-to-use-apache-kafka/
Kafka for Live Commerce to Transform the Retail and Shopping Metaverse - Kai Wähner
Live commerce combines instant purchasing of a featured product and audience participation.
This talk explores the need for real-time data streaming with Apache Kafka between applications to enable live commerce across online stores and brick & mortar stores across regions, countries, and continents in any retail business.
The discussion covers several building blocks of a live commerce enterprise architecture, including transactional data processing, omnichannel, natural language processing, augmented reality, edge computing, and more.
The Heart of the Data Mesh Beats in Real-Time with Apache Kafka - Kai Wähner
If there were a buzzword of the hour, it would certainly be "data mesh"! This new architectural paradigm unlocks analytic data at scale and enables rapid access to an ever-growing number of distributed domain datasets for various usage scenarios.
As such, the data mesh addresses the most common weaknesses of the traditional centralized data lake or data platform architecture. And the heart of a data mesh infrastructure must be real-time, decoupled, reliable, and scalable.
This presentation explores how Apache Kafka, as an open and scalable decentralized real-time platform, can be the basis of a data mesh infrastructure and - complemented by many other data platforms like a data warehouse, data lake, and lakehouse - solve real business problems.
There is no silver bullet or single technology/product/cloud service for implementing a data mesh. The key outcome of a data mesh architecture is the ability to build data products with the right tool for the job.
A good data mesh combines data streaming technology like Apache Kafka or Confluent Cloud with cloud-native data warehouse and data lake architectures from Snowflake, Databricks, Google BigQuery, et al.
Data Warehouse vs. Data Lake vs. Data Streaming – Friends, Enemies, Frenemies? (Kai Wähner)
The concepts and architectures of a data warehouse, a data lake, and data streaming complement each other in solving business problems.
Unfortunately, the underlying technologies are often misunderstood, overused for monolithic and inflexible architectures, and pitched for wrong use cases by vendors. Let’s explore this dilemma in a presentation.
The slides cover technologies such as Apache Kafka, Apache Spark, Confluent, Databricks, Snowflake, Elasticsearch, AWS Redshift, GCP with Google BigQuery, and Azure Synapse.
Serverless Kafka and Spark in a Multi-Cloud Lakehouse Architecture (Kai Wähner)
Apache Kafka in conjunction with Apache Spark became the de facto standard for processing and analyzing data. Both frameworks are open, flexible, and scalable.
Unfortunately, the latter makes operations a challenge for many teams. Ideally, teams can use serverless SaaS offerings to focus on business logic. However, hybrid and multi-cloud scenarios require a cloud-native platform that provides automated and elastic tooling to reduce the operations burden.
This session explores different architectures to build serverless Apache Kafka and Apache Spark multi-cloud architectures across regions and continents.
We start from the analytics perspective of a data lake and explore its relation to a fully integrated data streaming layer with Kafka to build a modern Data Lakehouse.
Real-world use cases show the joint value and explore the benefit of the "delta lake" integration.
Real-World Deployments of Data Streaming with Apache Kafka across the Healthcare Value Chain using open source and cloud-native technologies and serverless SaaS:
1) Legacy Modernization and Hybrid Cloud (Optum / UnitedHealth Group, Centene, Bayer)
2) Streaming ETL (Bayer, Babylon Health)
3) Real-time Analytics (Cerner, Celmatix, CDC/Centers for Disease Control and Prevention)
4) Machine Learning and Data Science (Recursion, Humana)
5) Open API and Omnichannel (Care.com, Invitae)
Kafka for Real-Time Replication between Edge and Hybrid Cloud (Kai Wähner)
Not all workloads allow cloud computing. Low latency, cybersecurity, and cost-efficiency require a suitable combination of edge computing and cloud integration.
This session explores architectures and design patterns for software and hardware considerations to deploy hybrid data streaming with Apache Kafka anywhere. A live demo shows data synchronization from the edge to the public cloud across continents with Kafka on Hivecell and Confluent Cloud.
Apache Kafka for Predictive Maintenance in Industrial IoT / Industry 4.0 (Kai Wähner)
The manufacturing industry is moving away from just selling machinery, devices, and other hardware. Software and services increase revenue and margins. Equipment-as-a-Service (EaaS) even outsources the maintenance to the vendor.
This paradigm shift is only possible with reliable and scalable real-time data processing leveraging an event streaming platform such as Apache Kafka. This talk explores how Kafka-native Condition Monitoring and Predictive Maintenance help with this innovation.
More details:
https://www.kai-waehner.de/blog/2021/10/25/apache-kafka-condition-monitoring-predictive-maintenance-industrial-iot-digital-twin/
Video recording:
https://youtu.be/tfOuN5KeI9w
Apache Kafka Landscape for Automotive and Manufacturing (Kai Wähner)
Today, in 2022, Apache Kafka is the central nervous system of many applications in various areas related to the automotive and manufacturing industry for processing analytical and transactional data in motion across edge, hybrid, and multi-cloud deployments.
This presentation explores the automotive event streaming landscape, including connected vehicles, smart manufacturing, supply chain optimization, aftersales, mobility services, and innovative new business models.
Afterwards, many real-world examples are shown from companies such as Audi, BMW, Porsche, Tesla, Uber, Grab, and FREENOW.
More detail in the blog post:
https://www.kai-waehner.de/blog/2022/01/12/apache-kafka-landscape-for-automotive-and-manufacturing/
Event Streaming CTO Roundtable for Cloud-native Kafka Architectures (Kai Wähner)
Technical thought leadership presentation to discuss how leading organizations move to real-time architecture to support business growth and enhance customer experience. This is a forum to discuss use cases with your peers to understand how other digital-native companies are utilizing data in motion to drive competitive advantage.
Agenda:
- Data in Motion with Event Streaming and Apache Kafka
- Streaming ETL Pipelines
- IT Modernisation and Hybrid Multi-Cloud
- Customer Experience and Customer 360
- IoT and Big Data Processing
- Machine Learning and Analytics
Telco 4.0 - Payment and FinServ Integration for Data in Motion with 5G and Ap... (Kai Wähner)
The Era of Telco 4.0: Embracing Digital Transformation with Data in Motion. Learn about Payment and FinServ Integration for Data in Motion with 5G and Apache Kafka.
1) The rise of Telco 4.0 and the future forward
2) Data in Motion in the Telco industry
3) Real-world Fintech and Payment examples powered by Data in Motion
Apache Kafka for Cybersecurity and SIEM / SOAR Modernization (Kai Wähner)
Data in Motion powered by the Apache Kafka ecosystem for Situational Awareness, Threat Detection, Forensics, Zero Trust Zones and Air-Gapped Environments.
Agenda:
1) Cybersecurity in 202X
2) Data in Motion as Cybersecurity Backbone
3) Situational Awareness
4) Threat Intelligence
5) Forensics
6) Air-Gapped and Zero Trust Environments
7) SIEM / SOAR Modernization
More details in the "Kafka for Cybersecurity" blog series:
https://www.kai-waehner.de/blog/2021/07/02/kafka-cybersecurity-siem-soar-part-1-of-6-data-in-motion-as-backbone/
Apache Kafka and MQTT - Overview, Comparison, Use Cases, Architectures (Kai Wähner)
Apache Kafka and MQTT are a perfect combination for many IoT use cases. This presentation covers the pros and cons of both technologies. Various use cases across industries, including connected vehicles, manufacturing, mobility services, and smart city are explored. The examples use different architectures, including lightweight edge scenarios, hybrid integrations, and serverless cloud solutions.
Blog series with more details here:
https://www.kai-waehner.de/blog/2021/03/15/apache-kafka-mqtt-sparkplug-iot-blog-series-part-1-of-5-overview-comparison/
Connected Vehicles and V2X with Apache Kafka (Kai Wähner)
This session discusses use cases leveraging the Apache Kafka open source ecosystem as a streaming platform to process IoT data.
See use cases, architectural alternatives and a live demo of how devices connect to Kafka via MQTT. Learn how to analyze the IoT data either natively on Kafka with Kafka Streams/KSQL, or on an external big data cluster like Spark, Flink or Elastic leveraging Kafka Connect, and how to leverage TensorFlow for Machine Learning.
The focus is on connected cars / connected vehicles, V2X use cases, and mobility services.
A live demo shows how to build a cloud-native IoT infrastructure on Kubernetes that connects and processes streaming data from 100,000 cars for predictive maintenance at scale in real time.
Code for the live demo on Github:
https://github.com/kaiwaehner/hivemq-mqtt-tensorflow-kafka-realtime-iot-machine-learning-training-inference
Mainframe Integration, Offloading and Replacement with Apache Kafka
1. Mainframe Integration, Offloading
and Replacement with Apache Kafka
Stand up to the Dinosaur!
Kai Waehner
Technology Evangelist
contact@kai-waehner.de
LinkedIn
@KaiWaehner
www.confluent.io
www.kai-waehner.de
2. Mainframe Offloading and Replacement with Apache Kafka – @KaiWaehner - www.kai-waehner.de
Agenda
1. Mainframe Status Quo and Challenges
2. Use Cases for Event Streaming
3. Apache Kafka as Cloud-native and Hybrid Infrastructure
4. Mainframe Integration, Offloading and Replacement
5. Case Study
3. Mainframe Offloading and Replacement with Apache Kafka – @KaiWaehner - www.kai-waehner.de
Agenda
1. Mainframe Status Quo and Challenges
2. Use Cases for Event Streaming
3. Apache Kafka as Cloud-native and Hybrid Infrastructure
4. Mainframe Integration, Offloading and Replacement
5. Case Study
4. Mainframe Offloading and Replacement with Apache Kafka – @KaiWaehner - www.kai-waehner.de
The Mainframe is here to stay!
“Mainframes are still hard at work, processing over 70 percent of the world’s most important computing transactions every day. Organizations like banks, credit card companies, airlines, medical facilities, insurance companies, and others that can absolutely not afford downtime and errors depend on the mainframe to get the job done. Nearly three-quarters of all Fortune 500 companies still turn to the mainframe to get the critical processing work completed”
https://www.bmc.com/blogs/mainframe-mips-an-introduction/
5. Mainframe Offloading and Replacement with Apache Kafka – @KaiWaehner - www.kai-waehner.de
What is a Mainframe?
Modern mainframe design is characterized less by raw computational speed and more by:
• High reliability and security
• Extensive input-output ("I/O") facilities with the ability to offload to separate engines
• Strict backward compatibility with older software
• High hardware and computational utilization rates through virtualization to support massive throughput
• Hot-swapping of hardware, such as processors and memory
Vendors: “IBM and the Seven Dwarfs”
The IBM z15, announced in 2019, with up to 40TB RAM and 190 Cores, typically costs millions $$$ (variable software costs not included)
6. Mainframe Offloading and Replacement with Apache Kafka – @KaiWaehner - www.kai-waehner.de
COBOL (COmmon Business Oriented Language)
• Designed in 1959. Latest standard at the time of this talk: COBOL 2014.
• Compiled English-like computer programming language designed for business use
• From column-based punch-card-image format to object-oriented programming
• Concerns: Lack of structure, compatibility issues, verbose syntax, isolation from computer science community
7. Mainframe Offloading and Replacement with Apache Kafka – @KaiWaehner - www.kai-waehner.de
Neobanks and FinTechs Hunting the Traditional Banks
Monolithic
Proprietary
Complex
Inflexible
8. Mainframe Offloading and Replacement with Apache Kafka – @KaiWaehner - www.kai-waehner.de
How long does it take to deploy a new feature to the mainframe in your organization?
“N26 is a relatively young company, so we take advantage of a modern tech stack… We typically practice continuous deployment, so every merge goes through rigorous automated testing and gets deployed to live”
https://www.infoq.com/news/2020/01/scaling-infrastructure-code/
9. Mainframe Offloading and Replacement with Apache Kafka – @KaiWaehner - www.kai-waehner.de
Not Just the FinTechs Modernize Their Architecture!
Sberbank re-implemented their Core Banking Platform around Kafka and Event Streaming to be ready for the future!
https://www.confluent.io/kafka-summit-london18/distributing-computing-key-player-in-corebanking-platforms/
10. Mainframe Offloading and Replacement with Apache Kafka – @KaiWaehner - www.kai-waehner.de
The mainframe supports up-to-date technologies such as DB2, MQ, WebSphere, Java, Linux, Web Services, Kubernetes, Ansible!
https://www.zazzle.com.au/cobol+dinosaur+gifts
11. Mainframe Offloading and Replacement with Apache Kafka – @KaiWaehner - www.kai-waehner.de
Deploy Kafka and its Ecosystem on the IBM z15 Mainframe…
12. Mainframe Offloading and Replacement with Apache Kafka – @KaiWaehner - www.kai-waehner.de
MIPS (million instructions per second): used to normalize CPU usage across CPU types and models or hardware configs
MSU (million service units): hardware and software metrics calculated directly by the operating system
… and what about hiring mainframe experts?
13. Mainframe Offloading and Replacement with Apache Kafka – @KaiWaehner - www.kai-waehner.de
Huge demand to build an open, flexible, scalable platform
• Real time
• Scalability
• High availability
• Decoupling
• Cost reduction
• Flexibility
• Standards-based
• Extensibility
• Security
• Infrastructure-independent
• Multi-region / global
14. Mainframe Offloading and Replacement with Apache Kafka – @KaiWaehner - www.kai-waehner.de
Agenda
1. Mainframe Status Quo and Challenges
2. Use Cases for Event Streaming
3. Apache Kafka as Cloud-native and Hybrid Infrastructure
4. Mainframe Integration, Offloading and Replacement
5. Case Study
15. Mainframe Offloading and Replacement with Apache Kafka – @KaiWaehner - www.kai-waehner.de
Event Streaming in the Finance Industry
Check past Kafka Summit videos for details about the use cases:
https://kafka-summit.org/past-events/
16. Mainframe Offloading and Replacement with Apache Kafka – @KaiWaehner - www.kai-waehner.de
Event Streaming for Traditional and New Innovative Use Cases in the Finance Industry
Categories: Real Time Processing, Digital Transformation, Strategic Goals
• Short-Sale Risk Calculation / Trade Approval
• Mainframe Offloading and Replacement
• Credit Card Fraud Detection
• Next-Best Offer
• Robot Process Automation (e.g. Know Your Customer, KYC)
• Customer Service (e.g. Chat Bots)
• IT Modernization
• Regulatory Reporting
• Account Login Fraud Detection
• Anomaly Detection Across Assets and Locations
• Derivatives Pricing
• Compliance
• Trading Post-Processing
• Strategic Planning and Simulations
17. Mainframe Offloading and Replacement with Apache Kafka – @KaiWaehner - www.kai-waehner.de
“… rescue data off of the mainframe, in a cloud native, microservice-based fashion … [to] … significantly reduce the reads on the mainframe, saving RBC fixed infrastructure costs (OPEX). RBC stayed compliant with bank regulations and business logic, and is now able to create new applications using the same event-based architecture.”
https://www.confluent.io/customers/rbc/
18. Mainframe Offloading and Replacement with Apache Kafka – @KaiWaehner - www.kai-waehner.de
Agenda
1. Mainframe Status Quo and Challenges
2. Use Cases for Event Streaming
3. Apache Kafka as Cloud-native and Hybrid Infrastructure
4. Mainframe Integration, Offloading and Replacement
5. Case Study
19. Mainframe Offloading and Replacement with Apache Kafka – @KaiWaehner - www.kai-waehner.de
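Event Streaming Platform – The Commit Log
[Figure: an append-only commit log along a time axis, with a producer (P) writing and consumers C1, C2, and C3 reading]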
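The commit log idea on this slide can be illustrated with a few lines of code. The following is a minimal in-memory sketch, not Kafka's implementation: the class names `CommitLog` and `Consumer` and the sample events are hypothetical. It only shows that records are appended once and that each consumer keeps its own read position.

```python
from dataclasses import dataclass, field


@dataclass
class CommitLog:
    """Append-only log; reading a record does not remove it."""
    records: list = field(default_factory=list)

    def append(self, record):
        """Producer side: add a record and return its offset."""
        self.records.append(record)
        return len(self.records) - 1

    def read(self, offset, max_records=10):
        """Consumer side: return records starting at the given offset."""
        return self.records[offset:offset + max_records]


@dataclass
class Consumer:
    """Each consumer tracks its own position (offset) in the log."""
    name: str
    offset: int = 0

    def poll(self, log, max_records=10):
        batch = log.read(self.offset, max_records)
        self.offset += len(batch)
        return batch


log = CommitLog()
for event in ["deposit", "withdrawal", "transfer"]:  # hypothetical events
    log.append(event)

c1, c2 = Consumer("C1"), Consumer("C2")
first = c1.poll(log, max_records=2)  # C1 reads the first two records
everything = c2.poll(log)            # C2 independently reads all three
rest = c1.poll(log)                  # C1 resumes where it left off
```

Because consumers only move their own offset, a slow or newly added consumer does not affect the producer or the other consumers.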
20. Mainframe Offloading and Replacement with Apache Kafka – @KaiWaehner - www.kai-waehner.de
Event Streaming Platform – A Distributed System for 24/7 and Zero Data Loss
[Figure: Topic1 split into partition1 to partition4 across Broker 1 to Broker 4; each partition has three replicas, marked as Leader or Follower]
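To make the leader and follower layout concrete, here is a simplified sketch. It assumes plain round-robin placement and a naive failover rule; Kafka's real assignment and leader election are more involved (for example, they take racks and in-sync replicas into account). The function names and broker names are hypothetical.

```python
def assign_replicas(num_partitions, brokers, replication_factor):
    """Round-robin placement: the first replica of each partition acts as leader."""
    if replication_factor > len(brokers):
        raise ValueError("replication factor cannot exceed the number of brokers")
    assignment = {}
    for p in range(num_partitions):
        replicas = [brokers[(p + i) % len(brokers)] for i in range(replication_factor)]
        assignment[p + 1] = {"leader": replicas[0], "followers": replicas[1:]}
    return assignment


def fail_over(assignment, failed_broker):
    """Promote the first surviving follower where the failed broker was leader."""
    result = {}
    for partition, replicas in assignment.items():
        alive = [b for b in [replicas["leader"], *replicas["followers"]]
                 if b != failed_broker]
        result[partition] = {"leader": alive[0], "followers": alive[1:]}
    return result


brokers = ["broker1", "broker2", "broker3", "broker4"]
layout = assign_replicas(num_partitions=4, brokers=brokers, replication_factor=3)
after_failure = fail_over(layout, "broker1")
```

With four brokers and three replicas per partition, every partition still has a leader after any single broker fails, which is the property the slide title refers to.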
21. Mainframe Offloading and Replacement with Apache Kafka – @KaiWaehner - www.kai-waehner.de
An Event Streaming Platform is the Underpinning of an Event-driven Architecture
[Figure: producers (microservices, mainframes, SaaS apps, mobile) and consumers (Customer 360, real-time fraud detection, data warehouse) attached through connectors and stream processing apps to streams of real-time events: database changes, microservices events, SaaS data, customer experiences]
22. Mainframe Offloading and Replacement with Apache Kafka – @KaiWaehner - www.kai-waehner.de
Apache Kafka at Scale at Tech Giants
> 7 trillion messages / day > 6 Petabytes / day
“You name it”
* Kafka is not just used by tech giants
** Kafka is not just used for big data
23. Mainframe Offloading and Replacement with Apache Kafka – @KaiWaehner - www.kai-waehner.de
Domain-Driven Design and Decoupled Applications
[Figure: an event streaming platform (Kafka cluster, Kafka Connect, Schema Registry, audit logs, RBAC, etc.) decoupling the CRM, legacy, payment, and fraud domains. Labels: CRM integration, legacy integration, mainframe connector, custom application (Java / C++ / Go / Python / KSQL)]
24. Mainframe Offloading and Replacement with Apache Kafka – @KaiWaehner - www.kai-waehner.de
Mission-Critical
How to
deploy this 24/7,
including
Disaster Recovery?
25. Mainframe Offloading and Replacement with Apache Kafka – @KaiWaehner - www.kai-waehner.de
Disaster Recovery – RPO and RTO
RPO = Recovery Point Objective
RTO = Recovery Time Objective
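A small worked example may help to separate the two terms. RPO and RTO are targets; the sketch below computes the values actually achieved in a hypothetical outage and compares them with hypothetical targets. All timestamps, thresholds, and function names are assumptions for illustration.

```python
from datetime import datetime, timedelta


def achieved_rpo(last_replicated_at, failure_at):
    """Data-loss window: changes made after the last replicated write are lost."""
    return max(failure_at - last_replicated_at, timedelta(0))


def achieved_rto(failure_at, service_restored_at):
    """Downtime window: how long the service was unavailable."""
    return service_restored_at - failure_at


# Hypothetical outage timeline
failure = datetime(2020, 4, 24, 12, 0, 0)
rpo = achieved_rpo(datetime(2020, 4, 24, 11, 59, 30), failure)
rto = achieved_rto(failure, datetime(2020, 4, 24, 12, 15, 0))

# Hypothetical objectives: lose at most 1 minute of data, be back within 30 minutes
meets_targets = rpo <= timedelta(minutes=1) and rto <= timedelta(minutes=30)
```

An RPO of zero means no acknowledged write may be lost, which in practice requires synchronous replication; an RTO of zero means clients fail over without noticeable downtime.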
26. Mainframe Offloading and Replacement with Apache Kafka – @KaiWaehner - www.kai-waehner.de
IBM GDPS
● Synchronous and asynchronous mirroring
● Different RTO / RPO setups
● Built for resiliency and disaster recovery
● Supports multiple sites
● Uses concepts like XCF (Cross-system Coupling Facility), Parallel Sysplex, Disk Mirroring, HyperSwap, etc.
● Independent of transaction manager (e.g. CICS, IMS) or database manager (e.g. DB2, IMS, VSAM)
https://www.ibm.com/it-infrastructure/z/technologies/gdps
https://ibmsystemsmag.com/IBM-Z/07/2019/resiliency-gdps-solutions
27. Mainframe Offloading and Replacement with Apache Kafka – @KaiWaehner - www.kai-waehner.de
Kafka Clusters can Stretch over Regions
Multi-Region Cluster (Only available in Confluent Platform)
• Zero Downtime + Zero Data Loss (RPO=0 and RTO=0), e.g. stretched over US East + Mid + West
• Automate Disaster Recovery
• Sync or Async Replication per Topic
• Offset Preserving
• Automated Client Failover without Custom Code
28. Mainframe Offloading and Replacement with Apache Kafka – @KaiWaehner - www.kai-waehner.de
Example of a Multi-Region Cluster in a Bank
Large FinServ Customer
[Figure: ‘Payment’ topics replicated synchronously, ‘Log’ and ‘Location’ topics replicated asynchronously]
● ‘Payment’ transactions enter from us-east and us-west with fully synchronous replication
● ‘Log’ and ‘Location’ information in the same cluster use async - optimized for latency
● Automated disaster recovery (zero downtime, zero data loss)
Result: Clearing time from ‘deposit’ to ‘available’ goes from 5 days to 5 seconds (including security checks)
(Only available in Confluent Platform)
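The trade-off between synchronous and asynchronous replication per topic can be sketched with a toy latency model. This is not Confluent Platform configuration; the function name, the topic-to-mode mapping, and all latency numbers are hypothetical and only illustrate why the slide calls asynchronous replication "optimized for latency".

```python
def ack_latency_ms(local_write_ms, remote_round_trips_ms, synchronous):
    """Time until a producer write is acknowledged.

    Synchronous: wait for the slowest remote region as well.
    Asynchronous: acknowledge after the local write; remote copies catch up later.
    """
    if synchronous and remote_round_trips_ms:
        return local_write_ms + max(remote_round_trips_ms)
    return local_write_ms


# Hypothetical per-topic choice, mirroring the slide (True = synchronous)
TOPIC_IS_SYNC = {"payment": True, "log": False, "location": False}

# Hypothetical latencies: 2 ms local write, 70 ms round trip to the remote region
latencies = {
    topic: ack_latency_ms(2, [70], synchronous=is_sync)
    for topic, is_sync in TOPIC_IS_SYNC.items()
}
```

The synchronous topic pays the cross-region round trip on every write but cannot lose acknowledged data when a region fails; the asynchronous topics acknowledge faster and accept that the most recent writes may not yet be replicated.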
29. Mainframe Offloading and Replacement with Apache Kafka – @KaiWaehner - www.kai-waehner.de
Global Event Streaming
• Aggregate Small Footprint Edge Deployments with Replication (Aggregation)
• Simplify Disaster Recovery Operations with Multi-Region Clusters with RPO=0 and RTO=0
• Stream Data Globally with Replication and Cluster Linking
30. Mainframe Offloading and Replacement with Apache Kafka – @KaiWaehner - www.kai-waehner.de
Agenda
1. Mainframe Status Quo and Challenges
2. Use Cases for Event Streaming
3. Apache Kafka as Cloud-native and Hybrid Infrastructure
4. Mainframe Integration, Offloading and Replacement
5. Case Study
31. Mainframe Offloading and Replacement with Apache Kafka – @KaiWaehner - www.kai-waehner.de
What to do to stay competitive and be innovative?
Big Bang Replacements Usually Fail!
32. Mainframe Offloading and Replacement with Apache Kafka – @KaiWaehner - www.kai-waehner.de
Journey from Mainframe to Hybrid* and Cloud
PHASE 1: Mainframe Offloading
PHASE 2: Hybrid Replication
PHASE 3: Mainframe Replacement
* with or without the mainframe
33. Mainframe Offloading and Replacement with Apache Kafka – @KaiWaehner - www.kai-waehner.de
Mainframe Offloading
[Figure: the legacy app holds the complex business logic and pushes its changes once into the streams of real-time events (database change, microservices events, SaaS data, customer experiences). Modern App 1 and Modern App 2 write and read continuously. Access to the mainframe is labeled "MIPS / MSU"; reads from the event stream are labeled "No MIPS / MSU".]
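The "push changes once, read continuously" idea can be expressed as simple arithmetic. The sketch below counts mainframe data accesses under two assumptions that are not from the slides: every direct read costs mainframe capacity, and after offloading each change is pushed to the event stream exactly once. The function name and all numbers are hypothetical.

```python
def mainframe_accesses(changes, reads_per_consumer, offloaded):
    """Count data accesses that consume MIPS / MSU on the mainframe."""
    if offloaded:
        # Each change is pushed to the event stream once;
        # consumers then read from the stream, not from the mainframe.
        return changes
    # Without offloading, every consumer read hits the mainframe directly.
    return sum(reads_per_consumer)


# Hypothetical daily workload: 1,000 changes, two modern apps reading heavily
reads = [50_000, 200_000]
before = mainframe_accesses(1_000, reads, offloaded=False)
after = mainframe_accesses(1_000, reads, offloaded=True)
saved = before - after
```

In this model the saving grows with every additional consumer, because new readers add load to the event streaming platform rather than to the mainframe.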
34. Mainframe Offloading and Replacement with Apache Kafka – @KaiWaehner - www.kai-waehner.de
Mainframe Replacement
[Figure: the legacy app holds the complex business logic and pushes its changes once into the streams of real-time events (database change, microservices events, SaaS data, customer experiences). Modern App 1 and Modern App 2 write and read continuously. Access to the mainframe is labeled "MIPS / MSU"; reads from the event stream are labeled "No MIPS / MSU".]
35. Mainframe Offloading and Replacement with Apache Kafka – @KaiWaehner - www.kai-waehner.de
Mainframe Replacement (without Writing New Apps)
https://learnworthy.net/could-java-be-the-next-cobol/
https://medium.com/@FranzRoses/the-enterprise-journey-to-decompose-the-cobol-banking-core-into-java-the-developer-perspective-2e8a53bb528e
https://bs2manuals.ts.fujitsu.com/psBEANCONNECTV65en/beanconnect-user-guide-user-guide-13821/cobol2java-bc-ug-499/mapping-cobol-data-types-to-java-classes-bc-ug-500
What about the 5% of (complex) code that cannot be migrated automatically? Showstopper?
36. Mainframe Offloading and Replacement with Apache Kafka – @KaiWaehner - www.kai-waehner.de
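Integration between Kafka and Mainframes
[Figure: Orders, Customers, Payments, and Stock services connected to Kafka through WebSockets / SSE, JMS, ESB, REST, Java, Connect, RPC, MQ, C++, and file interfaces; the link to the mainframe is marked "???"]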
37. Mainframe Offloading and Replacement with Apache Kafka – @KaiWaehner - www.kai-waehner.de
Change Data Capture (CDC)
Transaction log-based CDC pushes data changes (insert, update, delete) from the mainframe
to Kafka in real time.
+ Real time push updates to Kafka
+ Eliminate disruptive full loads, i.e. minimize production impact
+ Reduce MIPS consumption
+ Integrate with any mainframe technology
(DB2 z/OS, VSAM, IMS/DB, CICS, etc.)
+ Full support
- High licensing costs
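To make the CDC idea concrete, here is a minimal sketch of what a consumer on the Kafka side might do with the change events. The event layout (`op`, `key`, `after`) is a hypothetical simplification for illustration; real CDC tools each define their own message format.

```python
from typing import Any, Dict, List

# Hypothetical, simplified change-event layout (an assumption, not a real CDC format):
#   {"op": "insert" | "update" | "delete", "key": <primary key>, "after": <row or None>}


def apply_change(replica: Dict[str, Dict[str, Any]], event: Dict[str, Any]) -> None:
    """Apply one change event to an in-memory replica keyed by primary key."""
    op = event["op"]
    key = event["key"]
    if op in ("insert", "update"):
        # Inserts and updates both carry the full new row state.
        replica[key] = event["after"]
    elif op == "delete":
        # Deleting a missing key is a no-op, so replayed events stay harmless.
        replica.pop(key, None)
    else:
        raise ValueError(f"unknown operation: {op}")


# Example stream of changes as they might arrive from the mainframe log.
events: List[Dict[str, Any]] = [
    {"op": "insert", "key": "acct-1", "after": {"balance": 100}},
    {"op": "update", "key": "acct-1", "after": {"balance": 68}},
    {"op": "insert", "key": "acct-2", "after": {"balance": 5}},
    {"op": "delete", "key": "acct-2", "after": None},
]

replica: Dict[str, Dict[str, Any]] = {}
for event in events:
    apply_change(replica, event)
```

Because only the changes are shipped, the replica stays current without repeated full loads against the mainframe database.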
38. Mainframe Offloading and Replacement with Apache Kafka – @KaiWaehner - www.kai-waehner.de
Integration Options between Kafka and Mainframe
• IBM InfoSphere Data Replication (IIDR) Change Data Capture (CDC) solution for Mainframe
• 3rd Party commercial CDC solution (e.g. Attunity, DBS-H, B.O.S. Software, …)
• Open-source CDC solution (e.g. Debezium), but you still need an IIDR license (the same
challenge as with Oracle and GoldenGate CDC)
• DB2 SQL Integration: Create interface tables + Kafka Connect + JDBC connector
• IBM MQ interface + Kafka Connect’s IBM MQ connector
• VSAM File Integration (e.g. B.O.S, Luminex, Syncsort, Qlik Replicate, …)
• Confluent REST Proxy and HTTP(S) calls from the mainframe
• Kafka Client APIs on the mainframe
Evaluate the performance / scalability / feature set / cost
of the tools and the footprint on the mainframe!
Don’t underestimate politics…
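As an illustration of the REST Proxy option, the sketch below builds (but does not send) an HTTP produce request in the shape used by the Confluent REST Proxy v2 API. The host name, topic, and record contents are assumptions for the example, and a real mainframe program would issue the equivalent call from its own language rather than Python.

```python
import json
import urllib.request
from typing import Any, List, Tuple


def build_produce_request(
    base_url: str, topic: str, records: List[Tuple[Any, Any]]
) -> urllib.request.Request:
    """Build a POST request that produces JSON records to a topic via REST Proxy (v2)."""
    body = json.dumps(
        {"records": [{"key": key, "value": value} for key, value in records]}
    ).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/topics/{topic}",
        data=body,
        method="POST",
        headers={
            # Content type for JSON-embedded records in the v2 API.
            "Content-Type": "application/vnd.kafka.json.v2+json",
            "Accept": "application/vnd.kafka.v2+json",
        },
    )


# Hypothetical endpoint and topic, for illustration only.
request = build_produce_request(
    "http://rest-proxy.example.com:8082",
    "payments",
    [("acct-1", {"date": "1/27/2017", "amount": 4.56})],
)
# Sending would be: urllib.request.urlopen(request)
```

The appeal of this option is that the mainframe only needs an HTTP(S) client; the trade-off is an extra hop compared with a native Kafka client.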
39. Mainframe Offloading and Replacement with Apache Kafka – @KaiWaehner - www.kai-waehner.de
Agenda
1. Mainframe Status Quo and Challenges
2. Use Cases for Event Streaming
3. Apache Kafka as Cloud-native and Hybrid Infrastructure
4. Mainframe Integration, Offloading and Replacement
5. Case Study
40. Mainframe Offloading and Replacement with Apache Kafka – @KaiWaehner - www.kai-waehner.de
Journey from Mainframe to Hybrid and Cloud
PHASE 1: Cloud Adoption
PHASE 2: Hybrid Cloud
PHASE 3: Cloud-First Development
https://www.accenture.com/_acnmedia/pdf-70/accenture-moving-to-the-cloud-strategy-for-banks-in-north-america.pdf
Case Study - Bank CEO
“This is the last 5-year $20M IBM contract.
Get rid of the mainframe!”
41. Mainframe Offloading and Replacement with Apache Kafka – @KaiWaehner - www.kai-waehner.de
Strangler Design Pattern
https://paulhammant.com/2013/07/14/legacy-application-strangulation-case-studies/
https://martinfowler.com/bliki/StranglerFigApplication.html
“The most important reason to consider a strangler fig application over a cut-over rewrite is reduced risk.”
Martin Fowler
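One way to picture the strangler pattern in code is a routing facade that sends each operation either to the legacy system or to its replacement, moving operations over one at a time. This is a minimal sketch; the class and operation names are hypothetical and not taken from the talk.

```python
from typing import Any, Callable, Set

Handler = Callable[[str, Any], Any]


class StranglerFacade:
    """Route each operation to the modern system once migrated, else to legacy."""

    def __init__(self, legacy: Handler, modern: Handler) -> None:
        self._legacy = legacy
        self._modern = modern
        self._migrated: Set[str] = set()

    def migrate(self, operation: str) -> None:
        # Flip a single operation over to the modern implementation.
        self._migrated.add(operation)

    def handle(self, operation: str, payload: Any) -> Any:
        target = self._modern if operation in self._migrated else self._legacy
        return target(operation, payload)


# Stand-ins for the real systems (hypothetical).
def legacy_core(operation: str, payload: Any) -> str:
    return f"mainframe handled {operation}"


def modern_core(operation: str, payload: Any) -> str:
    return f"microservice handled {operation}"


facade = StranglerFacade(legacy_core, modern_core)
before = facade.handle("get_balance", {"account": "acct-1"})
facade.migrate("get_balance")
after = facade.handle("get_balance", {"account": "acct-1"})
```

Callers keep talking to the facade throughout, which is what keeps the risk of each individual migration step small.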
42. Mainframe Offloading and Replacement with Apache Kafka – @KaiWaehner - www.kai-waehner.de
Year 0: Direct Communication between Mainframe and App
Application
1) Direct Legacy Mainframe Communication to App
Date Amount
1/27/2017 $4.56
1/22/2017 $32.14
Core Banking ‘1970’
(Mainframe)
43. Mainframe Offloading and Replacement with Apache Kafka – @KaiWaehner - www.kai-waehner.de
Year 1: Kafka for Decoupling between Mainframe and App
Application
1) Direct Legacy Mainframe Communication to App
2) Kafka for Decoupling between Mainframe and App
Date Amount
1/27/2017 $4.56
1/22/2017 $32.14
Core Banking ‘1970’
(Mainframe)
Mainframe Integration
- Change Data Capture (IIDR)
- Kafka Connect (JMS, MQ, JDBC)
- REST Proxy
- Kafka Client
- 3rd Party CDC Tool
44. Mainframe Offloading and Replacement with Apache Kafka – @KaiWaehner - www.kai-waehner.de
Year 2 to 4: New Projects and Applications
Application
Microservices
Agile, Lightweight
(but Scalable, Robust)
Applications
Big Data Project
(Elastic, Spark,
AWS Services, …)
1) Direct Legacy Mainframe Communication to App
2) Kafka for Decoupling between Mainframe and App
3) New Projects and Applications
External
Solution
Date Amount
1/27/2017 $4.56
1/22/2017 $32.14
Core Banking ‘1970’
(Mainframe)
Mainframe Integration
- Change Data Capture (IIDR)
- Kafka Connect (JMS, MQ, JDBC)
- REST Proxy
- Kafka Client
- 3rd Party CDC Tool
45. Mainframe Offloading and Replacement with Apache Kafka – @KaiWaehner - www.kai-waehner.de
Year 5: Mainframe Replacement
Application
Microservices
Agile, Lightweight
(but Scalable, Robust)
Applications
Big Data Project
(Elastic, Spark,
AWS Services, …)
1) Direct Legacy Mainframe Communication to App
2) Kafka for Decoupling between Mainframe and App
3) New Projects and Applications
4) Mainframe Replacement
External
Solution
Core Banking ‘2020’
(Modern Technology)
Date Amount
1/27/2017 $4.56
1/22/2017 $32.14
46. Mainframe Offloading and Replacement with Apache Kafka – @KaiWaehner - www.kai-waehner.de
What about Transactions?
47. Mainframe Offloading and Replacement with Apache Kafka – @KaiWaehner - www.kai-waehner.de
IBM Mainframe Database and Transaction Managers
IMS
• Hierarchical Database
• Transaction Manager
• Supports Cobol, Assembler, PL/1, Java
• IMS Connect for Integration with WebSphere MQ, SOAP, …
DB2
• Relational Database
CICS
• Transaction Manager
• Database “Lite” (VSAM Datasets)
• Integration and Application Programming Capabilities similar to IMS,
but much easier to use
• Advanced Features like Transaction Prioritization
The Heart
of your
Business
App
48. Mainframe Offloading and Replacement with Apache Kafka – @KaiWaehner - www.kai-waehner.de
“Transactions” in Apache Kafka
Exactly-Once Semantics (EOS)
available since Kafka 0.11:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-98+-+Exactly+Once+Delivery+and+Transactional+Messaging
https://www.confluent.io/kafka-summit-london18/dont-repeat-yourself-introducing-exactly-once-semantics-in-apache-kafka/
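A minimal sketch of how a transactional write might look from a client: all records in a batch are committed together or not at all. The method names follow the transactional producer API of the `confluent_kafka` Python client (built on librdkafka); the helper itself, the topic, and the simplified error handling are assumptions for illustration.

```python
from typing import Any, Iterable, Tuple


def send_atomically(producer: Any, topic: str, records: Iterable[Tuple[str, str]]) -> None:
    """Produce all records inside one Kafka transaction, aborting on any failure.

    `producer` is expected to behave like a confluent_kafka.Producer that was
    created with a `transactional.id` and on which init_transactions() has
    already been called once.
    """
    producer.begin_transaction()
    try:
        for key, value in records:
            producer.produce(topic, key=key, value=value)
        # Consumers using isolation.level=read_committed see the records only after this.
        producer.commit_transaction()
    except Exception:
        # Simplified handling: discard the whole batch and let the caller decide.
        producer.abort_transaction()
        raise


# Usage with a real client (not executed here; broker address is hypothetical):
#   from confluent_kafka import Producer
#   producer = Producer({"bootstrap.servers": "localhost:9092",
#                        "transactional.id": "payments-writer-1"})
#   producer.init_transactions()
#   send_atomically(producer, "payments", [("acct-1", "4.56"), ("acct-1", "32.14")])
```

Note that this covers atomic writes within Kafka only; it is not a distributed transaction spanning Kafka and the mainframe's own transaction manager.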
49. Mainframe Offloading and Replacement with Apache Kafka – @KaiWaehner - www.kai-waehner.de
Bi-Directional Integration with Referential Integrity
Java
App
Python
App
Mainframe
Transactions
Bi-Directional Integration
Secured Referential Integrity
End-to-End “Transactions”
Low Latency
Database
change
Microservices
events
SaaS
data
Customer
experiences
Streams of real time events
Kafka
Exactly-Once
Semantics
using librdkafka
IMS
DB
Cobol
App
Scope of the Middleware