Real-World Deployments of Data Streaming with Apache Kafka across the Healthcare Value Chain using open source and cloud-native technologies and serverless SaaS:
1) Legacy Modernization and Hybrid Cloud: Optum (UnitedHealth Group), Centene, Bayer
2) Streaming ETL (Bayer, Babylon Health)
3) Real-time Analytics (Cerner, Celmatix, CDC/Centers for Disease Control and Prevention)
4) Machine Learning and Data Science (Recursion, Humana)
5) Open API and Omnichannel (Care.com, Invitae)
Data Warehouse vs. Data Lake vs. Data Streaming – Friends, Enemies, Frenemies? – Kai Wähner
The concepts and architectures of a data warehouse, a data lake, and data streaming are complementary to solving business problems.
Unfortunately, the underlying technologies are often misunderstood, overused for monolithic and inflexible architectures, and pitched for the wrong use cases by vendors. Let's explore this dilemma in this presentation.
The slides cover technologies such as Apache Kafka, Apache Spark, Confluent, Databricks, Snowflake, Elasticsearch, AWS Redshift, GCP with Google BigQuery, and Azure Synapse.
How Apache Spark and Apache Hadoop are being used to keep banking regulators ... – DataWorks Summit
The global financial crisis showed that banks' traditional IT systems were ill-equipped to monitor and manage a daily-changing risk landscape. The sheer amount of data that needed to be crunched meant that many banks were days behind in calculating, understanding, and reporting their risk positions. Post-crisis, a regulatory review led to new legislation, BCBS 239 (Principles for Effective Risk Data Aggregation and Risk Reporting), which requires banks to meet more stringent timeliness requirements when aggregating and reporting their quickly-changing risk positions, or risk fines running into millions of dollars. To meet these new requirements, banks have been forced to rethink traditional IT architectures that cannot cope with the sheer volume of risk data, and are instead turning to Apache Hadoop and Apache Spark to build out the next generation of risk systems. In this talk you will discover how some of the leading banks in the world are leveraging Apache Hadoop and Apache Spark to meet the BCBS 239 regulation.
Speaker
Kunal Taneja
Data Streaming with Apache Kafka in the Defence and Cybersecurity Industry – Kai Wähner
Agenda:
1) Defence, Modern Warfare, and Cybersecurity in 202X
2) Data in Motion with Apache Kafka as Defence Backbone
3) Situational Awareness
4) Threat Intelligence
5) Forensics and AI / Machine Learning
6) Air-Gapped and Zero Trust Environments
7) SIEM / SOAR Modernization
Technologies discussed in the presentation include Apache Kafka, Kafka Streams, ksqlDB, Kafka Connect, Elasticsearch, Splunk, IBM QRadar, Zeek, NetFlow, PCAP, TensorFlow, AWS, Azure, GCP, Sigma, and Confluent Cloud.
Apache Kafka for Real-time Supply Chain in the Food and Retail Industry – Kai Wähner
Use Cases, Architectures, and Real-World Examples for data in motion and real-time event streaming powered by Apache Kafka across the supply chain and logistics. Case studies and deployments include Baader, Walmart, Migros, Albertsons, Domino's Pizza, Instacart, Grab, Royal Caribbean, and more.
Your Roadmap for An Enterprise Graph Strategy – Neo4j
Speaker: Michael Moore, Ph.D., Executive Director, Knowledge Graphs + AI, EY National Advisory
Abstract: Knowledge graphs have enormous potential for delivering superior customer experiences, advanced analytics and efficient data management.
Learn valuable tips from a leading practitioner on how to position, organize and implement your first enterprise graph project.
Apache Kafka in Transportation and Logistics – Kai Wähner
Event streaming with Apache Kafka in transportation and logistics.
Track & Trace, Real-time Locating System, Customer 360, Open API, and more…
Examples include Swiss Post, SBB, Deutsche Bahn, Hermes, Migros, Here Technologies, Otonomo, Lyft, Uber, Free Now, Lufthansa, Air France, Singapore Airlines, Amadeus Group, and more.
Kafka for Live Commerce to Transform the Retail and Shopping Metaverse – Kai Wähner
Live commerce combines instant purchasing of a featured product and audience participation.
This talk explores the need for real-time data streaming with Apache Kafka between applications to enable live commerce across online stores and brick & mortar stores across regions, countries, and continents in any retail business.
The discussion covers several building blocks of a live commerce enterprise architecture, including transactional data processing, omnichannel, natural language processing, augmented reality, edge computing, and more.
The Heart of the Data Mesh Beats in Real-Time with Apache Kafka – Kai Wähner
If there were a buzzword of the hour, it would certainly be "data mesh"! This new architectural paradigm unlocks analytic data at scale and enables rapid access to an ever-growing number of distributed domain datasets for various usage scenarios.
As such, the data mesh addresses the most common weaknesses of the traditional centralized data lake or data platform architecture. And the heart of a data mesh infrastructure must be real-time, decoupled, reliable, and scalable.
This presentation explores how Apache Kafka, as an open and scalable decentralized real-time platform, can be the basis of a data mesh infrastructure and, complemented by many other data platforms like a data warehouse, data lake, and lakehouse, solve real business problems.
There is no silver bullet or single technology, product, or cloud service for implementing a data mesh. The key outcome of a data mesh architecture is the ability to build data products with the right tool for the job.
A good data mesh combines data streaming technology like Apache Kafka or Confluent Cloud with cloud-native data warehouse and data lake architectures from Snowflake, Databricks, Google BigQuery, et al.
Kafka for Real-Time Replication between Edge and Hybrid Cloud – Kai Wähner
Not all workloads allow cloud computing. Low latency, cybersecurity, and cost-efficiency require a suitable combination of edge computing and cloud integration.
This session explores architectures and design patterns for software and hardware considerations to deploy hybrid data streaming with Apache Kafka anywhere. A live demo shows data synchronization from the edge to the public cloud across continents with Kafka on Hivecell and Confluent Cloud.
Mainframe Integration, Offloading and Replacement with Apache Kafka | Kai Wae... – HostedbyConfluent
Legacy migration is a journey. Mainframes cannot be replaced in a single project. A big bang will fail. This has to be planned long-term.
Mainframe offloading and replacement with Apache Kafka and its ecosystem can be used to keep a more modern data store in real-time sync with the mainframe, while at the same time persisting the event data on the bus to enable microservices, and deliver the data to other systems such as data warehouses and search indexes.
This session walks through the different steps some companies have already gone through. Technical options like Change Data Capture (CDC), MQ, and third-party tools for mainframe integration, offloading, and replacement are explored.
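The CDC-based offloading pattern described above can be sketched in miniature. The following is an illustrative Python sketch, not a real Kafka Connect connector: it shows how change events captured from a legacy system might be applied to keep a modern key-value store in sync. The event shape (`op`, `key`, `value` fields) is an assumption for illustration.

```python
# Hypothetical sketch: applying CDC-style change events (as they might arrive
# from a mainframe offloading pipeline via Kafka) to a modern key-value store.
# The event field names are illustrative assumptions, not a real connector API.

def apply_change_event(store: dict, event: dict) -> dict:
    """Apply a change event with op 'c' (create), 'u' (update), or 'd' (delete)."""
    op, key = event["op"], event["key"]
    if op in ("c", "u"):
        store[key] = event["value"]
    elif op == "d":
        store.pop(key, None)
    return store

store = {}
events = [
    {"op": "c", "key": "cust-1", "value": {"name": "Alice", "balance": 100}},
    {"op": "u", "key": "cust-1", "value": {"name": "Alice", "balance": 250}},
    {"op": "c", "key": "cust-2", "value": {"name": "Bob", "balance": 50}},
    {"op": "d", "key": "cust-2", "value": None},
]
for e in events:
    apply_change_event(store, e)

print(store)  # cust-1 reflects the latest update; cust-2 was deleted
```

In a real deployment, a CDC tool would emit such events into Kafka, and downstream consumers (search indexes, data warehouses, microservices) would each apply them independently.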
Serverless Kafka and Spark in a Multi-Cloud Lakehouse Architecture – Kai Wähner
Apache Kafka in conjunction with Apache Spark became the de facto standard for processing and analyzing data. Both frameworks are open, flexible, and scalable.
Unfortunately, operating these frameworks is a challenge for many teams. Ideally, teams can use serverless SaaS offerings to focus on business logic. However, hybrid and multi-cloud scenarios require a cloud-native platform that provides automated and elastic tooling to reduce the operations burden.
This session explores different architectures to build serverless Apache Kafka and Apache Spark multi-cloud architectures across regions and continents.
We start from the analytics perspective of a data lake and explore its relation to a fully integrated data streaming layer with Kafka to build a modern Data Lakehouse.
Real-world use cases show the joint value and explore the benefit of the "delta lake" integration.
A Health Catalyst Overview: Learn How a Data First Strategy Can Drive Increas... – Health Catalyst
Without the pressure of a one-on-one demo, you can join a crowd of peers to ‘kick the tires’ as you listen to Jared Crapo, a sought-after healthcare strategist, talk about what a data-first strategy is and its strategic components. A data-first strategy employs a data operating system: a breakthrough engineering approach that combines the features of data warehousing, clinical data repositories, and health information exchanges in a single, common-sense technology platform that turns data into actionable assets used for all types of outcomes improvements.
Lest you worry about too much ‘pie in the sky’ strategy talk with few results to show, Sam Turman, Senior Solution Architect, will provide tangible solution demonstrations that are driving material results. Even if you aren’t in the market for Health Catalyst solutions and services, you will be able to:
Think with more clarity through your approach to overcoming the current market challenges.
Reconsider the strategy you are employing to build cross-organizational awareness and support, putting a data-first plan at the center of your strategy.
Define action you can take today to assess your gaps, understand your options, and accelerate your progress to drive outcomes improvements.
Jared is the kind of thinker many pay good money to hear, and it is our good fortune to have 60 minutes with him to think deeply about moving healthcare forward, one patient at a time. We hope you can join us.
Building a Real-Time Analytics Application with Apache Pulsar and Apache Pinot – Altinity Ltd
While the demands for real-time analytics are growing by leaps and bounds, analytics software must rely on streaming platforms to ingest high volumes of data traveling at lightning speed down the pipeline. We will take a look at two powerful open-source Apache platforms, Pulsar and Pinot, that work hand-in-hand to deliver analytical results that bring great value to your systems.
Presenters: Mary Grygleski - Streaming Developer Advocate &
Mark Needham - Developer Relations Engineer at StarTree
Note: This webinar will be recorded and later posted on our Webinar page (https://altinity.com/webinarspage/) or Altinity official Youtube channel (https://www.youtube.com/@Altinity).
When NOT to Use Apache Kafka? – Kai Wähner
Apache Kafka is the de facto standard for data streaming to process data in motion. With its significant adoption growth across all industries, I get a very valid question every week: When NOT to use Apache Kafka? What limitations does the event streaming platform have? When does Kafka simply not provide the needed capabilities? How do you qualify Kafka out as not the right tool for the job?
This session explores the DOs and DON'Ts. Separate sections explain when to use Kafka, when NOT to use Kafka, and when to MAYBE use Kafka.
No matter if you think about open source Apache Kafka, a cloud service like Confluent Cloud, or another technology using the Kafka protocol like Redpanda or Pulsar, check out this slide deck.
A detailed article about this topic:
https://www.kai-waehner.de/blog/2022/01/04/when-not-to-use-apache-kafka/
Apache Kafka and API Management / API Gateway – Friends, Enemies or Frenemies... – HostedbyConfluent
Microservices became the new black in enterprise architectures. APIs provide functions to other applications or end users. Even if your architecture uses a pattern other than microservices, such as SOA (Service-Oriented Architecture) or client-server communication, APIs are used between the different applications and end users.
Apache Kafka plays a key role in modern microservice architectures to build open, scalable, flexible and decoupled real time applications. API Management complements Kafka by providing a way to implement and govern the full life cycle of the APIs.
This session explores how event streaming with Apache Kafka and API Management (including API Gateway and Service Mesh technologies) complement and compete with each other, depending on the use case and the point of view of the project team. The session concludes by exploring the vision of event streaming APIs instead of RPC calls.
Developing Custom Transformations in Kafka Connect to Minimize Data Redund... – HostedbyConfluent
Compacted topics grow over time and often utilize high-performance, low-latency, and relatively expensive storage solutions. Reducing duplicated data plays a critical role in the size of compacted topics: with less data on the topics, the Kafka cluster consumes less disk space, which in turn leads to lower operating cost.
In this use-case-driven talk, we are going to demonstrate how our team at UnitedHealth Group leveraged existing transformers to extract data from the message metadata in the topic, as well as how we developed our own custom transformers to minimize the amount of duplicated data in each message in the topic.
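To illustrate the idea (this is a hedged sketch, not UnitedHealth Group's actual transformer), a deduplicating transform can drop fields whose values are unchanged since the last message for the same key, so that compacted topics store only deltas. The class name and record shape below are assumptions.

```python
# Illustrative sketch of a Kafka Connect-style single message transform that
# strips fields unchanged since the previous record for the same key.
# This is a stand-in for a custom SMT, not a real Kafka Connect API.

class DedupTransform:
    def __init__(self):
        self._last_seen = {}  # key -> last full record observed

    def apply(self, key, record: dict) -> dict:
        previous = self._last_seen.get(key, {})
        # Keep only fields that are new or whose values changed.
        slimmed = {f: v for f, v in record.items() if previous.get(f) != v}
        self._last_seen[key] = record
        return slimmed

t = DedupTransform()
print(t.apply("member-1", {"plan": "gold", "state": "MN"}))  # first message: all fields kept
print(t.apply("member-1", {"plan": "gold", "state": "CA"}))  # only the changed field survives
```

Note that a real transform of this kind must consider consumer-side reconstruction: downstream readers need the last full record (or topic compaction semantics) to rebuild complete state from deltas.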
Introduction to Apache NiFi (DWS DC 2019) – Timothy Spann
A quick introduction to Apache NiFi and its ecosystem, plus a hands-on demo of using processors, examining provenance, ingesting REST feeds, XML, cameras, and files, running TensorFlow and Apache MXNet, integrating with Spark and Kafka, and storing to HDFS, HBase, Phoenix, Hive, and S3.
Data modelling is considered a staple in the world of data management. The skill of the data modeler and their knowledge of the business plays a large role in successful Enterprise Information Management across many organizations. Data modeling requires formal accountability, attention to metadata and getting the business heavily involved in data requirement development. These are all traits of solid Data Governance programs.
Join Bob Seiner and a special guest modeler extraordinaire in this month’s installment of Real-World Data Governance to discuss data modeling as a form of data governance. Learn how to use the skillfulness of the data modeler to advance data-as-an-asset and governance agendas while conveying the importance and value of both disciplines.
In this webinar Bob and a special guest will talk about:
•Data Modeling as Art or Science
•Role of Data Modeler in a Governance Program
•Data Modeler Skills as Governance Skills
•Modeling and Governance Best Practices
•Leveraging the Model as a Governance Artifact
The Connected Consumer – Real-time Customer 360 – Capgemini
With Business Data Lake technologies based on EMC's Big Data portfolio, it becomes possible to move away from channel-specific analytics towards a 360-degree customer view.
This presentation shows how technologies like Spark, Hadoop, and Kafka help companies gain a real-time view of everything their customers do and make changes to customer touchpoints, whether mobile, web, in-store, direct marketing, or existing transactional systems.
Presented by Steve Jones, Vice President, Insights & Data, Capgemini at EMC World 2016
http://www.capgemini.com/emc
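The Customer 360 idea above can be reduced to a minimal sketch: folding events arriving from different channels into a single profile per customer. The event and profile fields here are assumptions for illustration, not Capgemini's actual data model.

```python
# Minimal sketch of a real-time Customer 360 view: events from multiple
# channels (web, store, mobile) are merged into one profile per customer.
from collections import defaultdict

profiles = defaultdict(lambda: {"channels": set(), "purchases": 0})

def update_profile(event: dict):
    """Fold one channel event into the customer's unified profile."""
    p = profiles[event["customer_id"]]
    p["channels"].add(event["channel"])
    if event.get("type") == "purchase":
        p["purchases"] += 1

for ev in [
    {"customer_id": "c1", "channel": "web", "type": "view"},
    {"customer_id": "c1", "channel": "store", "type": "purchase"},
    {"customer_id": "c1", "channel": "mobile", "type": "purchase"},
]:
    update_profile(ev)

print(profiles["c1"])  # one unified view across three channels
```

In a streaming deployment, the same fold would run continuously over a Kafka topic (e.g. via Kafka Streams or Spark Structured Streaming) rather than over an in-memory list.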
SingleStore & Kafka: Better Together to Power Modern Real-Time Data Architect... – HostedbyConfluent
To remain competitive, organizations need to democratize access to fast analytics, not only to gain real-time insights into their business but also to power smart apps that need to react in the moment. In this session, you will learn how Kafka and SingleStore enable a modern yet simple data architecture to analyze both fast-paced incoming data and large historical datasets. In particular, you will understand why SingleStore is well suited to process data streams coming from Kafka.
Data Ingest Self Service and Management Using NiFi and Kafka – DataWorks Summit
We’re feeling the growing pains of maintaining a large data platform. Last year we went from 50 to 150 unique data feeds by adding them all by hand. In this talk we will share the best practices developed to handle our 300% increase in feeds through self service. Self-service capabilities will increase your team's velocity and decrease your time to value and insight.
* Self-service data feed design and ingest
* Configuration management
* Automatic debugging
* Lightweight data governance
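A core building block of such self service is validating a user-submitted feed configuration before registration, instead of adding feeds by hand. The sketch below is a hypothetical illustration; the required fields and allowed source types are assumptions, not the talk's actual schema.

```python
# Hypothetical sketch: validating a self-service feed configuration before
# it is registered on the data platform. Field names are illustrative.

REQUIRED_FIELDS = {"feed_name", "source_type", "schedule", "owner"}

def validate_feed_config(config: dict) -> list:
    """Return a list of validation errors; an empty list means the feed can be registered."""
    errors = [f"missing required field: {f}"
              for f in sorted(REQUIRED_FIELDS - config.keys())]
    if config.get("source_type") not in (None, "file", "rest", "kafka"):
        errors.append(f"unsupported source_type: {config['source_type']}")
    return errors

print(validate_feed_config({"feed_name": "claims", "source_type": "rest",
                            "schedule": "hourly", "owner": "data-eng"}))  # []
print(validate_feed_config({"feed_name": "claims"}))  # three missing-field errors
```

Gating feed onboarding behind a check like this is what turns hand-added feeds into a scalable self-service workflow: bad configurations fail fast with actionable errors rather than breaking ingestion later.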
Apache Kafka as Data Hub for Crypto, NFT, Metaverse (Beyond the Buzz!) – Kai Wähner
Decentralized finance with crypto and NFTs is a huge topic these days. It becomes a powerful combination with the coming metaverse platforms across industries. This session explores the relationship between crypto technologies and modern enterprise architecture.
I discuss how data streaming and Apache Kafka help build innovative and scalable real-time applications for a future metaverse. Let's skip the buzz (and the NFT bubble) and instead review existing real-world deployments in the crypto and blockchain world powered by Kafka and its ecosystem.
Apache Kafka in the Public Sector (Government, National Security, Citizen Ser... – Kai Wähner
The Rise of Data in Motion in the Public Sector powered by event streaming with Apache Kafka.
Citizen Services:
- Health services, e.g. hospital modernization, track & trace, Covid distance control
- Public administration - reduce bureaucracy, data democratization across government departments
- eGovernment - Efficient and digital citizen engagement, e.g. personal ID application process
Smart City:
- Smart driving, parking, buildings, environment
- Waste management
- Open exchange, e.g. mobility services (1st and 3rd party)
Energy:
- Smart grid and utilities infrastructure (energy distribution, smart home, smart meters, smart water, etc.)
National Security:
- Law enforcement, surveillance, police/interior security data exchange
- Defense and military (border control, intelligent soldier)
- Cybersecurity for situational awareness and threat intelligence
Apache Kafka in the Healthcare Industry – Kai Wähner
The Rise of Data in Motion in the Healthcare Industry: use cases, architectures, and examples powered by Apache Kafka.
Use Cases for Data in Motion in the Healthcare Industry:
- Know Your Patient (= “Customer 360”)
- Operations (Healthcare 4.0 including Drug R&D, Patient Care, etc.)
- IT Perspective (Cybersecurity, Mainframe Offload, Hybrid Cloud, Streaming ETL, etc.)
Real-world examples include Covid-19 Electronic Lab Reporting, Cerner, Optum, Centene, Humana, Invitae, Bayer, Celmatix, Care.com.
Machine Learning with Apache Kafka in Pharma and Life Sciences – Kai Wähner
Blog Post:
https://www.kai-waehner.de/apache-kafka-event-streaming-pharmaceuticals-pharma-life-sciences-use-cases-architecture
Video Recording:
https://youtu.be/t2IH0brwGTg
AI/machine learning and the Apache Kafka ecosystem are a great combination for training, deploying, and monitoring analytic models at scale in real time. They show up in more and more projects but often still feel like buzzwords and hype confined to science projects.
See how to connect the dots!
--How are Kafka and Machine Learning related?
--How can they be combined to productionize analytic models in mission-critical and scalable real-time applications?
--We will discuss a step-by-step approach to build a scalable and reliable real-time infrastructure for drug discovery doing data integration, feature engineering, image processing, model scoring and processing orchestration.
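The step-by-step pipeline above (integrate an event, engineer features, score it with a model) can be sketched as follows. This is a hedged illustration, not the talk's actual infrastructure: the event fields are invented, and a simple weighted sum stands in for a real trained model served behind Kafka.

```python
# Illustrative sketch of a real-time scoring pipeline: data integration ->
# feature engineering -> model scoring. The "model" is a stub weighted sum
# standing in for a real TensorFlow model deployed behind Kafka.

def extract_features(event: dict) -> list:
    # Feature engineering: normalize assumed raw measurements.
    return [event["intensity"] / 100.0, float(event["replicates"])]

def score(features: list) -> float:
    # Stub model: a fixed weighted sum in place of a trained analytic model.
    weights = [0.8, 0.2]
    return sum(w * f for w, f in zip(weights, features))

# One incoming event, as it might arrive from a data integration layer.
event = {"compound_id": "cmp-42", "intensity": 75, "replicates": 3}
features = extract_features(event)
print(round(score(features), 2))  # prints 1.2
```

In production, each stage would be a decoupled consumer/producer on Kafka topics, which is what makes the pipeline scalable and independently deployable.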
Use Cases:
R&D Engineering
Sales & Marketing
Manufacturing & Quality Assurance
Supply Chain
Product Monitoring & After Sales Support
VoC (Voice of Customer)
Single View Customer
Yield/Quality Optimization
Improved Drug Yield
Proactive Service Scheduling
Testing & Simulation
Drug Diversion
Process/Quality Monitoring
Inventory & Supply Chain Optimization
Proactive Service Offers
Patent Research and Analytics
Personalized Offers / Ads
EDW Offload
Supply Chain Network Design/Risk Management
Product Predictive Maintenance
Clinical Trials
Customer Segmentation
Smart Products
Serialization & e-Pedigree
Product Usage Tracking
GTM
Global Facilities
Inventory and Logistics Visibility
Warranty & Recall Management
The Heart of the Data Mesh Beats in Real-Time with Apache KafkaKai Wähner
If there were a buzzword of the hour, it would certainly be "data mesh"! This new architectural paradigm unlocks analytic data at scale and enables rapid access to an ever-growing number of distributed domain datasets for various usage scenarios.
As such, the data mesh addresses the most common weaknesses of the traditional centralized data lake or data platform architecture. And the heart of a data mesh infrastructure must be real-time, decoupled, reliable, and scalable.
This presentation explores how Apache Kafka, as an open and scalable decentralized real-time platform, can be the basis of a data mesh infrastructure and - complemented by many other data platforms like a data warehouse, data lake, and lakehouse - solve real business problems.
There is no silver bullet or single technology/product/cloud service for implementing a data mesh. The key outcome of a data mesh architecture is the ability to build data products; with the right tool for the job.
A good data mesh combines data streaming technology like Apache Kafka or Confluent Cloud with cloud-native data warehouse and data lake architectures from Snowflake, Databricks, Google BigQuery, et al.
Kafka for Real-Time Replication between Edge and Hybrid CloudKai Wähner
Not all workloads allow cloud computing. Low latency, cybersecurity, and cost-efficiency require a suitable combination of edge computing and cloud integration.
This session explores architectures and design patterns for software and hardware considerations to deploy hybrid data streaming with Apache Kafka anywhere. A live demo shows data synchronization from the edge to the public cloud across continents with Kafka on Hivecell and Confluent Cloud.
Mainframe Integration, Offloading and Replacement with Apache Kafka | Kai Wae...HostedbyConfluent
Legacy migration is a journey. Mainframes cannot be replaced in a single project. A big bang will fail. This has to be planned long-term.
Mainframe offloading and replacement with Apache Kafka and its ecosystem can be used to keep a more modern data store in real-time sync with the mainframe, while at the same time persisting the event data on the bus to enable microservices, and deliver the data to other systems such as data warehouses and search indexes.
This session walks through the different steps some companies are already gone through. Technical options like Change Data Capture (CDC), MQ, and third-party tools for mainframe integration, offloading and replacement are explored.
Serverless Kafka and Spark in a Multi-Cloud Lakehouse ArchitectureKai Wähner
Apache Kafka in conjunction with Apache Spark became the de facto standard for processing and analyzing data. Both frameworks are open, flexible, and scalable.
Unfortunately, the latter makes operations a challenge for many teams. Ideally, teams can use serverless SaaS offerings to focus on business logic. However, hybrid and multi-cloud scenarios require a cloud-native platform that provides automated and elastic tooling to reduce the operations burden.
This session explores different architectures to build serverless Apache Kafka and Apache Spark multi-cloud architectures across regions and continents.
We start from the analytics perspective of a data lake and explore its relation to a fully integrated data streaming layer with Kafka to build a modern data Data Lakehouse.
Real-world use cases show the joint value and explore the benefit of the "delta lake" integration.
A Health Catalyst Overview: Learn How a Data First Strategy Can Drive Increas...Health Catalyst
Without the pressure of a one-on-one demo, you can join a crowd of peers to ‘kick the tires’ if you will, as you listen to Jared Crapo—a sought after healthcare strategist—talk about what a data-first strategy is, and the strategic components to a data-first strategy employing a data operating system, a breakthrough engineering approach that combines the features of data warehousing, clinical data repositories, and health information exchanges in a single, common-sense technology platform that turns data into actionable assets used for all types of outcomes improvements.
Lest you worry about too much ‘pie in the sky’ strategy talk with few results to show, Sam Turman, Senior Solution Architect, will provide tangible solution demonstrations that are driving material results. Even if you aren’t in the market for Health Catalyst solutions and services, you will be able to:
Think with more clarity through your approach to overcoming the current market challenges.
Reconsider the strategy you are employing to build cross-organizational awareness and support to put a data-first plan at the center of your plan.
Define action you can take today to assess your gaps, understand your options, and accelerate your progress to drive outcomes improvements.
Join us and you won’t be disappointed. Jared is one of those types of thinkers that many pay big money to listen to and it is our fortune to have 60 minutes with him to think deeply about moving healthcare forward, one patient at a time. We hope you can join us.
Building a Real-Time Analytics Application with Apache Pulsar and Apache PinotAltinity Ltd
Building a Real-Time Analytics Application with
Apache Pulsar and Apache Pinot
While the demands for real-time analytics are growing in leaps and bounds, the analytics software must rely on streaming platforms for ingesting high volumes of data that's traveling in lightning speed down the pipeline. We will take a look at 2 powerful open source Apache platforms: Pulsar and Pinot, that work hand-in-hand together to deliver the analytical results which bring great value to your systems.
Presenters: Mary Grygleski - Streaming Developer Advocate &
Mark Needham - Developer Relations Engineer at StarTree
Note: This webinar will be recorded and later posted on our Webinar page (https://altinity.com/webinarspage/) or Altinity official Youtube channel (https://www.youtube.com/@Altinity).
Apache Kafka is the de facto standard for data streaming to process data in motion. With its significant adoption growth across all industries, I get a very valid question every week: When NOT to use Apache Kafka? What limitations does the event streaming platform have? When does Kafka simply not provide the needed capabilities? How to qualify Kafka out as it is not the right tool for the job?
This session explores the DOs and DONTs. Separate sections explain when to use Kafka, when NOT to use Kafka, and when to MAYBE use Kafka.
No matter if you think about open source Apache Kafka, a cloud service like Confluent Cloud, or another technology using the Kafka protocol like Redpanda or Pulsar, check out this slide deck.
A detailed article about this topic:
https://www.kai-waehner.de/blog/2022/01/04/when-not-to-use-apache-kafka/
Apache Kafka and API Management / API Gateway – Friends, Enemies or Frenemies...HostedbyConfluent
Microservices became the new black in enterprise architectures. APIs provide functions to other applications or end users. Even if your architecture uses another pattern than microservices, like SOA (Service-Oriented Architecture) or Client-Server communication, APIs are used between the different applications and end users.
Apache Kafka plays a key role in modern microservice architectures to build open, scalable, flexible and decoupled real time applications. API Management complements Kafka by providing a way to implement and govern the full life cycle of the APIs.
This session explores how event streaming with Apache Kafka and API Management (including API Gateway and Service Mesh technologies) complement and compete with each other depending on the use case and point of view of the project team. The session concludes exploring the vision of event streaming APIs instead of RPC calls.
Developing custom transformation in the Kafka connect to minimize data redund...HostedbyConfluent
Compacted topics grow over time and often sit on high-performance, low-latency, and relatively expensive storage. Reducing duplicated data plays a critical role in the size of compacted topics: with less data on the topics, the Kafka cluster consumes less disk space, which in turn lowers operating cost.
In this use-case-driven talk, we demonstrate how our team at UnitedHealth Group leveraged existing transformers to extract data from the message metadata in the topic, as well as how we developed custom transformers to minimize the amount of duplicated data in each message.
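The dedupe idea described above can be sketched in a few lines. This is a hypothetical Python analogue, not UnitedHealth Group's actual transformer: real Kafka Connect Single Message Transforms implement a Java `Transformation` interface, and the record layout and field names here are purely illustrative.

```python
# Illustrative sketch of a Kafka Connect-style single message transform that
# drops payload fields whose values merely duplicate the record's metadata,
# shrinking each message before it lands on a compacted topic.

def dedupe_transform(record, redundant_fields=("topic", "partition", "key")):
    """Return a copy of the record whose payload omits metadata duplicates."""
    meta = record["metadata"]
    payload = {
        field: value
        for field, value in record["payload"].items()
        if not (field in redundant_fields and meta.get(field) == value)
    }
    return {"metadata": meta, "payload": payload}

record = {
    "metadata": {"topic": "members", "partition": 3, "key": "m-42"},
    "payload": {"key": "m-42", "topic": "members", "name": "Jane", "plan": "gold"},
}
slim = dedupe_transform(record)
# slim["payload"] keeps only the fields not already present in the metadata
```

In a real deployment the same pruning would run per record inside the Connect worker, so every copy written to the compacted topic is smaller.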
Introduction to Apache NiFi dws19 DWS - DC 2019Timothy Spann
A quick introduction to Apache NiFi and its ecosystem, plus a hands-on demo of using processors, examining provenance, ingesting REST feeds, XML, cameras, and files, running TensorFlow and Apache MXNet, integrating with Spark and Kafka, and storing to HDFS, HBase, Phoenix, Hive, and S3.
Data modelling is considered a staple in the world of data management. The skill of the data modeler and their knowledge of the business plays a large role in successful Enterprise Information Management across many organizations. Data modeling requires formal accountability, attention to metadata and getting the business heavily involved in data requirement development. These are all traits of solid Data Governance programs.
Join Bob Seiner and a special guest modeler extraordinaire in this month’s installment of Real-World Data Governance to discuss data modeling as a form of data governance. Learn how to use the skillfulness of the data modeler to advance data-as-an-asset and governance agendas while conveying the importance and value of both disciplines.
In this webinar Bob and a special guest will talk about:
•Data Modeling as Art or Science
•Role of Data Modeler in a Governance Program
•Data Modeler Skills as Governance Skills
•Modeling and Governance Best Practices
•Leveraging the Model as a Governance Artifact
The Connected Consumer – Real-time Customer 360Capgemini
With Business Data Lake technologies based on EMC’s Big Data portfolio, it becomes possible to move away from channel-specific analytics towards a 360-degree customer view.
This presentation will show how technologies like Spark, Hadoop, and Kafka help companies gain a real-time view of everything their customers do and make changes to customer touch points whether mobile, web, in-store, direct marketing or existing transactional systems.
Presented by Steve Jones, Vice President, Insights & Data, Capgemini at EMC World 2016
http://www.capgemini.com/emc
SingleStore & Kafka: Better Together to Power Modern Real-Time Data Architect...HostedbyConfluent
To remain competitive, organizations need to democratize access to fast analytics, not only to gain real-time insights on their business but also to power smart apps that need to react in the moment. In this session, you will learn how Kafka and SingleStore enable a modern, yet simple data architecture to analyze both fast-paced incoming data and large historical datasets. In particular, you will understand why SingleStore is well suited to process data streams coming from Kafka.
Data Ingest Self Service and Management using Nifi and KafkaDataWorks Summit
We’re feeling the growing pains of maintaining a large data platform. Last year we went from 50 to 150 unique data feeds by adding them all by hand. In this talk we will share the best practices we developed to handle our 300% increase in feeds through self-service. Self-service capabilities will increase your team's velocity and decrease your time to value and insight.
* Self-service data feed design and ingest
* Configuration management
* Automatic debugging
* Lightweight data governance
Apache Kafka as Data Hub for Crypto, NFT, Metaverse (Beyond the Buzz!)Kai Wähner
Decentralized finance with crypto and NFTs is a huge topic these days. It becomes a powerful combination with the coming metaverse platforms across industries. This session explores the relationship between crypto technologies and modern enterprise architecture.
I discuss how data streaming and Apache Kafka help build innovation and scalable real-time applications of a future metaverse. Let's skip the buzz (and NFT bubble) and instead review existing real-world deployments in the crypto and blockchain world powered by Kafka and its ecosystem.
Apache Kafka in the Public Sector (Government, National Security, Citizen Ser...Kai Wähner
The Rise of Data in Motion in the Public Sector powered by event streaming with Apache Kafka.
Citizen Services:
- Health services, e.g. hospital modernization, track & trace - Covid distance control
- Public administration - reduce bureaucracy, data democratization across government departments
- eGovernment - Efficient and digital citizen engagement, e.g. personal ID application process
Smart City:
- Smart driving, parking, buildings, environment
- Waste management
- Open exchange – e.g. mobility services (1st and 3rd party)
Energy:
- Smart grid and utilities infrastructure (energy distribution, smart home, smart meters, smart water, etc.)
National Security:
- Law enforcement, surveillance, police/interior security data exchange
- Defense and military (border control, intelligent soldier)
- Cybersecurity for situational awareness and threat intelligence
The Rise of Data in Motion in the Healthcare Industry - Use Cases, Architectures and Examples powered by Apache Kafka.
Use Cases for Data in Motion in the Healthcare Industry:
- Know Your Patient (= “Customer 360”)
- Operations (Healthcare 4.0 including Drug R&D, Patient Care, etc.)
- IT Perspective (Cybersecurity, Mainframe Offload, Hybrid Cloud, Streaming ETL, etc)
Real-world examples include Covid-19 Electronic Lab Reporting, Cerner, Optum, Centene, Humana, Invitae, Bayer, Celmatix, Care.com.
Machine Learning with Apache Kafka in Pharma and Life SciencesKai Wähner
Blog Post:
https://www.kai-waehner.de/apache-kafka-event-streaming-pharmaceuticals-pharma-life-sciences-use-cases-architecture
Video Recording:
https://youtu.be/t2IH0brwGTg
AI/Machine learning and the Apache Kafka ecosystem are a great combination for training, deploying, and monitoring analytic models at scale in real time. They are showing up more and more in projects, but still feel like buzzwords and hype reserved for science projects.
See how to connect the dots!
--How are Kafka and Machine Learning related?
--How can they be combined to productionize analytic models in mission-critical and scalable real-time applications?
--We will discuss a step-by-step approach to build a scalable and reliable real-time infrastructure for drug discovery doing data integration, feature engineering, image processing, model scoring and processing orchestration.
Use Cases:
R&D Engineering
Sales & Marketing
Manufacturing & Quality Assurance
Supply Chain
Product Monitoring & After Sales Support
VoC (Voice of Customer)
Single View Customer
Yield/Quality Optimization
Improved Drug Yield
Proactive Service Scheduling
Testing & Simulation
Drug Diversion
Process/Quality Monitoring
Inventory & Supply Chain Optimization
Proactive Service Offers
Patent Research and Analytics
Personalized Offers / Ads
EDW Offload
Supply Chain Network Design/Risk Management
Product Predictive Maintenance
Clinical Trials
Customer Segmentation
Smart Products
Serialization & e-Pedigree
Product Usage Tracking
GTM
Global Facilities
Inventory and Logistics Visibility
Warranty & Recall Management
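The step-by-step streaming pipeline described above (data integration, feature engineering, model scoring, processing orchestration) can be sketched as a consume-transform-produce loop. This is an illustrative simulation, not the actual drug-discovery pipeline: the broker is replaced by plain lists and the model by a trivial stand-in function, so only the shape of the flow is shown.

```python
# Sketch of a streaming ML scoring pipeline: consume an event, engineer
# features, score with a model, and produce the result downstream.
# Event fields ("compound", "readings") and the scoring function are
# hypothetical placeholders.

def extract_features(event):
    # Toy feature engineering: normalize raw assay readings by their peak.
    readings = event["readings"]
    peak = max(readings)
    return [r / peak for r in readings]

def score(features):
    # Stand-in for a trained model (e.g. one served next to Kafka Streams).
    return sum(features) / len(features)

def run_pipeline(input_topic):
    output_topic = []
    for event in input_topic:                 # consume
        features = extract_features(event)    # feature engineering
        output_topic.append(                  # produce the scored event
            {"compound": event["compound"], "score": round(score(features), 3)}
        )
    return output_topic

scored = run_pipeline([
    {"compound": "cpd-1", "readings": [2.0, 4.0, 4.0]},
    {"compound": "cpd-2", "readings": [1.0, 1.0, 2.0]},
])
```

In a real deployment the input and output lists would be Kafka topics, and each stage could scale independently as a consumer group.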
Apache Kafka for Smart Grid, Utilities and Energy ProductionKai Wähner
The energy industry is changing from system-centric to smaller-scale and distributed smart grids and microgrids. A smart grid requires a flexible, scalable, elastic, and reliable cloud-native infrastructure for real-time data integration and processing. This post explores use cases, architectures, and real-world deployments of event streaming with Apache Kafka in the energy industry to implement smart grids and real-time end-to-end integration.
Blog Post with more details:
https://www.kai-waehner.de/apache-kafka-smart-grid-energy-production-edge-iot-oil-gas-green-renewable-sensor-analytics
Apache Kafka in Financial Services - Use Cases and ArchitecturesKai Wähner
The Rise of Event Streaming in Financial Services - Use Cases, Architectures and Examples powered by Apache Kafka.
The New FinServ Enterprise Reality: Every company is a software company. Innovate OR be Disrupted. Learn how Event Streaming with Apache Kafka and its ecosystem help...
More details:
https://www.kai-waehner.de/apache-kafka-financial-services-industry-banking-finserv-payment-fraud-middleware-messaging-transactions
https://www.kai-waehner.de/blog/2020/04/15/apache-kafka-machine-learning-banking-finance-industry/
https://www.kai-waehner.de/blog/2020/04/24/mainframe-offloading-replacement-apache-kafka-connect-ibm-db2-mq-cdc-cobol/
Apache Kafka for Predictive Maintenance in Industrial IoT / Industry 4.0Kai Wähner
The manufacturing industry is moving away from just selling machinery, devices, and other hardware. Software and services increase revenue and margins. Equipment-as-a-Service (EaaS) even outsources the maintenance to the vendor.
This paradigm shift is only possible with reliable and scalable real-time data processing leveraging an event streaming platform such as Apache Kafka. This talk explores how Kafka-native Condition Monitoring and Predictive Maintenance help with this innovation.
More details:
https://www.kai-waehner.de/blog/2021/10/25/apache-kafka-condition-monitoring-predictive-maintenance-industrial-iot-digital-twin/
Video recording:
https://youtu.be/tfOuN5KeI9w
Supply Chain Optimization with Apache KafkaKai Wähner
Supply Chain optimization leveraging Event Streaming with Apache Kafka. See real-world use cases and architectures from Walmart, BMW, Porsche, and other enterprises to improve the Supply Chain Management (SCM) processes. Automation, robustness, flexibility, real-time, decoupling, data integration, and hybrid deployments...
Video recording: https://youtu.be/dUkgungBmPs
Blog post: https://www.kai-waehner.de/apache-kafka-supply-chain-management-scm-optimization-scor-six-sigma-real-time
Resilient Real-time Data Streaming across the Edge and Hybrid Cloud with Apac...Kai Wähner
Hybrid cloud architectures are the new black for most companies. A cloud-first strategy is evident for many new enterprise architectures, but some use cases require resiliency across edge sites and multiple cloud regions. Data streaming with the Apache Kafka ecosystem is a perfect technology for building resilient and hybrid real-time applications at any scale. This talk explores different architectures and their trade-offs for transactional and analytical workloads. Real-world examples include financial services, retail, and the automotive industry.
Video recording:
https://qconlondon.com/london2022/presentation/resilient-real-time-data-streaming-across-the-edge-and-hybrid-cloud
Building a Secure, Tamper-Proof & Scalable Blockchain on Top of Apache Kafka ...confluent
Apache Kafka is an open source event streaming platform. It is often used to complement or even replace existing middleware to integrate applications and build microservice architectures. Apache Kafka is already used in various projects in almost every bigger company today. Understood, battle-tested, highly scalable, reliable, real-time.
Blockchain is a different story. This technology is often in the news, especially in relation to cryptocurrencies like Bitcoin. But what is the added value for software architectures? Is blockchain just hype that adds complexity? Or will everybody use it in the future, like a web browser or mobile app today? And how does it relate to an integration architecture and an event streaming platform?
This session explores use cases for blockchains and discusses different alternatives such as Hyperledger, Ethereum and a Kafka-native tamper-proof blockchain implementation. Different architectures are discussed to understand when blockchain really adds value and how it can be combined with the Apache Kafka ecosystem to integrate blockchain with the rest of the enterprise architecture to build a highly scalable and reliable event streaming infrastructure.
Speakers:
Kai Waehner, Technology Evangelist, Confluent
Stephen Reed, CTO, Co-Founder, AiB
The rise of data in motion in the insurance industry is visible across all lines of business, including life, healthcare, travel, vehicle, and others. Apache Kafka changes how enterprises rethink data. This blog post explores use cases and architectures for event streaming. Real-world examples from Generali, Centene, Humana, and Tesla show innovative insurance-related data integration and stream processing in real-time.
The Fourth Industrial Revolution (also known as Industry 4.0) is the ongoing automation of traditional manufacturing and industrial practices, using modern smart technology.
Event Streaming with Apache Kafka plays a massive role in processing massive volumes of data in real-time in a reliable, scalable, and flexible way integrating with various legacy and modern data sources and sinks.
In this presentation, I want to give you an overview of existing use cases for event streaming technology in a connected world across supply chains, industries and customer experiences that come along with these interdisciplinary data intersections:
• The Automotive Industry (and it’s not only Connected Cars)
• Mobility Services across verticals (transportation, logistics, travel industry, retailing, …)
• Smart Cities (including citizen health services, communication infrastructure, …)
These industries and sectors do not have fundamentally new characteristics and requirements. They need data integration, data correlation, and real decoupling, to name a few, but are now facing massively increased volumes of data.
Real-time messaging solutions have existed for many years. Hundreds of platforms exist for data integration (including ETL and ESB tooling and specific IIoT platforms). Proprietary monoliths have monitored plants, telco networks, and other infrastructure in real time for decades. But now, Kafka combines all the above characteristics in an open, scalable, and flexible infrastructure to operate mission-critical workloads at scale in real time, and it is taking over the world of connecting data.
Kai Waehner [Confluent] | Real-Time Streaming Analytics with 100,000 Cars Usi...InfluxData
Kai Waehner [Confluent] | Real-Time Streaming Analytics with 100,000 Cars Using MQTT, Kafka and InfluxDB 2.0 on Kubernetes | InfluxDays Virtual Experience London 2020
Apache Flink: Real-World Use Cases for Streaming AnalyticsSlim Baltagi
This face to face talk about Apache Flink in Sao Paulo, Brazil is the first event of its kind in Latin America! It explains how Apache Flink 1.0 announced on March 8th, 2016 by the Apache Software Foundation (link), marks a new era of Big Data analytics and in particular Real-Time streaming analytics. The talk maps Flink's capabilities to real-world use cases that span multiples verticals such as: Financial Services, Healthcare, Advertisement, Oil and Gas, Retail and Telecommunications.
In this talk, you learn more about:
1. What is Apache Flink Stack?
2. Batch vs. Streaming Analytics
3. Key Differentiators of Apache Flink for Streaming Analytics
4. Real-World Use Cases with Flink for Streaming Analytics
5. Who is using Flink?
6. Where do you go from here?
Apache Kafka® and Analytics in a Connected IoT Worldconfluent
Apache Kafka® and Analytics in a Connected IoT World, Kai Waehner, Sr. Solutions Engineer Advanced Technology Group, Confluent
https://www.meetup.com/Berlin-Apache-Kafka-Meetup-by-Confluent/events/273166575/
Apache Kafka and API Management / API Gateway – Friends, Enemies or Frenemies?Kai Wähner
Understand how event streaming with Kafka and Confluent complements tools and frameworks such as Kong, Mulesoft, Apigee, Envoy, Istio, Linkerd, Software AG, TIBCO Mashery, IBM, Axway, etc.
A Streaming API Data Exchange provides streaming replication between business units and companies. API Management with REST/HTTP is not appropriate for streaming data.
With businesses today needing to store a lot more data and for longer periods of time, while also empowering their customers to analyze it in real time and in unexpected ways, analytical workloads are fast exceeding what traditional databases and data warehouses are capable of. In this session, MariaDB's Shane Johnson describes modern analytics requirements and employs real-world use cases to show how MariaDB ColumnStore is helping customers meet these new requirements.
Real-Life Use Cases & Architectures for Event Streaming with Apache KafkaKai Wähner
Streaming all over the World: Real-Life Use Cases & Architectures for Event Streaming with Apache Kafka.
Learn about various case studies for event streaming with Apache Kafka across industries. The talk explores architectures for real-world deployments from Audi, BMW, Disney, Generali, Paypal, Tesla, Unity, Walmart, William Hill, and more. Use cases include fraud detection, mainframe offloading, predictive maintenance, cybersecurity, edge computing, track&trace, live betting, and much more.
Apache Kafka in the Airline, Aviation and Travel IndustryKai Wähner
Aviation and travel are notoriously vulnerable to social, economic, and political events, as well as the ever-changing expectations of consumers. Coronavirus is just a piece of the challenge.
This presentation explores use cases, architectures, and references for Apache Kafka as event streaming technology in the aviation industry, including airline, airports, global distribution systems (GDS), aircraft manufacturers, and more.
Examples include Lufthansa, Singapore Airlines, Air France Hop, Amadeus, and more. Technologies include Kafka, Kafka Connect, Kafka Streams, ksqlDB, Machine Learning, Cloud, and more.
Apache Kafka vs. Cloud-native iPaaS Integration Platform MiddlewareKai Wähner
Enterprise integration is more challenging than ever before. The IT evolution requires the integration of more and more technologies. Applications are deployed across the edge, hybrid, and multi-cloud architectures. Traditional middleware such as MQ, ETL, ESB does not scale well enough or only processes data in batch instead of real-time.
This presentation explores why Apache Kafka is the new black for integration projects, how Kafka fits into the discussion around cloud-native iPaaS (Integration Platform as a Service) solutions, and why event streaming is a new software category.
A concrete real-world example shows the difference between event streaming and a traditional integration platform or cloud-native iPaaS.
Video Recording of this presentation:
https://www.youtube.com/watch?v=I8yZwKg_IJc&t=2842s
Blog post about this topic:
https://www.kai-waehner.de/blog/2021/11/03/apache-kafka-cloud-native-ipaas-versus-mq-etl-esb-middleware/
Apache Kafka Landscape for Automotive and ManufacturingKai Wähner
Today, in 2022, Apache Kafka is the central nervous system of many applications in various areas related to the automotive and manufacturing industry for processing analytical and transactional data in motion across edge, hybrid, and multi-cloud deployments.
This presentation explores the automotive event streaming landscape, including connected vehicles, smart manufacturing, supply chain optimization, aftersales, mobility services, and innovative new business models.
Afterwards, many real-world examples are shown from companies such as Audi, BMW, Porsche, Tesla, Uber, Grab, and FREENOW.
More detail in the blog post:
https://www.kai-waehner.de/blog/2022/01/12/apache-kafka-landscape-for-automotive-and-manufacturing/
Kappa vs Lambda Architectures and Technology ComparisonKai Wähner
Real-time data beats slow data. That’s true for almost every use case. Nevertheless, enterprise architects build new infrastructures with the Lambda architecture that includes separate batch and real-time layers.
This video explores why a single real-time pipeline, called Kappa architecture, is the better fit for many enterprise architectures. Real-world examples from companies such as Disney, Shopify, Uber, and Twitter explore the benefits of Kappa but also show how batch processing fits into this discussion positively without the need for a Lambda architecture.
The main focus of the discussion is on Apache Kafka (and its ecosystem) as the de facto standard for event streaming to process data in motion (the key concept of Kappa), but the video also compares various technologies and vendors such as Confluent, Cloudera, IBM/Red Hat, Apache Flink, Apache Pulsar, AWS Kinesis, Amazon MSK, Azure Event Hubs, Google Pub/Sub, and more.
Video recording of this presentation:
https://youtu.be/j7D29eyysDw
Further reading:
https://www.kai-waehner.de/blog/2021/09/23/real-time-kappa-architecture-mainstream-replacing-batch-lambda/
https://www.kai-waehner.de/blog/2021/04/20/comparison-open-source-apache-kafka-vs-confluent-cloudera-red-hat-amazon-msk-cloud/
https://www.kai-waehner.de/blog/2021/05/09/kafka-api-de-facto-standard-event-streaming-like-amazon-s3-object-storage/
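The core Kappa idea above can be shown in miniature: one stream-processing codebase serves both live traffic and historical reprocessing, simply by replaying the durable log from the beginning instead of maintaining a separate Lambda-style batch layer. This is a toy in-memory simulation, with a plain list standing in for a Kafka topic and hypothetical event fields.

```python
# Minimal Kappa-architecture illustration: the same pipeline code handles
# live events and full replays of the log.

log = []  # stands in for a durable, replayable Kafka topic

def append(event):
    log.append(event)

def process(events):
    """The single pipeline: running totals per user."""
    totals = {}
    for e in events:
        totals[e["user"]] = totals.get(e["user"], 0) + e["amount"]
    return totals

# Live: events arrive and are appended to the log as they are processed.
for e in [{"user": "a", "amount": 10},
          {"user": "b", "amount": 5},
          {"user": "a", "amount": 7}]:
    append(e)

live_view = process(log)

# "Batch" reprocessing (e.g. after a bug fix or model change) is just a
# replay of the same log through the same code, not a second codebase.
replayed_view = process(log)
```

Because replay and live processing share one code path, there is no batch/real-time drift to reconcile, which is the main operational argument for Kappa over Lambda.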
The Top 5 Apache Kafka Use Cases and Architectures in 2022Kai Wähner
I see the following topics coming up more regularly in conversations with customers, prospects, and the broader Kafka community across the globe:
Kappa Architecture: Kappa goes mainstream to replace Lambda and Batch pipelines (that does not mean that there is no batch processing anymore). Examples: Kafka-powered Kappa architectures from Uber, Disney, Shopify, and Twitter.
Hyper-personalized Omnichannel: Retail and customer communication across online and offline channels becomes the new black, including context-specific upselling, recommendations, and location-based services. Examples: Omnichannel Retail and Customer 360 in Real-Time with Apache Kafka.
Multi-Cloud Deployments: Business units and IT infrastructures span across regions, continents, and cloud providers. Linking clusters for bi-directional replication of data in real-time becomes crucial for many business models. Examples: Global Kafka deployments.
Edge Analytics: Low latency requirements, cost efficiency, or security requirements enforce the deployment of (some) event streaming use cases at the far edge (i.e., outside a data center), for instance, for predictive maintenance and quality assurance on the shop floor level in smart factories. Examples: Edge analytics with Kafka.
Real-time Cybersecurity: Situational awareness and threat intelligence need to process massive data in real-time to defend against cyberattacks successfully. The many successful ransomware attacks across the globe in 2021 were a warning for most CIOs. Examples: Cybersecurity for situational awareness and threat intelligence in real-time.
Event Streaming CTO Roundtable for Cloud-native Kafka ArchitecturesKai Wähner
Technical thought leadership presentation to discuss how leading organizations move to real-time architecture to support business growth and enhance customer experience. This is a forum to discuss use cases with your peers to understand how other digital-native companies are utilizing data in motion to drive competitive advantage.
Agenda:
- Data in Motion with Event Streaming and Apache Kafka
- Streaming ETL Pipelines
- IT Modernisation and Hybrid Multi-Cloud
- Customer Experience and Customer 360
- IoT and Big Data Processing
- Machine Learning and Analytics
Telco 4.0 - Payment and FinServ Integration for Data in Motion with 5G and Ap...Kai Wähner
The Era of Telco 4.0: Embracing Digital Transformation with Data in Motion. Learn about Payment and FinServ Integration for Data in Motion with 5G and Apache Kafka.
1) The rise of Telco 4.0 and the future forward
2) Data in Motion in the Telco industry
3) Real-world Fintech and Payment examples powered by Data in Motion
Apache Kafka for Cybersecurity and SIEM / SOAR ModernizationKai Wähner
Data in Motion powered by the Apache Kafka ecosystem for Situational Awareness, Threat Detection, Forensics, Zero Trust Zones and Air-Gapped Environments.
Agenda:
1) Cybersecurity in 202X
2) Data in Motion as Cybersecurity Backbone
3) Situational Awareness
4) Threat Intelligence
5) Forensics
6) Air-Gapped and Zero Trust Environments
7) SIEM / SOAR Modernization
More details in the "Kafka for Cybersecurity" blog series:
https://www.kai-waehner.de/blog/2021/07/02/kafka-cybersecurity-siem-soar-part-1-of-6-data-in-motion-as-backbone/
Apache Kafka in the Automotive Industry (Connected Vehicles, Manufacturing 4....Kai Wähner
Connect all the things: An intro to event streaming for the automotive industry including connected cars, mobility services, and manufacturing / industrial IoT.
Video recording of this talk: https://www.youtube.com/watch?v=rBfBFrcO-WU
The Fourth Industrial Revolution (also known as Industry 4.0) is the ongoing automation of traditional manufacturing and industrial practices, using modern smart technology. Event Streaming with Apache Kafka plays a massive role in processing massive volumes of data in real-time in a reliable, scalable, and flexible way, integrating with various legacy and modern data sources and sinks.
Other industries—retail, healthcare, government, financial services, energy, and more—also lean into Industry 4.0 technology to take advantage of IoT devices, sensors, smart machines, robotics, and connected data. The variety of these deployments goes from disconnected edge use cases across hybrid architectures to global multi-cloud deployments.
In this presentation, I want to give you an overview of existing use cases for event streaming technology in a connected world across supply chains, industries and customer experiences that come along with these interdisciplinary data intersections:
- The Automotive Industry (and it’s not only Connected Cars)
- Mobility Services across verticals (transportation, logistics, travel industry, retailing, …)
- Smart Cities (including citizen health services, communication infrastructure, …)
Real-world examples include use cases from car makers such as Audi, BMW, Porsche, Tesla, plus many examples from mobility services such as Uber, Lyft, Here Technologies, and more.
Serverless Kafka on AWS as Part of a Cloud-native Data Lake ArchitectureKai Wähner
AWS Data Lake / Lake House + Confluent Cloud for Serverless Apache Kafka. Learn about use cases, architectures, and features.
Data must be continuously collected, processed, and reactively used in applications across the entire enterprise - some in real time, some in batch mode. In other words: As an enterprise becomes increasingly software-defined, it needs a data platform designed primarily for "data in motion" rather than "data at rest."
Apache Kafka is now mainstream when it comes to data in motion! The Kafka API has become the de facto standard for event-driven architectures and event streaming. Unfortunately, the cost of running it yourself is very often too expensive when you add factors like scaling, administration, support, security, creating connectors...and everything else that goes with it. Resources in enterprises are scarce: this applies to both the best team members and the budget.
The cloud - as we all know - offers the perfect solution to such challenges.
Most likely, fully-managed cloud services such as AWS S3, DynamoDB or Redshift are already in use. Now it is time to implement "fully-managed" for Kafka as well - with Confluent Cloud on AWS.
- Building a central integration layer that doesn't care where or how much data is coming from
- Implementing scalable data stream processing to gain real-time insights
- Leveraging fully managed connectors (like S3, Redshift, Kinesis, MongoDB Atlas & more) to quickly access data
Confluent Cloud in action? Let's show how ao.com made it happen!
IBM Cloud Pak for Integration with Confluent Platform powered by Apache KafkaKai Wähner
The Rise of Data in Motion powered by Event Streaming - Use Cases and Architecture for IBM Cloud Pak with Confluent Platform. Including screenshots of the live demo (integration between IBM and Kafka via Confluent Platform and Kafka Connect connectors).
Learn about the integration capabilities of IBM Cloud Pak for Integration, now with the industry’s leading event streaming platform from Confluent Platform powered by Apache Kafka.
Apache Kafka and MQTT - Overview, Comparison, Use Cases, ArchitecturesKai Wähner
Apache Kafka and MQTT are a perfect combination for many IoT use cases. This presentation covers the pros and cons of both technologies. Various use cases across industries, including connected vehicles, manufacturing, mobility services, and smart city are explored. The examples use different architectures, including lightweight edge scenarios, hybrid integrations, and serverless cloud solutions.
Blog series with more details here:
https://www.kai-waehner.de/blog/2021/03/15/apache-kafka-mqtt-sparkplug-iot-blog-series-part-1-of-5-overview-comparison/
Connected Vehicles and V2X with Apache KafkaKai Wähner
This session discusses use cases leveraging the Apache Kafka open source ecosystem as a streaming platform to process IoT data.
See use cases, architectural alternatives and a live demo of how devices connect to Kafka via MQTT. Learn how to analyze the IoT data either natively on Kafka with Kafka Streams/KSQL, or on an external big data cluster like Spark, Flink or Elastic leveraging Kafka Connect, and how to leverage TensorFlow for Machine Learning.
The focus is on connected cars / connected vehicles, V2X use cases, and mobility services.
A live demo shows how to build a cloud-native IoT infrastructure on Kubernetes to connect and process streaming data in real time from 100,000 cars to do predictive maintenance at scale.
Code for the live demo on Github:
https://github.com/kaiwaehner/hivemq-mqtt-tensorflow-kafka-realtime-iot-machine-learning-training-inference
Can and should Apache Kafka replace a database? How long can and should I store data in Kafka? How can I query and process data in Kafka? These are common questions that come up more and more. This session explains the idea behind databases and different features like storage, queries, transactions, and processing to evaluate when Kafka is a good fit and when it is not.
The discussion includes different Kafka-native add-ons like Tiered Storage for long-term, cost-efficient storage and ksqlDB as event streaming database. The relation and trade-offs between Kafka and other databases are explored to complement each other instead of thinking about a replacement. This includes different options for pull and push-based bi-directional integration.
Key takeaways:
- Kafka can store data forever in a durable and highly available manner
- Kafka has different options to query historical data
- Kafka-native add-ons like ksqlDB or Tiered Storage make Kafka more powerful than ever before to store and process data
- Kafka does not provide transactions, but exactly-once semantics
- Kafka is not a replacement for existing databases like MySQL, MongoDB or Elasticsearch
- Kafka and other databases complement each other; the right solution has to be selected for a problem
- Different options are available for bi-directional pull and push-based integration between Kafka and databases to complement each other
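One reason Kafka can back query use cases despite being a log: with log compaction, the broker keeps the latest record per key, so any consumer can rebuild a key/value view by replaying the topic from the beginning. A minimal in-memory sketch of that replay, with invented keys and payloads:

```python
# A compacted topic retains the latest record per key; a consumer rebuilds a
# key/value view ("table") by replaying the log from offset 0.
log = [
    ("patient-42", {"status": "admitted"}),
    ("patient-17", {"status": "admitted"}),
    ("patient-42", {"status": "discharged"}),
]

def materialize(log):
    """Replay the log into a point-in-time key/value view."""
    view = {}
    for key, value in log:
        view[key] = value  # later records overwrite earlier ones, like compaction
    return view

view = materialize(log)
print(view["patient-42"])  # {'status': 'discharged'}
```

This is exactly what ksqlDB does when it materializes a table from a topic, just with fault tolerance and distribution on top.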
Video Recording:
https://youtu.be/7KEkWbwefqQ
Blog post:
https://www.kai-waehner.de/blog/2020/03/12/can-apache-kafka-replace-database-acid-storage-transactions-sql-nosql-data-lake/
Accelerate Enterprise Software Engineering with Platformless – WSO2
Key takeaways:
Challenges of building platforms and the benefits of platformless.
Key principles of platformless, including API-first, cloud-native middleware, platform engineering, and developer experience.
How Choreo enables the platformless experience.
How key concepts like application architecture, domain-driven design, zero trust, and cell-based architecture are inherently a part of Choreo.
Demo of an end-to-end app built and deployed on Choreo.
First Steps with Globus Compute Multi-User Endpoints – Globus
In this presentation we will share our experiences around getting started with the Globus Compute multi-user endpoint. Working with the Pharmacology group at the University of Auckland, we have previously written an application using Globus Compute that can offload computationally expensive steps in the researcher's workflows, which they wish to manage from their familiar Windows environments, onto the NeSI (New Zealand eScience Infrastructure) cluster. Some of the challenges we have encountered were that each researcher had to set up and manage their own single-user globus compute endpoint and that the workloads had varying resource requirements (CPUs, memory and wall time) between different runs. We hope that the multi-user endpoint will help to address these challenges and share an update on our progress here.
Large Language Models and the End of Programming – Matt Welsh
Talk by Matt Welsh at Craft Conference 2024 on the impact that Large Language Models will have on the future of software development. In this talk, I discuss the ways in which LLMs will impact the software industry, from replacing human software developers with AI, to replacing conventional software with models that perform reasoning, computation, and problem-solving.
Designing for Privacy in Amazon Web Services – KrzysztofKkol1
Data privacy is one of the most critical issues that businesses face. This presentation shares insights on the principles and best practices for ensuring the resilience and security of your workload.
Drawing on a real-life project from the HR industry, the presentation demonstrates the various challenges: data protection, self-healing, business continuity, security, and transparency of data processing. This systematized approach allowed us to create a secure AWS cloud infrastructure that not only met strict compliance rules but also exceeded the client's expectations.
Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart... – Globus
The Earth System Grid Federation (ESGF) is a global network of data servers that archives and distributes the planet’s largest collection of Earth system model output for thousands of climate and environmental scientists worldwide. Many of these petabyte-scale data archives are located in proximity to large high-performance computing (HPC) or cloud computing resources, but the primary workflow for data users consists of transferring data, and applying computations on a different system. As a part of the ESGF 2.0 US project (funded by the United States Department of Energy Office of Science), we developed pre-defined data workflows, which can be run on-demand, capable of applying many data reduction and data analysis to the large ESGF data archives, transferring only the resultant analysis (ex. visualizations, smaller data files). In this talk, we will showcase a few of these workflows, highlighting how Globus Flows can be used for petabyte-scale climate analysis.
Strategies for Successful Data Migration Tools.pptx – varshanayak241
Data migration is a complex but essential task for organizations aiming to modernize their IT infrastructure and leverage new technologies. By understanding common challenges and implementing these strategies, businesses can achieve a successful migration with minimal disruption. Data migration tools like Ask On Data play a pivotal role in this journey, offering features that streamline the process, ensure data integrity, and maintain security. With the right approach and tools, organizations can turn the challenge of data migration into an opportunity for growth and innovation.
Your Digital Assistant.
Making a complex approach simple: a straightforward process saves time, with no more waiting to connect with the people who matter to you. Safety first is not a cliché – information is securely protected in cloud storage to prevent any third party from accessing data.
Would you rather make your visitors feel burdened by making them wait, or choose VizMan for a stress-free experience? VizMan is an automated visitor management system that works for any industry, including factories, societies, government institutes, and warehouses. It is a new-age, contactless way of logging information about visitors, employees, packages, and vehicles. As a digital logbook, VizMan eliminates the unnecessary use of paper and space, since there is no need for bundles of registers left to collect dust in a corner of a room. It records visitors' essential details, helps schedule meetings between visitors and employees, and assists in supervising employee attendance. With VizMan, visitors don't need to wait for hours in long queues; VizMan handles visitors with the value they deserve, because we know time is important to you.
Feasible Features
One Subscription, Four Modules – the Admin, Employee, Receptionist, and Gatekeeper modules ensure confidentiality and prevent data from being manipulated
User Friendly – can be easily used on Android, iOS, and Web Interface
Multiple Accessibility – Log in through any device from any place at any time
One app for all industries – a Visitor Management System that works for any organisation.
Stress-free Sign-up
Visitor is registered and checked-in by the Receptionist
Host gets a notification, where they opt to Approve the meeting
Host notifies the Receptionist of the end of the meeting
Visitor is checked-out by the Receptionist
Host enters notes and remarks of the meeting
Customizable Components
Scheduling Meetings – Host can invite visitors for meetings and also approve, reject and reschedule meetings
Single/Bulk invites – Invitations can be sent individually to a visitor or collectively to many visitors
VIP Visitors – Additional security of data for VIP visitors to avoid misuse of information
Courier Management – Keeps a check on deliveries like commodities being delivered in and out of establishments
Alerts & Notifications – Get notified on SMS, email, and application
Parking Management – Manage availability of parking space
Individual log-in – Every user has their own log-in id
Visitor/Meeting Analytics – Evaluate notes and remarks of the meeting stored in the system
Visitor Management System is a secure and user-friendly database manager that records, filters, and tracks the visitors to your organization.
"Secure Your Premises with VizMan (VMS) – Get It Now"
Quarkus Hidden and Forbidden Extensions – Max Andersen
Quarkus has a vast extension ecosystem and is known for its supersonic and subatomic feature set. Some of these features are not as well known, and some extensions are less talked about, but that does not make them less interesting – quite the opposite.
Come join this talk to see some tips and tricks for using Quarkus and some of the lesser known features, extensions and development techniques.
Code reviews are vital for ensuring good code quality. They serve as one of our last lines of defense against bugs and subpar code reaching production.
Yet, they often turn into annoying tasks riddled with frustration, hostility, unclear feedback and lack of standards. How can we improve this crucial process?
In this session we will cover:
- The Art of Effective Code Reviews
- Streamlining the Review Process
- Elevating Reviews with Automated Tools
By the end of this presentation, you'll know how to organize and improve your code review process.
Exploring Innovations in Data Repository Solutions - Insights from the U.S. G... – Globus
The U.S. Geological Survey (USGS) has made substantial investments in meeting evolving scientific, technical, and policy driven demands on storing, managing, and delivering data. As these demands continue to grow in complexity and scale, the USGS must continue to explore innovative solutions to improve its management, curation, sharing, delivering, and preservation approaches for large-scale research data. Supporting these needs, the USGS has partnered with the University of Chicago-Globus to research and develop advanced repository components and workflows leveraging its current investment in Globus. The primary outcome of this partnership includes the development of a prototype enterprise repository, driven by USGS Data Release requirements, through exploration and implementation of the entire suite of the Globus platform offerings, including Globus Flow, Globus Auth, Globus Transfer, and Globus Search. This presentation will provide insights into this research partnership, introduce the unique requirements and challenges being addressed and provide relevant project progress.
top nidhi software solution freedownload – vrstrong314
This presentation emphasizes the importance of data security and legal compliance for Nidhi companies in India. It highlights how online Nidhi software solutions, like Vector Nidhi Software, offer advanced features tailored to these needs. Key aspects include encryption, access controls, and audit trails to ensure data security. The software complies with regulatory guidelines from the MCA and RBI and adheres to Nidhi Rules, 2014. With customizable, user-friendly interfaces and real-time features, these Nidhi software solutions enhance efficiency, support growth, and provide exceptional member services. The presentation concludes with contact information for further inquiries.
Prosigns: Transforming Business with Tailored Technology Solutions – Prosigns
Unlocking Business Potential: Tailored Technology Solutions by Prosigns
Discover how Prosigns, a leading technology solutions provider, partners with businesses to drive innovation and success. Our presentation showcases our comprehensive range of services, including custom software development, web and mobile app development, AI & ML solutions, blockchain integration, DevOps services, and Microsoft Dynamics 365 support.
Custom Software Development: Prosigns specializes in creating bespoke software solutions that cater to your unique business needs. Our team of experts works closely with you to understand your requirements and deliver tailor-made software that enhances efficiency and drives growth.
Web and Mobile App Development: From responsive websites to intuitive mobile applications, Prosigns develops cutting-edge solutions that engage users and deliver seamless experiences across devices.
AI & ML Solutions: Harnessing the power of Artificial Intelligence and Machine Learning, Prosigns provides smart solutions that automate processes, provide valuable insights, and drive informed decision-making.
Blockchain Integration: Prosigns offers comprehensive blockchain solutions, including development, integration, and consulting services, enabling businesses to leverage blockchain technology for enhanced security, transparency, and efficiency.
DevOps Services: Prosigns' DevOps services streamline development and operations processes, ensuring faster and more reliable software delivery through automation and continuous integration.
Microsoft Dynamics 365 Support: Prosigns provides comprehensive support and maintenance services for Microsoft Dynamics 365, ensuring your system is always up-to-date, secure, and running smoothly.
Learn how our collaborative approach and dedication to excellence help businesses achieve their goals and stay ahead in today's digital landscape. From concept to deployment, Prosigns is your trusted partner for transforming ideas into reality and unlocking the full potential of your business.
Join us on a journey of innovation and growth. Let's partner for success with Prosigns.
How Recreation Management Software Can Streamline Your Operations.pptx – wottaspaceseo
Recreation management software streamlines operations by automating key tasks such as scheduling, registration, and payment processing, reducing manual workload and errors. It provides centralized management of facilities, classes, and events, ensuring efficient resource allocation and facility usage. The software offers user-friendly online portals for easy access to bookings and program information, enhancing customer experience. Real-time reporting and data analytics deliver insights into attendance and preferences, aiding in strategic decision-making. Additionally, effective communication tools keep participants and staff informed with timely updates. Overall, recreation management software enhances efficiency, improves service delivery, and boosts customer satisfaction.
Into the Box Keynote Day 2: Unveiling amazing updates and announcements for modern CFML developers! Get ready for exciting releases and updates on Ortus tools and products. Stay tuned for cutting-edge innovations designed to boost your productivity.
SOCRadar Research Team: Latest Activities of IntelBroker – SOCRadar
The European Union Agency for Law Enforcement Cooperation (Europol) has suffered an alleged data breach after a notorious threat actor claimed to have exfiltrated data from its systems. Infamous data leaker IntelBroker posted on the even more infamous BreachForums hacking forum, saying that Europol suffered a data breach this month.
The alleged breach affected Europol agencies CCSE, EC3, Europol Platform for Experts, Law Enforcement Forum, and SIRIUS. Infiltration of these entities can disrupt ongoing investigations and compromise sensitive intelligence shared among international law enforcement agencies.
However, this is neither the first nor the last activity of IntelBroker. We have compiled for you what happened in the last few days. To track such hacker activities on dark web sources like hacker forums, private Telegram channels, and other hidden platforms where cyber threats often originate, you can check SOCRadar’s Dark Web News.
Stay Informed on Threat Actors’ Activity on the Dark Web with SOCRadar!
We describe the deployment and use of Globus Compute for remote computation. This content is aimed at researchers who wish to compute on remote resources using a unified programming interface, as well as system administrators who will deploy and operate Globus Compute services on their research computing infrastructure.
Gamify Your Mind; The Secret Sauce to Delivering Success, Continuously Improv... – Shahin Sheidaei
Games are powerful teaching tools, fostering hands-on engagement and fun. But they require careful consideration to succeed. Join me to explore factors in running and selecting games, ensuring they serve as effective teaching tools. Learn to maintain focus on learning objectives while playing, and how to measure the ROI of gaming in education. Discover strategies for pitching gaming to leadership. This session offers insights, tips, and examples for coaches, team leads, and enterprise leaders seeking to teach from simple to complex concepts.
Check out the webinar slides to learn more about how XfilesPro transforms Salesforce document management by leveraging its world-class applications. For more details, please connect with sales@xfilespro.com
If you want to watch the on-demand webinar, please click here: https://www.xfilespro.com/webinars/salesforce-document-management-2-0-smarter-faster-better/
1. The Rise of Data in Motion in the Healthcare Industry
Use Cases, Architectures and Examples powered by Apache Kafka
Kai Waehner
Field CTO
kai.waehner@confluent.io
linkedin.com/in/kaiwaehner
@KaiWaehner
confluent.io
kai-waehner.de
2. Data in Motion with Apache Kafka in the Healthcare Industry – @KaiWaehner - www.kai-waehner.de
Healthcare includes many topics…
https://isilanguagesolutions.com/2019/02/25/what-are-the-differences-between-health-care-medical-life-science-and-pharmaceutical-translations/
3. Healthcare Value Chain
https://www.researchgate.net/publication/265654743_The_business_of_healthcare_innovation_in_the_Wharton_School_curriculum
4. The world is changing.
5. Covid Increases the Pressure
“Pandemic drives digital adoption forward 5 years in a span of 8 weeks.” – Digital adoption through COVID and beyond, McKinsey
6. Digital Health Ecosystem Disruption
Digital health ecosystems: A payer perspective – McKinsey article, August 2019
7. This transformation is happening everywhere
8. Doctors become Software
9. Medical Research becomes Software
10. Patient Data becomes Software
11. Security becomes Software
12. Healthcare Companies and Organizations
13. What enables this transformation?
14. Real-time Data beats Slow Data.
15. Real-time Data beats Slow Data.
Emergency: real-time sensor diagnostics, intelligent routing, ETA updates
Patient Care: diagnosis, treatment, connected health
Insurance: member enrollment, claim processing, omnichannel patient experience
Cybersecurity: threat detection, incident response, data privacy protection
16. This is a fundamental paradigm shift...
Cloud: infrastructure as code – the future of the datacenter
Event Streaming: data in motion as continuous streams of events – the future of data
17. What is Data in Motion?
18. An ‘event’ is what happens in your business
Transportation: the GPS in the ambulance sends the ETA to the hospital at 5:11am.
Insurance Claim: Alice filed a healthcare insurance claim Friday at 7:34pm.
Patient Interaction: the doctor updates Sabine’s case status at 9:10am.
Each of these events flows through Kafka.
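Events like these can be modeled as immutable, timestamped records appended to named topics. A minimal sketch, in which the topic names and payload fields are invented for illustration:

```python
import json

topics = {}  # topic name -> append-only list of serialized records

def publish(topic, timestamp, payload):
    """Append an immutable event record; events are facts, never updated in place."""
    record = {"timestamp": timestamp, **payload}
    topics.setdefault(topic, []).append(json.dumps(record))

publish("transportation", "05:11", {"vehicle": "ambulance-7", "eta_min": 4})
publish("insurance-claim", "19:34", {"claimant": "Alice", "type": "healthcare"})
publish("patient-interaction", "09:10", {"patient": "Sabine", "action": "case status updated"})

print(len(topics), sum(len(v) for v in topics.values()))  # 3 3
```

The key property, as in Kafka, is that the record is appended rather than overwritten: the history of what happened is itself the data.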
19. Data in Motion in the Healthcare Industry
Your business as streams of events, powered by Kafka: emergency situation → ambulance → patient diagnosis → surgery → contact relatives → insurance claim processing.
20. An Event Streaming Platform is the Underpinning of an Event-driven Architecture
Producers (MES, ERP, sensors, mobile apps) publish streams of real-time events – supplier, alert, forecast, inventory, customer, order – through connectors and stream processing apps; consumers such as a Customer 360, a real-time alerting system, and a data warehouse subscribe via further stream processing apps and connectors.
21. With Data in Motion…
A universal event pipeline connects data stores, logs, 3rd-party apps, and custom apps/microservices (Hadoop, device logs, mainframes, data warehouse, Splunk, …) to contextual event-driven applications such as supply chain management, medical fraud detection, patient & beneficiary 360, disease spread modeling, and HL7 data transformation.
22. Public Health Data Automation in Confluent
Source connectors (CDC, MQ, REST Proxy, EDI/batch input) bring claims and clinical data from legacy data storage and processing into Kafka; Schema Registry and ksqlDB / Kafka Streams handle processing; HL7-FHIR microservices and sink connectors deliver the results to analytics sinks.
23. Example: Benefits application process
Software-using (weeks): the beneficiary submits a benefits application through form intake, a case manager performs the application review, and the application is approved or denied.
Software-defined (seconds): the beneficiary applies through a benefits app UI, and a benefits service, a risk/fraud service, and an external agency service process the application in real time before it is approved or denied.
24. Use Cases for Data in Motion in the Healthcare Industry
Know Your Patient (= “Customer 360”)
● Digital Transformation
● eCommerce Optimization
● Product Catalog Optimization
● Product-Inventory Profiling and Filtering by Customer or Persona
● Real-time Pricing Models
● Next Best Offer / Cross-Sell / Recommendations
● Omni-Channel Experience
● Customer Profile Updates
● …
Operations (Healthcare 4.0 including Drug R&D, Patient Care, etc.)
● Supply Chain Optimization
● Shipment Notifications/Delays
● Inventory Processing and Oversight
● Predictive Inventory Management
● Connected Health
● Improved Care
● Proactive Patient Care
● Patient Notifications
● Pharma Modernization
● M&A Rapid Integration
● …
IT Perspective
● Cybersecurity / SIEM Optimization
● Mainframe Offload
● Hybrid Cloud Integration / Bridge to Cloud
● Middleware / Messaging Modernization
● Streaming ETL & Analytics
● …
25. Real World Deployments
26. Data in Motion across the Healthcare Value Chain
1. Legacy Modernization and Hybrid Cloud
2. Streaming ETL
3. Real-time Analytics
4. Machine Learning and Data Science
5. Open API and Omnichannel
27. Data in Motion across the Healthcare Value Chain: 1. Legacy Modernization and Hybrid Cloud
28. Optum – Self-Service Kafka
American pharmacy benefit manager and health care provider (subsidiary of UnitedHealth Group).
Kafka as a Service within UnitedHealth Group, centrally managed and utilized by over 200 internal application teams.
A repeatable, scalable, cost-efficient way to standardize data – from mainframe via CDC into modern data processing and analytics tools.
29. Centene – Integration and Data Processing at Scale in Real Time
Health insurer that acts as an intermediary for both government-sponsored and privately insured health care programs.
Largest Medicaid and Medicare managed care provider in the US.
https://www.confluent.io/online-talks/building-an-enterprise-eventing-framework-on-demand/
30. Bayer AG – Hybrid Real-Time Data Flow
Adopted a cloud-first strategy and started a multi-year transition to the cloud.
A Kafka-based cross-datacenter DataHub was created to facilitate the migration and to drive the shift to real-time stream processing.
Strong enterprise adoption, supporting a myriad of use cases.
https://www.confluent.io/kafka-summit-sf18/bringing-streaming-data-to-the-masses
31. Data in Motion across the Healthcare Value Chain: 2. Streaming ETL
32. Bayer AG – Data Integration and Processing in R&D
Analysis of clinical trials, patents, reports, news, literature, etc.: 250M documents, 7TB of raw text from 30 data sources.
A variety of document streams with different formats and schemas flows through several text processing and enrichment steps.
Scalable, reliable Kafka pipelines built with Kafka Streams (Java) and Faust (Python) replaced custom, error-prone, non-scalable scripts.
https://www.kafka-summit.org/sessions/bayer-document-stream-pipelines
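The shape of such a pipeline – consume, transform, enrich, produce – can be sketched with chained Python generators. This is only an illustrative stand-in for a Kafka Streams or Faust topology, with a made-up `tagger` step in place of real text mining:

```python
# Hypothetical document-enrichment pipeline as chained generators, mirroring the
# consume -> transform -> enrich -> produce shape of a streaming ETL topology.
def parse(raw_docs):
    """Normalize raw documents into structured records."""
    for doc in raw_docs:
        yield {"text": doc.strip().lower()}

def enrich(docs, tagger):
    """Attach tags produced by a text-processing step to each record."""
    for doc in docs:
        yield {**doc, "tags": tagger(doc["text"])}

def tagger(text):
    # Stand-in for a real text-mining step over trials, patents, reports, ...
    return [w for w in ("trial", "patent") if w in text]

raw = ["  Phase III TRIAL results ", "New PATENT filing "]
out = list(enrich(parse(raw), tagger))
print(out[0]["tags"])  # ['trial']
```

In a real Faust or Kafka Streams deployment each stage would read from and write to Kafka topics, which is what makes the pipeline restartable and scalable, unlike the one-shot scripts it replaced.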
33. Babylon Health – Secure and Agile Integration
Connectivity plus an agile microservice architecture.
GDPR- and PII-compliant security.
https://www.confluent.io/kafka-summit-lon19/one-key-to-rule-them-all
34. Data in Motion across the Healthcare Value Chain: 3. Real-time Analytics
35. Cerner – Sepsis Alerting
Supplier of health information technology services, devices, and hardware.
~30% of all US healthcare data lives in a Cerner solution.
Central event streaming platform for real-time sepsis alerting to save lives.
36. Celmatix – Reproductive Health Care
Preclinical-stage biotech company that provides digital tools and genetic insights focused on fertility.
Personalized information to disrupt how women approach their lifelong reproductive health journey.
Real-time aggregation of heterogeneous data collected from Electronic Medical Records (EMRs) and genetic data collected from partners through their Personalized Reproductive Medicine (PReM) Initiative.
Proactive reproductive health decisions by leveraging real-time genomics data and applying technologies such as big data analytics, machine learning, AI, and whole-genome DNA sequencing.
Data governance for security and compliance.
https://www.confluent.io/customers/celmatix/
37. Centers for Disease Control and Prevention (CDC): COVID-19 Electronic Lab Reporting
CELR (COVID Electronic Lab Reporting): case notifications, lab reporting, and healthcare interoperability in real time.
Track the threat of the COVID-19 virus to provide comprehensive data for local, state, and federal response, and better understand locations with an increase in incidence.
Rapidly aggregate, validate, transform, and distribute laboratory testing data submitted by public health departments and other partners.
https://www.confluent.io/resources/kafka-summit-2020/flattening-the-curve-with-kafka/
38. Data in Motion across the Healthcare Value Chain: 4. Machine Learning and Data Science
39. Recursion – Discovering Drugs in Real Time
Accelerates drug discovery: finds drug treatments by processing biological images in a massively parallel system.
Combines experimental biology, artificial intelligence, automation, and real-time event streaming.
https://www.confluent.io/customers/recursion
https://www.confluent.io/kafka-summit-san-francisco-2019/discovering-drugs-with-kafka-streams
40. Humana – Real-Time Integration and Analytics
Interoperability platform to transition from an insurance company with elements of health to truly a health company with elements of insurance.
Consumer-centric, health-plan agnostic, provider agnostic; cloud resilient and elastic; event-driven and real-time.
Inter-organization data sharing (aka data exchange).
Use cases include real-time updates of health information (connecting HCPs to pharmacies), reducing pre-authorizations from 20–30 minutes to 1 minute, and real-time home healthcare assistant communication.
https://www.confluent.io/resources/kafka-summit-2020/levi-bailey-keynote-humana-improving-health-with-event-driven-architectures/
41. Data in Motion across the Healthcare Value Chain: 5. Open API and Omnichannel
42. Care.com – Trusted Caregivers
Online marketplace for a range of care services, including senior care and housekeeping.
The Bravo Platform provides a simple, unified IT architecture to streamline go-to-market initiatives.
From a monolithic architecture to a truly decoupled, scalable microservices platform.
Migration from Confluent Platform to Confluent Cloud to focus on business problems.
Data governance with Schema Registry across different runtimes (Java, .NET, Go, etc.).
“Care APIs” (inspired by Google APIs) define all of their data and service contracts with Protobuf.
Enhanced security for PII data with fine-grained RBAC and data lineage.
https://www.confluent.io/customers/care-com/
43. Invitae – Data Science and 24/7 Production
Biotechnology company that provides DNA-based testing for the detection of genetic abnormalities beyond what can be identified through traditional methodologies.
Gene panels and single-gene testing for a broad range of clinical areas, including hereditary cancer, cardiology, neurology, pediatric genetics, metabolic disorders, immunology, and hematology.
Brings comprehensive genetic information into mainstream medical practice to improve the quality of healthcare for billions of people.
Omnichannel: genetic results are often just the beginning – Invitae’s interactive, educational portal and caring genetic counselors help patients understand their results and what to do next.
Truly decoupled infrastructure that enables others to join in and consume the data.
Paradigm shift: building an application entirely of streams.
https://www.confluent.io/kafka-summit-san-francisco-2019/from-zero-to-streaming-healthcare-in-production
44. What is Data Streaming with the Apache Kafka Ecosystem?
45. Data in Motion with Apache Kafka in the Healthcare Industry – @KaiWaehner - www.kai-waehner.de
Kafka: The Trinity of Event Streaming
01 Publish & Subscribe to Streams of Events
02 Store your Event Streams
03 Process & Analyze your Event Streams
46. Data in Motion with Apache Kafka in the Healthcare Industry – @KaiWaehner - www.kai-waehner.de
Kafka Makes Your Business Real-time
CREATE STREAM payments (user VARCHAR, amount INT)
  WITH (kafka_topic = 'all_payments', value_format = 'avro');

CREATE TABLE credit_scores AS
  SELECT user, updateScore(p.amount) AS credit_score
  FROM payments AS p
  GROUP BY user
  EMIT CHANGES;

(ksqlDB streams feeding the credit service and the risk service)
47. Data in Motion with Apache Kafka in the Healthcare Industry – @KaiWaehner - www.kai-waehner.de
Why can’t I do this with my existing data platforms?
• Databases
• Messaging
• ETL / Data Integration
• Data Warehouse
48. Data in Motion with Apache Kafka in the Healthcare Industry – @KaiWaehner - www.kai-waehner.de
Enterprise Data Platform Requirements Are Shifting
1. Built for Historical Data → Built for Real-Time Events
   Value: Trigger real-time workflows (i.e. real-time order management)
2. Scalable for Transactional Data → Scalable for ALL Data
   Value: Scale across the enterprise (i.e. customer 360)
3. Transient Raw Data → Persistent + Durable
   Value: Build mission-critical apps with zero data loss (i.e. instant payments)
4. Raw Data → Enriched Data
   Value: Add context & situational awareness (i.e. ride sharing ETA)
49. Data in Motion with Apache Kafka in the Healthcare Industry – @KaiWaehner - www.kai-waehner.de
Only Event Streaming Has All 4 Requirements
50. Data in Motion with Apache Kafka in the Healthcare Industry – @KaiWaehner - www.kai-waehner.de
Only Event Streaming Has All 4 Requirements
(Built for Real-Time Events · Scalable for ALL Data · Persistent & Durable · Capable of Enrichment)
• Databases: good for transactional applications
• Messaging: good for ultra low-latency, fire-and-forget use cases
• ETL / Data Integration: good for batch data integration
• Data Warehouse: good for historical analytics and reporting
• Event Streaming: meets all four requirements as a platform for event-driven transformation
  (Scalable Messaging + Real-Time Data Integration + Stream Processing)
51. Data in Motion with Apache Kafka in the Healthcare Industry – @KaiWaehner - www.kai-waehner.de
Project Example:
Drug Discovery
52. Data in Motion with Apache Kafka in the Healthcare Industry – @KaiWaehner - www.kai-waehner.de
Use Case: Drug Discovery
“On average, it takes at least ten
years for a new medicine to
complete the journey from initial
discovery to the marketplace”
PhRMA
http://phrma-docs.phrma.org/sites/default/files/pdf/rd_brochure_022307.pdf
53. Data in Motion with Apache Kafka in the Healthcare Industry – @KaiWaehner - www.kai-waehner.de
Recursion – Discovering Drugs in Real-Time
Accelerate drug discovery.
Find drug treatments by processing biological images.
Massively parallel system.
Combines experimental biology, artificial intelligence,
automation and real-time event streaming.
https://www.confluent.io/customers/recursion
https://www.confluent.io/kafka-summit-san-francisco-2019/discovering-drugs-with-kafka-streams
54. Data in Motion with Apache Kafka in the Healthcare Industry – @KaiWaehner - www.kai-waehner.de
Image and Video Processing
… (at a high level) is “just” pixels (arrays of 0s and 1s) and matrix multiplication
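To make the “pixels and matrix multiplication” point concrete, here is a minimal, self-contained sketch with illustrative values: a grayscale image as a 2-D array, and an edge-detecting convolution expressed as repeated element-wise matrix math over sliding windows.

```python
import numpy as np

# A grayscale "image" is just a 2-D array of pixel intensities.
image = np.array([
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [1, 1, 0, 0],
    [1, 1, 0, 0],
], dtype=float)

# A 2x2 edge-detection kernel; convolution is element-wise
# multiplication plus summation over every sliding window.
kernel = np.array([[1, -1],
                   [-1, 1]], dtype=float)

def convolve2d(img, k):
    kh, kw = k.shape
    oh, ow = img.shape[0] - kh + 1, img.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * k)
    return out

edges = convolve2d(image, kernel)
print(edges.shape)  # (3, 3); large magnitudes mark edges in the image
```

Real pipelines use optimized libraries (OpenCV, TensorFlow) for the same operation, but the underlying arithmetic is exactly this.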
55. Data in Motion with Apache Kafka in the Healthcare Industry – @KaiWaehner - www.kai-waehner.de
Drug Discovery
in manual, slow, bursty batch mode: not scalable
56. Data in Motion with Apache Kafka in the Healthcare Industry – @KaiWaehner - www.kai-waehner.de
Drug Discovery
in automated, scalable, reliable real-time mode
57. Data in Motion with Apache Kafka in the Healthcare Industry – @KaiWaehner - www.kai-waehner.de
Digital Image Processing for Drug Discovery
Find drug treatments by processing biological images:
• ML models can be trained to distinguish healthy cells from diseased cells with problematic genes
• Grow healthy cells and diseased cells in the lab
• Apply different drugs → make diseased cells look healthy again
58. Data in Motion with Apache Kafka in the Healthcare Industry – @KaiWaehner - www.kai-waehner.de
Kafka, ksqlDB and TensorFlow for Drug Discovery in Real Time at Scale

Confluent Platform:
• Kafka Client (.NET / C++) in the laboratory (Windows machines), producing images
• Confluent Server with Tiered Storage
• Kafka Connect (incl. Oracle CDC connector for historical drugs data)
• Streaming ETL (ksqlDB)
• Stateful Workflow Orchestration (Kafka Streams)

Other Components:
• Digital Image Processing (OpenCV SaaS service via REST API), consuming images and producing processed images
• Model Training and Scoring (Python client + TensorFlow)
• Database (MySQL)
• Batch Reporting Platform and BI Dashboard, consuming all data
• Human Intelligence reviewing results
59. Data in Motion with Apache Kafka in the Healthcare Industry – @KaiWaehner - www.kai-waehner.de
Ingestion of Images
Laboratory → Kafka Connect → replication to the central cluster via Cluster Linking
60. Data in Motion with Apache Kafka in the Healthcare Industry – @KaiWaehner - www.kai-waehner.de
Data Preprocessing
Preprocessing with Kafka Streams:
filter, transform, anonymize, extract features,
reduce noise, enhance brightness / contrast
→ data ready for model training
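A minimal sketch of the preprocessing stage, assuming illustrative operations and values: each raw image is denoised with a mean filter and contrast-stretched before being published to the "ready for training" topic. The function names and the 3x3 filter are stand-ins, not the production code.

```python
import numpy as np

def preprocess(image):
    """Denoise and contrast-stretch one raw image (illustrative steps)."""
    img = image.astype(float)
    # Reduce noise with a simple 3x3 mean filter (edges padded).
    padded = np.pad(img, 1, mode="edge")
    denoised = np.zeros_like(img)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            denoised[i, j] = padded[i:i + 3, j:j + 3].mean()
    # Enhance contrast by stretching intensities to the full [0, 1] range.
    lo, hi = denoised.min(), denoised.max()
    return (denoised - lo) / (hi - lo) if hi > lo else denoised

raw = np.array([[10, 12, 11],
                [13, 50, 12],   # 50 is a bright noise spike
                [11, 12, 10]])
ready = preprocess(raw)
print(ready.min(), ready.max())  # 0.0 1.0
```

In the streaming version, this function body becomes the transformation applied per record; the stateless nature of the steps is what makes them easy to express in ksqlDB or Kafka Streams.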
61. Data in Motion with Apache Kafka in the Healthcare Industry – @KaiWaehner - www.kai-waehner.de
Data Processing with ksqlDB

SELECT image_id, experiment_id, image_details
FROM image_channel i
LEFT JOIN experiment_database e
  ON i.experiment_id = e.experiment_id
WHERE e.image_type = 'black_and_white';
62. Data in Motion with Apache Kafka in the Healthcare Industry – @KaiWaehner - www.kai-waehner.de
Streaming Ingestion and Model Training with TensorFlow I/O
Direct streaming ingestion for model training and / or scoring
with the TensorFlow I/O Kafka plugin
(no additional data storage like S3 or HDFS required!)
Producer → distributed commit log → models A and B consume over time
https://github.com/tensorflow/io
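The "train directly from the stream, no intermediate storage" pattern can be simulated without a Kafka broker or the tensorflow-io plugin. In this sketch a generator stands in for Kafka mini-batches and a linear model is updated incrementally; all names and numbers are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
w = np.zeros(2)  # linear model weights, updated batch by batch

def stream_batches(n_batches, batch_size=32):
    """Stand-in for a Kafka topic: yields (features, labels) mini-batches."""
    true_w = np.array([2.0, -1.0])  # the relationship the model should learn
    for _ in range(n_batches):
        X = rng.normal(size=(batch_size, 2))
        yield X, X @ true_w

# Incremental training: consume each batch once, update, discard.
# No batch is ever written to S3/HDFS first.
for X, y in stream_batches(200):
    grad = X.T @ (X @ w - y) / len(y)  # squared-error gradient
    w -= 0.1 * grad

print(np.round(w, 2))  # converges toward [2.0, -1.0]
```

With tensorflow-io, the generator is replaced by a Kafka-backed dataset, but the training loop keeps the same one-pass, storage-free shape.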
63. Data in Motion with Apache Kafka in the Healthcare Industry – @KaiWaehner - www.kai-waehner.de
Confluent Tiered Storage for Kafka
64. Data in Motion with Apache Kafka in the Healthcare Industry – @KaiWaehner - www.kai-waehner.de
Use Cases for Reprocessing Historical Events
“Give me all events from time A to time B”
• New consumer application
• Error-handling
• Compliance / regulatory processing
• Query and analyze existing events
• Schema changes in analytics platform
• Model training
(The real-time producer and real-time consumers keep running while a consumer of historical data replays the log.)
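A toy sketch of “give me all events from time A to time B”, simulating the log in memory; the event payloads and timestamps are invented for illustration.

```python
from dataclasses import dataclass

@dataclass
class Event:
    timestamp: int  # epoch millis, as carried on a Kafka record
    payload: str

# Kafka retains the log (with Tiered Storage, cheaply and long-term),
# so a new consumer can replay any time window after the fact.
log = [Event(1000, "img-1"), Event(2000, "img-2"),
       Event(3000, "img-3"), Event(4000, "img-4")]

def replay(log, start_ms, end_ms):
    """Return payloads of all events whose timestamps fall in [A, B]."""
    return [e.payload for e in log if start_ms <= e.timestamp <= end_ms]

print(replay(log, 2000, 3000))  # ['img-2', 'img-3']
```

Against a real cluster, the equivalent is to look up the offsets for the start timestamp (e.g. with the consumer's offsets_for_times lookup in the Confluent Python client) and consume forward until records pass time B.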
65. Data in Motion with Apache Kafka in the Healthcare Industry – @KaiWaehner - www.kai-waehner.de
Separation of Model Training and Model Inference
• Model training in the cloud
• Model deployment at the edge
• Local predictions with the analytic model
66. Data in Motion with Apache Kafka in the Healthcare Industry – @KaiWaehner - www.kai-waehner.de
Stream Processing with External Model and RPC
The stream processing application (e.g. Kafka Streams) takes an input event, sends a prediction request to the model serving layer (e.g. TensorFlow Serving) via gRPC / HTTP, and continues processing with the response.
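TensorFlow Serving's REST predict endpoint has the documented form http://&lt;host&gt;:8501/v1/models/&lt;model_name&gt;:predict with a JSON body of {"instances": [...]}. The sketch below only constructs the request; the host, port and model name are placeholders, and the actual HTTP call (which a stream processor would make per event) is left out.

```python
import json

def build_predict_request(model_name, instances, host="tf-serving", port=8501):
    """Build the URL and JSON body for a TensorFlow Serving REST predict call.

    host / model_name are illustrative placeholders; in production they
    come from the deployment's service discovery and model registry.
    """
    url = f"http://{host}:{port}/v1/models/{model_name}:predict"
    body = json.dumps({"instances": instances})
    return url, body

# One prediction request for two (toy) feature rows:
url, body = build_predict_request("cell_classifier", [[0.1, 0.9], [0.8, 0.2]])
print(url)
```

The trade-off versus the embedded-UDF approach on the next slide: RPC keeps model serving independently scalable and versioned, at the cost of a network hop per event.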
67. Data in Motion with Apache Kafka in the Healthcare Industry – @KaiWaehner - www.kai-waehner.de
Embedded Model Deployment with Apache Kafka, ksqlDB and TensorFlow

User Defined Function (UDF):
CREATE STREAM ImageAnalysis AS
  SELECT image_id, analyzeImage(image_details)
  FROM image_channel;
68. Data in Motion with Apache Kafka in the Healthcare Industry – @KaiWaehner - www.kai-waehner.de
Model Training and Scoring
with the same ML Pipeline (or even in the same Application)
• Data Science team responsible for the whole model lifecycle
• Beloved Python tool stack (Pandas, scikit-learn, TensorFlow, Jupyter, …)
• 24/7 production scale with Confluent Python Client (e.g. deployed in Docker containers on Kubernetes)
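A sketch of the configuration a 24/7 Python scoring service might pass to the Confluent Python client's Consumer. The group id is an assumption made for illustration; the keys themselves are standard librdkafka settings.

```python
def scoring_consumer_config(bootstrap_servers):
    """Consumer settings for an always-on scoring service (illustrative)."""
    return {
        "bootstrap.servers": bootstrap_servers,
        "group.id": "drug-discovery-scoring",  # hypothetical group id
        "enable.auto.commit": False,           # commit only after a score is produced
        "auto.offset.reset": "earliest",       # replay history on first start
    }

config = scoring_consumer_config("broker-1:9092")
print(sorted(config))
```

Disabling auto-commit and committing manually after each successful score is what gives the "zero data loss" behavior the deck emphasizes: a crashed container on Kubernetes simply resumes from the last committed offset.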
69. Data in Motion with Apache Kafka in the Healthcare Industry – @KaiWaehner - www.kai-waehner.de
Kafka, ksqlDB and TensorFlow for Drug Discovery in Real Time at Scale

Confluent Platform:
• Kafka Client (.NET / C++) in the laboratory (Windows machines), producing images
• Confluent Server with Tiered Storage
• Kafka Connect (incl. Oracle CDC connector for historical drugs data)
• Streaming ETL (ksqlDB)
• Stateful Workflow Orchestration (Kafka Streams)

Other Components:
• Digital Image Processing (external SaaS service + REST), consuming images and producing processed images
• Model Training and Scoring (Python client + TensorFlow)
• Database (MySQL)
• Batch Reporting Platform and BI Dashboard, consuming all data
• Human Intelligence reviewing results
70. Data in Motion with Apache Kafka in the Healthcare Industry – @KaiWaehner - www.kai-waehner.de
Data in Motion Is The Future Of Data
• Cloud: infrastructure as code is the future of the datacenter
• Event Streaming: data in motion as continuous streams of events is the future of data
71. Data in Motion with Apache Kafka in the Healthcare Industry – @KaiWaehner - www.kai-waehner.de
Why Confluent?
72. Data in Motion with Apache Kafka in the Healthcare Industry – @KaiWaehner - www.kai-waehner.de
The Rise of Data in Motion
• 2010: Apache Kafka created at LinkedIn by Confluent’s founders
• 2014: Confluent founded
• 2020: 80% of the Fortune 100 companies trust and use Apache Kafka
73. Data in Motion with Apache Kafka in the Healthcare Industry – @KaiWaehner - www.kai-waehner.de
Event Streaming Maturity Model (value grows with investment & time)
1. Initial Awareness / Pilot (1 Kafka cluster)
2. Start to Build Pipeline / Deliver 1 New Outcome (1 Kafka cluster)
3. Mission-Critical Deployment (stretched, hybrid, multi-region)
4. Build Contextual Event-Driven Apps (stretched, hybrid, multi-region)
5. Central Nervous System (global Kafka)
Supported by product, support, training, partners, technical account management, …
74. Data in Motion with Apache Kafka in the Healthcare Industry – @KaiWaehner - www.kai-waehner.de
Car engine (Apache Kafka) → complete car (Confluent Platform) → self-driving car (Confluent Cloud)
Confluent completes Apache Kafka. Cloud-native. Everywhere.