©2022, Imply
The Trifecta of Real-Time Applications:
Apache Kafka, Flink, and Druid
Speakers
Gian Merlino
Co-founder and CTO of Imply
PMC Chair of Apache Druid
Kai Wähner
Field CTO of Confluent
Agenda
1. Overview of real-time applications
2. Kafka-Flink-Druid data architecture
3. Introduction to Apache Flink + use cases
4. Introduction to Apache Druid + use cases
5. Kafka-Flink-Druid case studies
6. Confluent and Imply integration
7. Q&A
Data applications are going real-time
REAL-TIME APPLICATION
● >5 million events per second
● Interactive queries
● Customer-facing
● 250 queries per second
● Complex analytics
● Operational visibility
REAL-TIME APPLICATION
● Real-time monitoring
● Real-time alerting
● Real-time decisioning
● Interactive queries
● High dimensionality
REAL-TIME APPLICATION
● Real-time decisioning
● API-driven interactive queries
● Customer-facing
● 25 million users
● <100 ms query latency
● 100+ QPS
Building real-time apps with batch workflows doesn’t work
1. Data Collection
2. Data Processing
3. Data Ingestion
4. Data Analysis
5. Data Presentation
[Diagram: each of the five stages is followed by "Waiting…"]
Latency measured in hours-to-days
Open source real-time data architecture
[Diagram: Data Sources → Streaming Pipeline (Events/Messages) → Stream Processing (Enrichment / Transformation) → Real-Time Analytics (+ Historical Data / Context) → Real-Time Applications (Monitor/Alerting, Analytics, Visualization/Decisioning)]
A common real-time KFD pattern
Source
timestamp            sensor_id  temperature
2023-07-10T10:00:00  SensorA    22.5
2023-07-10T10:01:00  SensorB    18.2
2023-07-10T10:02:00  SensorC    65.5
timestamp            sensor_id  location
2023-07-10T10:00:00  SensorA    Room 101
2023-07-10T10:01:00  SensorB    Room 101
2023-07-10T10:02:00  SensorC    Room 101
timestamp            sensor_id  location  temperature
2023-07-10T10:00:00  SensorA    Room 101  22.5
2023-07-10T10:01:00  SensorB    Room 101  18.2
2023-07-10T10:02:00  SensorC    Room 101  65.5
Current: 65.5
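The pattern above, where temperature readings are enriched with a location keyed on sensor_id, can be sketched in plain Python. This only illustrates the join logic and is not Flink code; the function name enrich and the in-memory lists standing in for the two source streams are assumptions made for the example.

```python
# Hypothetical in-memory stand-ins for the two source streams on the slide.
readings = [
    {"timestamp": "2023-07-10T10:00:00", "sensor_id": "SensorA", "temperature": 22.5},
    {"timestamp": "2023-07-10T10:01:00", "sensor_id": "SensorB", "temperature": 18.2},
    {"timestamp": "2023-07-10T10:02:00", "sensor_id": "SensorC", "temperature": 65.5},
]
locations = [
    {"timestamp": "2023-07-10T10:00:00", "sensor_id": "SensorA", "location": "Room 101"},
    {"timestamp": "2023-07-10T10:01:00", "sensor_id": "SensorB", "location": "Room 101"},
    {"timestamp": "2023-07-10T10:02:00", "sensor_id": "SensorC", "location": "Room 101"},
]


def enrich(readings, locations):
    """Join each reading with the location record sharing its sensor_id."""
    # Build a lookup table once, then enrich readings one at a time,
    # the way a stream processor would handle events as they arrive.
    location_by_sensor = {row["sensor_id"]: row["location"] for row in locations}
    for reading in readings:
        yield {
            "timestamp": reading["timestamp"],
            "sensor_id": reading["sensor_id"],
            "location": location_by_sensor.get(reading["sensor_id"]),
            "temperature": reading["temperature"],
        }


enriched = list(enrich(readings, locations))
# The most recent enriched event carries the "current" temperature.
current = enriched[-1]["temperature"]
```

In a real deployment the lookup side would itself be a changing stream or table, so the join would need to handle late or missing location records; the sketch uses `.get` so that a missing location yields `None` rather than an error.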
Introduction to Apache Flink
Flink growth has mirrored the growth of Kafka, the de facto standard for streaming data
● >75% of the Fortune 500 estimated to be using Kafka
● >100,000 orgs using Kafka
● >41,000 Kafka meetup attendees
● >750 Kafka Improvement Proposals
● >12,000 Jiras for Apache Kafka
[Chart: "Two Apache Projects, Born a Few Years Apart". Monthly Unique Users (0 to 150,000) for Flink and Kafka; time axes 2016 to 2018 and 2020 to 2022]
Developers choose Flink because of its performance and rich feature set
● Scalability and Performance: Flink is capable of supporting stream processing workloads at tremendous scale
● Fault Tolerance: Flink's fault tolerance mechanisms ensure it can handle failures effectively and provide high availability
● Language Flexibility: Flink supports Java, Python, & SQL, enabling developers to work in their language of choice
● Unified Processing: Flink supports stream processing, batch processing, and ad-hoc analytics through one technology
Flink is a top 5 Apache project and boasts a robust developer community
Flink supports unified stream and batch processing
Streaming:
● Entire pipeline must always be running
● Input must be processed as it arrives
● Results are reported as they become ready
● Failure recovery resumes from a recent snapshot
● Flink guarantees effectively exactly-once results despite out-of-order data and restarts due to failures, etc.
Batch:
● Execution proceeds in stages, running as needed
● Input may be pre-sorted by time and key
● Results are reported at the end of the job
● Failure recovery does a reset and full restart
● Effectively exactly-once guarantees are more straightforward
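The idea of resuming from a recent snapshot can be illustrated with a toy Python simulation. This is a sketch of the general technique (saving the input offset together with the state), not Flink's actual checkpointing implementation; the function name count_with_recovery and its parameters are assumptions for the example.

```python
from collections import Counter


def count_with_recovery(events, checkpoint_every=3, fail_at=None):
    """Count events per key, optionally simulating one failure at offset fail_at.

    A snapshot stores the next offset to read together with the state at
    that point, so a restart replays only the events after the snapshot.
    """
    snapshot = (0, Counter())        # (offset, state) as of the last checkpoint
    offset, state = 0, Counter()
    failed = False
    while offset < len(events):
        if fail_at is not None and offset == fail_at and not failed:
            # Simulated crash: discard in-flight state, restore the snapshot.
            failed = True
            offset, state = snapshot[0], Counter(snapshot[1])
            continue
        state[events[offset]] += 1
        offset += 1
        if offset % checkpoint_every == 0:
            snapshot = (offset, Counter(state))
    return dict(state)


events = ["a", "b", "a", "c", "a"]
without_failure = count_with_recovery(events)
with_failure = count_with_recovery(events, fail_at=4)
```

Some events are processed twice after the restart, but because offset and state are restored together, each event is reflected in the result once. That is the sense of "effectively exactly-once" used above.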
Effortlessly filter, join, and enrich your data streams with Apache Flink
Real-time processing
Power low-latency applications and pipelines that react to
real-time events and provide timely insights
Data reusability
Share consistent and reusable data streams widely with
downstream applications and systems
Data enrichment
Curate, filter, and augment data on-the-fly with additional
context to improve completeness, accuracy, & compliance
Efficiency
Improve resource utilization and cost-effectiveness by
avoiding redundant processing across silos
“With Confluent’s fully managed Flink offering, we can access, aggregate, and enrich data from IoT sensors,
smart cameras, and Wi-Fi analytics, to swiftly take action on potential threats in real time, such as intrusion
detection. This enables us to process sensor data as soon as the events occur, allowing for faster detection and
response to security incidents without any added operational burden.”
Process data streams in-flight to maximize actionability, fidelity, and portability
[Diagram: systems shown are blob storage, 3rd party app, databases, data warehouse, database, and SaaS app. Outcomes shown are low latency apps and data pipelines; consistent, reusable data products; optimized resource utilization]
Enrich real-time data streams with Generative
AI directly from Flink SQL
INSERT INTO enriched_reviews
SELECT id,
       review,
       invoke_openai(prompt, review) AS score
FROM product_reviews;
The Prompt
“Score the following text on a scale of 1
and 5 where 1 is negative and 5 is
positive returning only the number”
[Illustration: product reviews pass through the DATA STREAMING PLATFORM and come out with a star score attached]
● Kate, 4 hours ago: This was the worst decision ever.
● Nikola, 1 day ago: Not bad. Could have been cheaper.
● Brian, 3 days ago: Amazing! Game Changer!
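The shape of this enrichment step can be sketched in Python. The sketch does not call any model: stub_score is a hypothetical keyword heuristic standing in for the invoke_openai call in the SQL above, and the review ids are invented for illustration.

```python
PROMPT = (
    "Score the following text on a scale of 1 and 5 where 1 is negative "
    "and 5 is positive returning only the number"
)

# Reviews from the slide; the ids are assumptions added for the example.
product_reviews = [
    {"id": 1, "review": "This was the worst decision ever."},
    {"id": 2, "review": "Not bad. Could have been cheaper."},
    {"id": 3, "review": "Amazing! Game Changer!"},
]


def stub_score(prompt, review):
    """Placeholder for a model call so the sketch runs offline."""
    text = review.lower()
    if "worst" in text:
        return 1
    if "amazing" in text:
        return 5
    return 3


def enrich_reviews(reviews, score_fn=stub_score):
    """Mirror the SELECT: keep id and review, add a score column."""
    return [
        {"id": r["id"], "review": r["review"], "score": score_fn(PROMPT, r["review"])}
        for r in reviews
    ]


enriched_reviews = enrich_reviews(product_reviews)
```

Passing the scoring function as a parameter keeps the enrichment logic separate from whichever model or service ends up producing the score.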
Introduction to Apache Druid
Druid is built for the intersection of analytics and applications.
[Diagram: Apache Druid at the overlap of Analytics and Applications]
Apache Druid is a real-time analytics database
For analytics applications that require:
1. Sub-second queries at massive scale: interactive analytics on TB-PBs of data
2. High concurrency at low cost: 1000s QPS via highly efficient distributed query engine
3. Real-time and historical insights: true stream ingestion for Kafka and Kinesis
Plus, non-stop reliability with automated fault tolerance and continuous backup
(Example of a Druid-powered UX)
Druid’s stream ingestion scales right with Kafka
[Diagram: Producers write to a Topic (Partition 0, Partition 1, Partition 2, … Partition N). Real-Time Ingestion (Ingestion Task 0, Ingestion Task 1, … Ingestion Task M) reads from the partitions. Query is shown alongside three Historical nodes]
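To make the partition-to-task relationship concrete, here is a small Python sketch that spreads N topic partitions over M ingestion tasks. The modulo assignment and the function name assign_partitions are assumptions chosen for simplicity; the slide does not describe the assignment Druid uses.

```python
def assign_partitions(num_partitions, num_tasks):
    """Spread partitions 0..num_partitions-1 across tasks 0..num_tasks-1."""
    assignment = {task: [] for task in range(num_tasks)}
    for partition in range(num_partitions):
        # Illustrative rule: partition p goes to task p mod num_tasks.
        assignment[partition % num_tasks].append(partition)
    return assignment


# Five partitions over two tasks.
example = assign_partitions(5, 2)
```

With this rule, adding tasks reduces the number of partitions each task reads until there is one task per partition; tasks beyond that point receive nothing.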
Apache Druid is a database built for data in motion
● Event-based ingestion
● No connector needed
● High EPS scalability
● Schema auto-detection
● Guaranteed consistency
● Continuous backup
● Real-time insights with flexible historical queries
Real-time decisioning:
For use cases where instant query response powers automated rules engines and ML frameworks, including real-time decisions and recommendations
Rapid data exploration:
For use cases that require instant response on interactive, ad-hoc queries at scale on high-dimensional data such as root cause diagnostics, ML training, and investigation.
External-facing analytics:
For use cases where analytics are being delivered to external stakeholders as a product or as a value add with strict SLAs for performance under load and resiliency
Operational visibility at scale:
For use cases that require real-time insights on big, fast-moving event streams like observability, product analytics, clickstream, IoT, and fraud detection
A high-performance, real-time analytics database
Just a few examples of the 1000s of Druid users
Industries: Adtech, Fintech, Gaming, Entertainment, Retail, eCommerce, Supply chain, Logistics, Healthtech
Use cases: Operational visibility at scale, External-facing analytics, Rapid data exploration, Real-time decisioning
Selecting the right tool for the job
Apache Druid complements your analytics stack
Apache Druid (Real-Time Analytics Database)
Ideal workload:
● High QPS application/API usage
● Strict latency requirements
● Less elastic
Technical properties:
● Scatter-gather and shuffling engines
● Leverage broad array of indexes
● Guaranteed data caching
Snowflake/BigQuery/Trino (Cloud Data Warehouses + Query Engines)
Ideal workload:
● Ad-hoc, low concurrency usage
● Loose latency requirements
● Highly elastic
Technical properties:
● Shuffling engines
● Focus on sequential data access
● Opportunistic data caching
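The scatter-gather idea mentioned above can be sketched in a few lines of Python: a query is fanned out to several data segments, each returns a partial aggregate, and the partials are merged into the final answer. The segment contents and function names are assumptions for illustration and do not reflect Druid internals.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical segments, each holding a slice of temperature values.
segments = [[22.5, 18.2], [65.5], [20.0, 21.0, 19.0]]


def partial_aggregate(segment):
    """Scatter step: each segment returns (sum, count), not raw rows."""
    return (sum(segment), len(segment))


def scatter_gather_avg(segments):
    """Gather step: merge the partial aggregates into one average."""
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(partial_aggregate, segments))
    total = sum(s for s, _ in partials)
    count = sum(n for _, n in partials)
    return total / count


average = scatter_gather_avg(segments)
```

Returning (sum, count) pairs rather than per-segment averages is what makes the merge correct when segments differ in size.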
Trusted technology with an awesome community
● 1,900+ Companies using Druid
● 150% YoY Increase in Community Activity
● 14,000+ Community Members
● 600+ Active Contributors
The data teams at leading organizations choose Druid
Advertising, Entertainment, Industrial, Financial, Gaming, Technology/Platform, Technology/SaaS, Security
And many more!
Case Studies
Reddit Ad Serving Pipeline
● 10+ GB/hr
● 3X faster queries
● 99.9% availability
[Diagram: Ad Serving, Event Collector, METADATA, USER ACTIONS]
Lyft Analytics Pipeline
● 400+ queries / minute
● 10X faster queries
● <500ms latency on 99% of queries
[Diagram: Phone, Service, Amazon Kinesis, JSON events, Batch processing, Viz tools, User]
Summary: Real-time use cases for Kafka-Flink-Druid
On high throughput Kafka streams:
● ALERTING: Stateful or stateless triggered actions
● MONITORING: Continuous tracking of KPIs
● DASHBOARDS: User-facing operational visibility
● EXPLORATION: Ad-hoc rapid data exploration
● DECISIONING: API-triggered automated workflows (if X, then Y)
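The difference between a stateless and a stateful trigger can be shown with a short Python sketch over sensor readings. The threshold, the "two consecutive readings" rule, and the function names are assumptions picked for the example.

```python
readings = [
    {"sensor_id": "SensorC", "temperature": 65.5},
    {"sensor_id": "SensorC", "temperature": 66.0},
    {"sensor_id": "SensorA", "temperature": 22.5},
    {"sensor_id": "SensorC", "temperature": 30.0},
    {"sensor_id": "SensorC", "temperature": 70.0},
]


def stateless_alerts(readings, threshold):
    """Stateless rule: fire on every reading above the threshold."""
    return [r for r in readings if r["temperature"] > threshold]


def stateful_alerts(readings, threshold, consecutive=2):
    """Stateful rule: fire once a sensor exceeds the threshold N times in a row."""
    streak = {}  # per-sensor count of consecutive readings above threshold
    alerts = []
    for r in readings:
        sensor = r["sensor_id"]
        if r["temperature"] > threshold:
            streak[sensor] = streak.get(sensor, 0) + 1
        else:
            streak[sensor] = 0
        if streak[sensor] == consecutive:
            alerts.append(r)
    return alerts


simple = stateless_alerts(readings, threshold=60.0)
sustained = stateful_alerts(readings, threshold=60.0, consecutive=2)
```

The stateful version has to remember something per sensor between events, which is the kind of state a stream processor keeps and includes in its snapshots.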
What makes Kafka-Flink-Druid so popular?
● Stream-Native: All 3 are designed natively for streaming data, supporting exactly-once semantics and event-by-event processing.
● Massively Scalable: All 3 can handle massive event throughput into the millions of events per second across delivery, processing, and analytics.
● Fault Tolerant: All 3 work in tandem for mission-critical use cases with guaranteed consistency, data and node replication, and data durability.
● Complementary: All 3 provide distinct capabilities that together serve the full breadth of real-time application use cases.
Cloud Services for KFD
"When used in combination, Apache Flink & Apache Kafka can enable data reusability and avoid redundant
downstream processing. The delivery of Flink & Kafka as fully managed services delivers stream processing
without the complexities of infrastructure management, enabling teams to focus on building real-time streaming
applications & pipelines that differentiate the business."
Confluent Cloud: Unified platform for Kafka and Flink seamlessly integrated
Enterprise-grade security
Secure stream processing with built-in identity and access management, RBAC, and audit logs
Stream governance
Enforce data policies and avoid metadata duplication leveraging native integration with Stream Governance
Monitoring
Ensure the health and uptime of your Flink queries in the Confluent UI or via 3rd party monitoring services
Connectors
Imply Polaris
The Cloud Database Service for Apache Druid
● Most Affordable
● Most Secure
● Best Time to Value
And for OS Druid Users
Thank you

 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
The Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfThe Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdf
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
 

A Trifecta of Real-Time Applications: Apache Kafka, Flink, and Druid

  • 1. The Trifecta of Real-Time Applications: Apache Kafka, Flink, and Druid. ©2022, Imply
  • 2. Speakers
      ● Gian Merlino: Co-founder and CTO of Imply, PMC Chair of Apache Druid
      ● Kai Wähner: Field CTO of Confluent
  • 3. Agenda
      1. Overview of real-time applications
      2. Kafka-Flink-Druid data architecture
      3. Introduction to Apache Flink + use cases
      4. Introduction to Apache Druid + use cases
      5. Kafka-Flink-Druid case studies
      6. Confluent and Imply integration
      7. Q&A
  • 4. Data applications are going real-time
  • 5. Real-time application
      ● >5 million events per second
      ● Interactive queries
      ● Customer-facing
      ● 250 queries per second
      ● Complex analytics
      ● Operational visibility
  • 6. Real-time application
      ● Real-time monitoring
      ● Real-time alerting
      ● Real-time decisioning
      ● Interactive queries
      ● High dimensionality
  • 7. Real-time application
      ● Real-time decisioning
      ● API-driven interactive queries
      ● Customer-facing
      ● 25 million users
      ● <100 ms query latency
      ● 100+ QPS
  • 8. Building real-time apps with batch workflows doesn’t work
      1. Data Collection (waiting…)
      2. Data Processing (waiting…)
      3. Data Ingestion (waiting…)
      4. Data Analysis (waiting…)
      5. Data Presentation (waiting…)
      Latency measured in hours to days
  • 9. Open source real-time data architecture
      [Diagram labels: Data Sources; Events/Messages; Streaming Pipeline; Stream Processing; Real-Time Analytics; Monitor/Alerting; Analytics; Visualization/Decisioning; Real-Time Applications; Historical Data / Context + Enrichment / Transformation]
  • 10. A common real-time KFD pattern
      Source:
        timestamp            sensor_id  temperature
        2023-07-10T10:00:00  SensorA    22.5
        2023-07-10T10:01:00  SensorB    18.2
        2023-07-10T10:02:00  SensorC    65.5

        timestamp            sensor_id  location
        2023-07-10T10:00:00  SensorA    Room 101
        2023-07-10T10:01:00  SensorB    Room 101
        2023-07-10T10:02:00  SensorC    Room 101
      Joined on sensor_id:
        timestamp            sensor_id  location  temperature
        2023-07-10T10:00:00  SensorA    Room 101  22.5
        2023-07-10T10:01:00  SensorB    Room 101  18.2
        2023-07-10T10:02:00  SensorC    Room 101  65.5
      Current: 65.5
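
The pattern on slide 10 can be sketched in a few lines of plain Python: each temperature reading is joined with a location lookup keyed by `sensor_id`, one event at a time. This is only an illustration of the enrichment step, not Flink or Druid code. The class and function names (`Reading`, `Enriched`, `enrich`) are hypothetical, and the inner-join behaviour for unknown sensors is an assumption the slide does not state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reading:
    timestamp: str
    sensor_id: str
    temperature: float


@dataclass(frozen=True)
class Enriched:
    timestamp: str
    sensor_id: str
    location: str
    temperature: float


# Reference data keyed by sensor_id (values taken from slide 10).
LOCATIONS = {"SensorA": "Room 101", "SensorB": "Room 101", "SensorC": "Room 101"}

# Incoming events (values taken from slide 10).
READINGS = [
    Reading("2023-07-10T10:00:00", "SensorA", 22.5),
    Reading("2023-07-10T10:01:00", "SensorB", 18.2),
    Reading("2023-07-10T10:02:00", "SensorC", 65.5),
]


def enrich(readings, locations):
    """Join each reading with its location, yielding one result per event."""
    for reading in readings:
        location = locations.get(reading.sensor_id)
        if location is None:
            # Assumption: inner-join semantics, so unknown sensors are dropped.
            continue
        yield Enriched(reading.timestamp, reading.sensor_id, location, reading.temperature)


ENRICHED = list(enrich(READINGS, LOCATIONS))

# The analytics layer would then answer questions such as "what is the
# current temperature?"; here that is simply the most recent enriched event.
CURRENT = ENRICHED[-1].temperature
```

In the deck's architecture, the join would run in the stream processor and the "current" lookup would be a query against the analytics database; the sketch collapses both into one process to show the data flow.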
  • 12. Flink growth has mirrored the growth of Kafka, the de facto standard for streaming data
      ● >75% of the Fortune 500 estimated to be using Kafka
      ● >100,000 orgs using Kafka
      ● >41,000 Kafka meetup attendees
      ● >750 Kafka Improvement Proposals
      ● >12,000 Jiras for Apache Kafka
      [Chart: "Two Apache Projects, Born a Few Years Apart". Monthly unique users (0 to 150,000) for Flink and Kafka; year axes 2020-2022 and 2016-2018.]
  • 13. Developers choose Flink because of its performance and rich feature set
      Flink is a top 5 Apache project and boasts a robust developer community
      ● Scalability and Performance: Flink is capable of supporting stream processing workloads at tremendous scale
      ● Fault Tolerance: Flink's fault tolerance mechanisms ensure it can handle failures effectively and provide high availability
      ● Unified Processing: Flink supports stream processing, batch processing, and ad-hoc analytics through one technology
      ● Language Flexibility: Flink supports Java, Python, & SQL, enabling developers to work in their language of choice
  • 14. Flink supports unified stream and batch processing
      Stream processing:
      ● Entire pipeline must always be running
      ● Input must be processed as it arrives
      ● Results are reported as they become ready
      ● Failure recovery resumes from a recent snapshot
      ● Flink guarantees effectively exactly-once results despite out-of-order data and restarts due to failures, etc.
      Batch processing:
      ● Execution proceeds in stages, running as needed
      ● Input may be pre-sorted by time and key
      ● Results are reported at the end of the job
      ● Failure recovery does a reset and full restart
      ● Effectively exactly-once guarantees are more straightforward
  • 15. Effortlessly filter, join, and enrich your data streams with Apache Flink
      ● Real-time processing: Power low-latency applications and pipelines that react to real-time events and provide timely insights
      ● Data reusability: Share consistent and reusable data streams widely with downstream applications and systems
      ● Data enrichment: Curate, filter, and augment data on-the-fly with additional context to improve completeness, accuracy, & compliance
      ● Efficiency: Improve resource utilization and cost-effectiveness by avoiding redundant processing across silos
      “With Confluent’s fully managed Flink offering, we can access, aggregate, and enrich data from IoT sensors, smart cameras, and Wi-Fi analytics, to swiftly take action on potential threats in real time, such as intrusion detection. This enables us to process sensor data as soon as the events occur, allowing for faster detection and response to security incidents without any added operational burden.”
  • 16. Process data streams in-flight to maximize actionability, fidelity, and portability
      [Diagram labels: Blob storage; 3rd party app; Databases; Data Warehouse; Database; SaaS app]
      ● Low latency apps and data pipelines
      ● Consistent, reusable data products
      ● Optimized resource utilization
  • 17. Enrich real-time data streams with Generative AI directly from Flink SQL
      INSERT INTO enriched_reviews
      SELECT id, review, invoke_openai(prompt, review) as score
      FROM product_reviews;
      The Prompt: “Score the following text on a scale of 1 and 5 where 1 is negative and 5 is positive returning only the number”
      Input reviews (DATA STREAMING PLATFORM):
      ● Kate, 4 hours ago: This was the worst decision ever.
      ● Nikola, 1 day ago: Not bad. Could have been cheaper.
      ● Brian, 3 days ago: Amazing! Game Changer!
      Output: the same reviews, each shown with a star rating.
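
Slide 14's contrast between stream and batch execution can be illustrated with a minimal sketch in plain Python (not the Flink API). The same "maximum temperature per sensor" computation is run twice: once batch-style, reporting only at the end of the job, and once streaming-style, emitting an updated result as soon as an event changes it. The function names and the sample events are hypothetical.

```python
# Sample keyed events: (sensor_id, temperature). Values are illustrative only.
EVENTS = [("SensorA", 22.5), ("SensorB", 18.2), ("SensorA", 23.0), ("SensorC", 65.5)]


def batch_max(events):
    """Batch style: consume all input, then report once at the end of the job."""
    result = {}
    for key, value in events:
        result[key] = max(value, result.get(key, value))
    return result


def streaming_max(events):
    """Streaming style: keep per-key state and emit every change immediately."""
    state = {}
    for key, value in events:
        if key not in state or value > state[key]:
            state[key] = value
            # The updated result is reported as soon as it is ready.
            yield key, value


BATCH_RESULT = batch_max(EVENTS)
STREAM_UPDATES = list(streaming_max(EVENTS))
```

Both runs converge on the same final answer; the difference is when results become visible. In the streaming version, `state` is also what a snapshot-based recovery would need to restore, whereas the batch version can simply be rerun from the start.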
  • 19. Apache Druid: Druid is built for the intersection of analytics and applications.
  • 20. Apache Druid is a real-time analytics database
      For analytics applications that require:
      1. Sub-second queries at massive scale: interactive analytics on TB-PBs of data
      2. High concurrency at low cost: 1000s QPS via highly efficient distributed query engine
      3. Real-time and historical insights: true stream ingestion for Kafka and Kinesis
      Plus, non-stop reliability with automated fault tolerance and continuous backup
      (Example of a Druid-powered UX)
  • 21. Druid’s stream ingestion scales right with Kafka
      [Diagram labels: Producer (several); Topic with Partition 0, Partition 1, Partition 2, … Partition N; Real-Time Ingestion with Ingestion Task 0, Ingestion Task 1, … Ingestion Task M; Historical (several); Query]
  • 22. Apache Druid is a database built for data in motion
      ● Event-based ingestion
      ● High EPS scalability
      ● Schema auto-detection
      ● No connector needed for Druid
      ● Real-time insights with flexible historical queries
      ● Guaranteed consistency
      ● Continuous backup
  • 23. A high-performance, real-time analytics database
      ● Operational visibility at scale: For use cases that require real-time insights on big, fast-moving event streams like observability, product analytics, clickstream, IoT, and fraud detection
      ● External-facing analytics: For use cases where analytics are being delivered to external stakeholders as a product or as a value add with strict SLAs for performance under load and resiliency
      ● Rapid data exploration: For use cases that require instant response on interactive, ad-hoc queries at scale on high-dimensional data such as root cause diagnostics, ML training, and investigation
      ● Real-time decisioning: For use cases where instant query response powers automated rules engines and ML frameworks, including real-time decisions and recommendations
      Industries: Supply chain, Logistics, Healthtech, Adtech, Fintech, Gaming, Entertainment, Retail, eCommerce
  • 24. Just a few examples of the 1000s of Druid users
  • 25. Selecting the right tool for the job: Apache Druid complements your analytics stack
      Apache Druid (Real-Time Analytics Database)
        Ideal workload:
        ● High QPS application/API usage
        ● Strict latency requirements
        ● Less elastic
        Technical properties:
        ● Scatter-gather and shuffling engines
        ● Leverage broad array of indexes
        ● Guaranteed data caching
      Snowflake/BigQuery/Trino (Cloud Data Warehouses + Query Engines)
        Ideal workload:
        ● Ad-hoc, low concurrency usage
        ● Loose latency requirements
        ● Highly elastic
        Technical properties:
        ● Shuffling engines
        ● Focus on sequential data access
        ● Opportunistic data caching
  • 26. Trusted technology with an awesome community Companies using Druid Active Contributors YoY Increase in Community Activity Community Members 1,900+ 150% 14,000+ 600+
  • 27. The data teams at leading organizations choose Druid
      Industries: Advertising, Entertainment, Industrial, Financial, Gaming, Technology/Platform, Technology/SaaS, Security. And many more!
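
Slide 21's picture of M ingestion tasks reading N Kafka partitions can be made concrete with a small sketch. It spreads partitions across tasks with a modulo rule, so adding partitions or tasks rebalances the work. The modulo rule and the function name `assign_partitions` are assumptions for illustration; the slide does not say how Druid maps partitions to tasks.

```python
def assign_partitions(num_partitions, num_tasks):
    """Spread partitions 0..N-1 across tasks 0..M-1 (assumed modulo rule)."""
    if num_partitions < 0 or num_tasks <= 0:
        raise ValueError("need num_partitions >= 0 and num_tasks >= 1")
    assignment = {task: [] for task in range(num_tasks)}
    for partition in range(num_partitions):
        # Hypothetical rule: partition p is read by task p mod M.
        assignment[partition % num_tasks].append(partition)
    return assignment


# Six partitions read by three ingestion tasks: two partitions per task.
ASSIGNMENT = assign_partitions(num_partitions=6, num_tasks=3)
```

With more tasks than partitions, the extra tasks receive no partitions under this rule, which is why the useful degree of ingestion parallelism is bounded by the topic's partition count.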
  • 29. Reddit Ad Serving Pipeline
      ● 10+ GB/hr
      ● 3X faster queries
      ● 99.9% availability
      [Diagram labels: Ad Serving; Event Collector; METADATA; USER ACTIONS]
  • 30. Lyft Analytics Pipeline
      ● 400+ queries / minute
      ● 10X faster queries
      ● <500ms latency on 99% of queries
      [Diagram labels: Phone; Service; Amazon Kinesis; JSON events; Batch processing; Viz tools; User]
  • 31. Summary: Real-time use cases for Kafka-Flink-Druid
      ● ALERTING: Stateful or stateless triggered actions
      ● MONITORING: Continuous tracking of KPIs
      ● DASHBOARDS: User-facing operational visibility
      ● EXPLORATION: Ad-hoc rapid data exploration on high-throughput Kafka streams
      ● DECISIONING: API-triggered automated workflows (if X, then Y)
  • 32. What makes Kafka-Flink-Druid so popular?
      ● Stream-Native: All 3 are designed natively for streaming data, supporting exactly-once semantics and event-by-event processing.
      ● Massively Scalable: All 3 can handle massive event throughput into the millions of events per second across delivery, processing, and analytics.
      ● Fault Tolerant: All 3 work in tandem for mission-critical use cases with guaranteed consistency, data and node replication, and data durability.
      ● Complementary: All 3 provide distinct capabilities that together serve the full breadth of real-time application use cases.
  • 34. Confluent Cloud: Unified platform for Kafka and Flink seamlessly integrated
      "When used in combination, Apache Flink & Apache Kafka can enable data reusability and avoid redundant downstream processing. The delivery of Flink & Kafka as fully managed services delivers stream processing without the complexities of infrastructure management, enabling teams to focus on building real-time streaming applications & pipelines that differentiate the business."
      ● Enterprise-grade security: Secure stream processing with built-in identity and access management, RBAC, and audit logs
      ● Stream governance: Enforce data policies and avoid metadata duplication leveraging native integration with Stream Governance
      ● Monitoring: Ensure the health and uptime of your Flink queries in the Confluent UI or via 3rd party monitoring services
      ● Connectors
  • 35. Imply Polaris: The Cloud Database Service for Apache Druid
      ● Most Affordable
      ● Most Secure
      ● Best Time to Value
      And for OS Druid Users