The document discusses how modern applications require real-time connectivity and instant reactions using data streams, as opposed to traditional batch processing with databases. It explains how Apache Kafka and stream processing with ksqlDB can act as the central nervous system to instantly connect data sources and sinks in real-time. The document also describes how Confluent Cloud provides a fully managed service for Apache Kafka deployments in public clouds.
Confluent x Imply: Build the last mile to value for data streaming applications
1. Build the Last Mile to Value for Data Streaming Applications
Jag Dhillon, Pre-Sales Lead, Imply
Guru Sattanathan, Senior Solutions Engineer, Confluent
2. DEMO - Real-time Fleet Monitoring
Diagram labels: Status, Current Location, Drivers, Comms (Call the Driver), Fleets, Telemetry, End User Application, Arrival ETA, Location Events, Hazards
6. Implication: Brittle, Complex Interconnections
Diagram labels: Line of Business 01, Line of Business 02, Public Cloud
Copyright 2021, Confluent, Inc. All rights reserved. This document may not be reproduced in any manner without the express written permission of Confluent, Inc.
7. Paradigm for Data in Motion: Event Streams
Rich front-end customer experiences
Real-time events
Real-time Event Streams: a sale, a shipment, a trade, a customer experience
Real-time backend operations
9. Implication: Central Nervous System for Enterprise
Kafka
Centralize an immutable stream of facts. Decentralize the freedom to act, adapt, and change.
10. Instantly Connect Your Data Sources & Sinks
20+
Partner Supported (self-managed)
Data Diode
Confluent Supported (self-managed)
90+
Growing list of fully managed connectors in Cloud
Amazon S3, Blob storage
30+
Kinesis, Redshift, Event Hubs, Data Lake Gen 2, Cloud Dataproc
11. Process Data in Real-time
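-- Declare a stream over the existing Kafka topic of payment events
CREATE STREAM payments (user VARCHAR, amount INT)
  WITH (kafka_topic = 'all_payments', value_format = 'avro');
CREDIT SERVICE (ksqlDB)
-- Continuously maintain one credit score per user as payments arrive
-- (updateScore is not a ksqlDB built-in; presumably a user-defined aggregate function)
CREATE TABLE credit_scores AS
  SELECT user, updateScore(p.amount) AS credit_score
  FROM payments AS p
  GROUP BY user
  EMIT CHANGES;
RISK SERVICE (ksqlDB)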
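To illustrate what the CREATE TABLE statement above computes, here is a minimal plain-Python sketch (not ksqlDB or Kafka client code). It assumes updateScore behaves like a running total of payment amounts, which the slide does not specify. The names update_score and credit_scores and the sample events are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, Tuple

Payment = Tuple[str, int]  # (user, amount), mirroring the payments stream columns


def update_score(current: int, amount: int) -> int:
    """Hypothetical stand-in for updateScore: a running total of amounts."""
    return current + amount


def credit_scores(payments: Iterable[Payment]) -> Iterator[Tuple[str, int]]:
    """Yield (user, credit_score) each time a payment changes a user's row."""
    table: Dict[str, int] = defaultdict(int)  # materialized state, keyed by user
    for user, amount in payments:
        table[user] = update_score(table[user], amount)
        yield user, table[user]  # one change record per input event


# Made-up sample events, for illustration only
events = [("alice", 50), ("bob", 20), ("alice", 30)]
changes = list(credit_scores(events))
print(changes)  # [('alice', 50), ('bob', 20), ('alice', 80)]
```

Each input event produces one updated row, which is the idea behind GROUP BY user with EMIT CHANGES: the table holds the latest score per user and every change is pushed downstream.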
13. Deploy Anywhere
SELF-MANAGED SOFTWARE
Confluent Platform
The Enterprise Distribution of Apache Kafka
Deploy on-premises or in your private cloud
VM
FULLY MANAGED SERVICE
Confluent Cloud
Cloud-native service for Apache Kafka
Available on the leading public clouds
15. Confidential. Do not redistribute.
A complete data platform for real-time analytics
16. Our mission to enable widespread access to data analytics
History
● Founded in 2015
● Funded by Andreessen Horowitz, Geodesic Capital, & Khosla Ventures
Company
● Headquartered in CA
● Employees across Americas, EMEA, & APAC
Customer
● Global customer base across many verticals
● Amazon, Apple, Atlassian, Twitter, Walmart, and more
Technology
● Full stack, multi-cloud data platform
● Core engine: Apache Druid, a high-performance real-time analytics database
17. Our core engine was created for the world’s most difficult data problems
Druid used for other large/complex datasets
Netflix adopts Druid for many internal data apps
Apache Druid created
Druid used to power ad-tech data product
Druid widely adopted in high tech
Imply formed
Adoption increases
Airbnb, Lyft, Hulu, PayPal, etc. adopted
18. Today, our customers are diverse and global
Media/Ads, Communications, Retail, Financial Services/Fintech, Gaming, Networking, Technology, Security, and many more!
19. We power our customers’ mission-critical applications
Business intelligence, Operational intelligence
Supply chain, Risk/fraud, Digital media, Clickstreams, Network performance, Metrics & APM, IoT/timeseries, Cybersecurity
20. Architecture - before and after Imply
INGEST STORAGE PREPARE QUERY VISUALIZE
DATA IN MOTION
PREPARE QUERY VISUALIZE
Before
After
Result: Key Performance Indicator (KPI) values
Queries per day: 2.5 million
Ingestion data rate: 1-2 TB/day
Ingestion event rate: 20B events/day
Data retention: 1 year (rolled up)
Users: 2,100
Druid cluster size: 930 cores
ANALYTICS IN MOTION
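As a rough sanity check on the KPI values above, the daily totals can be converted to per-second averages. This sketch assumes load is spread evenly over the day and that 1 TB = 10^12 bytes. Neither assumption is stated on the slide, and real traffic is likely burstier.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400
BYTES_PER_TB = 10**12           # assumption: decimal terabytes

events_per_day = 20_000_000_000  # 20B events / day (from the slide)
queries_per_day = 2_500_000      # 2.5 million queries / day (from the slide)
ingest_tb_per_day = (1, 2)       # 1 - 2 TB / day (from the slide)

# Daily totals divided by seconds per day give average rates
events_per_second = events_per_day / SECONDS_PER_DAY
queries_per_second = queries_per_day / SECONDS_PER_DAY

# Low and high estimates of the average ingested event size
bytes_per_event = tuple(
    tb * BYTES_PER_TB / events_per_day for tb in ingest_tb_per_day
)

print(f"average events/s:  {events_per_second:,.0f}")
print(f"average queries/s: {queries_per_second:,.1f}")
print(f"bytes per event:   {bytes_per_event[0]:.0f} to {bytes_per_event[1]:.0f}")
```

Under these assumptions the cluster ingests roughly 231,000 events per second and serves about 29 queries per second on average, at 50 to 100 bytes per ingested event.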
21. How we work
Use Cases: Custom visualizations, BI tools, Dashboards & reports, Real-time analytics, ML/AI, Data apps
Platforms: On-Prem, Azure Blob Store, Google Cloud Storage, AWS S3
Connect to a batch or streaming source of raw data
Convert data to an Indexed Segmentᵀᴹ - a hyper-optimized format
Deliver consistent sub-second queries at scale
Sources
22. DEMO - Real-time Fleet Monitoring (diagram repeated from slide 2)