AWS Immersion Day
Mapfre - Confluent
Elena Molina
Partner Technical Trainer
Salvatore Alessandro
Enterprise Solutions Engineer
Introduction
09:30 - 09:45
01 Who are we?
What is Confluent Cloud?
Introduction to Cloud Features
09:45 - 10:30
02 Fully Managed Connectors, ksqlDB, Cluster Linking.
Break
10:30 - 10:45
03 15 Min - Coffee Break
Hybrid Workshop Introduction
10:45 - 11:00
04 Infrastructure, architecture and use case introduction.
Hybrid Workshop Hands On - Part 1
11:00 - 12:15
05
Hands-on: Build a bridge between on-prem and the cloud using
Cluster Linking.
Agenda
Break
12:15 - 12:30
06 15 Min - Coffee Break
Hybrid Workshop Hands On - Part 2
12:30 - 13:45
07
Hands-on: Create a streaming app using ksqlDB and build a bridge
between the cloud and on-prem using Cluster Linking.
Multi-regional Disaster Recovery on AWS
13:45 - 14:15
08
Multi-regional Disaster Recovery with Confluent Cluster Linking on AWS
Lunch and Networking
14:15 - 16:00
09
Confluent Mapfre Team
Much more than a platform
Enterprise
Account
Manager
Marcos Yanez
Enterprise
Solutions
Engineer
Salvo
Alessandro
PS &
Education
Gonzalo Garcia
Customer
Success
Manager
Asier
Fernández
Partner
Technical
Trainer
Elena Molina
01. Introduction
Loyalty Rewards
Curbside Pickup
Trending Now
Popular on Netflix
Top Picks for Joshua
Created by the founders of
Confluent while at LinkedIn
Apache Kafka has ushered in
the data streaming era…
>70%
of the Fortune 500
>100,000
Organizations
>41,000
Kafka Meetup Attendees
>200
Global Meetup Groups
>750
Kafka Improvement Proposals (KIPs)
>12,000
Jiras for Apache Kafka
>32,000
Stack Overflow Questions
Real-time Trades
Ride ETA
Personalized Recommendations
The need for a cloud-native, data streaming platform
Connecting all your apps, systems and data into a central nervous system
PUTTING KAFKA IN THE CLOUD…
ISN’T JUST PUTTING KAFKA IN THE CLOUD.
Managing
infrastructure
Development
Resources
Security &
Governance
Global Availability
Trying to get here on your own with Open Source Kafka has
significant challenges…
Why Confluent is the world’s most trusted data streaming
platform
Focus & Expertise
Only company focused on data in motion:
● Founded in 2014 by the Original Creators of
Apache Kafka
● Over 80% of Kafka commits are by
Confluent employees
● Advised on thousands of real-world Kafka
deployments across a wide range of
patterns & industries
Building and supporting a world class product:
● >9 million engineering hours spent
building our product
● We internally manage >15,000 clusters
(and counting) in Confluent Cloud
● Over 1 million cumulative hours of Kafka
expertise within Confluent support &
services
Execution at Scale
Confluent Cloud
COMPLETE EVERYWHERE
CLOUD-NATIVE
Confluent makes data streaming easy
Open Source
Real-time
Data
Integration
Stream
Processing
Enterprise
Security &
Governance
…100s more
features
Kora Engine
Multi-cloud SaaS & Private Cloud
Open Source
Apache Kafka
Kafka completely
re-architected to
be Cloud-native
A Complete,
enterprise-grade
Data-in-Motion
Platform
Fully managed
service and software,
available Everywhere
Cloud-Native
Elastic, resilient
and performant,
powered by the
Kora Engine
Kora Architecture
NETWORK
COMPUTE
AZ AZ AZ
Cells
Cells
Cells
OBJECT
STORAGE
CUSTOMERS
Multi-Cloud Networking & Routing Tier
Metadata
Durability Audits
METRICS & OBSERVABILITY
CONNECT
PROCESSING
GOVERNANCE
Data Balancing
Health Checks
Real-time
feedback
data
Other Confluent Cloud Services
GLOBAL CONTROL PLANE
Kora: the Cloud-Native Engine for Apache Kafka
Serverless
● Elastic scaling up & down from 0
to GBps
● Auto capacity mgmt, load
balancing, and upgrades
Infinite Storage
● Store data cost-effectively at any
scale without growing compute
Resilience
● Multi-AZ and multi-region
replication
● Durability self-validation
High Availability
● 99.99% SLA
● Multi-region / AZ availability across
cloud providers
● Patches deployed in Confluent
Cloud before Apache Kafka
Network Flexibility
● Public, VPC, and Private Link
● Seamlessly link across clouds
and on-prem with Cluster
Linking
Confluent is so much more than Apache Kafka
Complete: Go beyond Kafka with all the essential tools for a complete data streaming platform
Enterprise-grade Security
RBAC | Audit Logs | Encryption |
BYOK | Private Networking
Stream Governance
Schema Registry & Validation |
Stream Lineage | Stream Catalog |
Stream Sharing
Complete Engagement Model
Data in Motion Blueprint
Business Case Justification
TCO | ROI | Risk
Management & Monitoring
Cloud UI | Metrics API |
Control Center | Health+
Flexible DevOps Automation
Admin REST APIs | Terraform APIs |
Confluent for K8s | Ansible Playbooks
Efficient
Operations at Scale
Production-stage
Prerequisites
Partnership for
Business Success
Multi-language Development
Non-Java Clients |
REST Proxy | MQTT Proxy
Stream Processing & Integration
Connectors | Flink | ksqlDB |
Stream Designer
Unrestricted Developer
Productivity
High Availability
99.99% SLA | Multi-AZ Clusters | Multi-Region Clusters
Infinite Storage
Infinite Storage | Tiered Storage
Elastic Scalability
Expand | Shrink | Self-Balancing Clusters
Cloud Native: The 10x Apache Kafka® service: elastic, resilient and performant, powered by the Kora Engine
Everywhere: Connect your data in real time with a platform that spans from on-prem to cloud and across clouds
Hybrid and Multicloud
Cluster Linking | Replicator
Self-managed software
Kubernetes | VMs | Bare Metal
Fully managed cloud service
AWS | Azure | GCP
Committer-driven
Expertise
Training Partners
Professional
Services
Enterprise
Support
OPERATOR
DEVELOPER ARCHITECT EXECUTIVE
Complete
Go above &
beyond Kafka
with all the
essential tools for
a complete data
streaming
platform
Process
Stream
Connect
Govern
Share
Secure
Everywhere
Connect your
data in real time
with a platform
that spans from
on-prem to cloud
and across clouds
Any Cloud. Any Region.
Everywhere You Want To Be
Build the right data architecture for your business
SELF-MANAGED SOFTWARE
Confluent Platform
The Enterprise Distribution of Apache Kafka
Deploy on-premises or in your private
cloud
VM
FULLY MANAGED CLOUD SERVICE
Confluent Cloud
Cloud-native Data Streaming Platform built by
the founders of Apache Kafka
Available on the leading public clouds
02. Introduction to Cloud Features
Fully Managed Connectors
Confluent Loves Your Existing Systems
… 200+
connectors
Other
Systems
Other
Systems
Kafka
Connect
Confluent
Kafka
Connect
https://www.confluent.io/hub/
Using fully managed connectors is the fastest, most
efficient way to break data silos
Custom-built connector
● Costly to allocate resources to
design, build, test, and maintain
non-differentiated data
integration components
● Delays time-to-value, taking
up to 3-6+ engineering months
to develop
● Perpetual management and
maintenance increases tech
debt and risk of downtime
● Pre-built but requires manual
installation / config efforts to
set-up and deploy connectors
● Perpetual management and
maintenance of connectors that
leads to ongoing tech debt
● Risk of downtime and business
disruption due to connector /
Connect cluster related issues
● Streamlined configurations and
on-demand provisioning of your
connectors
● Eliminates operational
overhead and management
complexity with seamless
scaling and load balancing
● Reduced risk of downtime with
Confluent Cloud’s 99.95% SLA for
all your mission critical use cases
Accelerated time-to-value • Increased developer productivity • Reduced operational burden
Self-managed connector Fully managed connector
Fully Managed
Connectors
Confluent Cloud’s portfolio of 70+
fully managed connectors
enables you to boost developer
productivity, eliminate
operational burden, and
accelerate time to value on your
data in motion journey.
Eliminate operational burdens of
self-managing connectors and
reduce total cost of ownership
Operate your business in real time
by modernizing your data systems
Accelerate your entire pipeline
development process with Stream
Designer, SMTs, and data preview
Only Confluent offers 70+ expert-built, fully managed
connectors across the entire stack
Connector configurations Connector configurations
Connector development Connector development Connector development
Connector testing Connector testing Connector testing
Connector updates Connector updates Connector updates
Connector support Connector support Connector support
Connect cluster scaling Connect cluster scaling Connect cluster scaling
Connect worker configs Connect worker configs Connect worker configs
Connect internal topics Connect internal topics Connect internal topics
Schema registry Schema registry Schema registry
Monitoring and security Monitoring and security Monitoring and security
Load balancing Load balancing Load balancing
Connect plugin installation Connect plugin installation Connect plugin installation
Other Kafka hosted
services
Apache Kafka -
Kafka Connect
Confluent Fully
Managed Connectors
Ease of use
You Manage
Provider Managed
Connectors
Connect
Workers
*Streamlined configurations
with ability for granular
controls if needed
Provider
managed
features
Connector configurations*
Connect your entire business with just a few clicks
70+
fully
managed
connectors
Amazon S3
Amazon Redshift
Amazon DynamoDB
Google Cloud
Spanner
AWS Lambda
Amazon SQS
Amazon Kinesis
Azure Service Bus
Azure Event Hubs
Azure Synapse
Analytics
Azure Blob
Storage
Azure Functions Azure Data Lake
Google
BigTable
200+
pre-built
connectors
Confluent HUB
Easily browse connectors by:
• Source vs Sinks
• Confluent vs Partner supported
• Commercial vs Free
• Available in Confluent Cloud
confluent.io/hub
Confluent Hub -
Connector Page
• Source or Sink ?
• Free or Commercial ?
• Supported by Confluent or partners
• Can download plugin
• Link to documentation
• License type
• Link to source code (if open source)
confluent.io/hub
Confluent Cloud
Fully Managed Connectors
Easily browse connectors by:
• Source vs Sinks
• Confluent vs Partner supported
• Commercial vs Free
• Available in Confluent Cloud
Custom connectors
Every organization has its own unique data
architecture, which requires additional flexibility
Home-grown systems
and custom applications
need custom-built
connectors to break
data silos
Pre-built connectors in
the Kafka ecosystem
need additional
modifications to fit your
specific work context
Lack of managed
connector options for the
long tail of less popular
data systems and apps
Custom Connectors
Break any data silo without
needing to manage Kafka
Connect infrastructure by
bringing your own connector
plugins to Confluent Cloud
Quickly connect to any data system
using your own Kafka Connect plugins
without code changes
Ensure high availability & performance
using logs and metrics to monitor the
health of your connectors and workers
Eliminate operational burden of
provisioning and perpetually managing
low-level connector infrastructure
Bring your own connectors and let Confluent
provision and manage connector infrastructure
Responsible for
connectors
Connector infrastructure resources
Connect workers Connect clusters Connect logs
Worker health
metrics
Connector plugins (BYO)
Custom-built
(original) connectors
Connector management
(i.e. upgrades, patching, support)
Modified connectors
(i.e. custom SMTs)
Partner-built
connectors
Community-built
connectors
Users
Confluent Cloud
Responsible for
Connect infrastructure
Infrastructure management & support
ksqlDB
What is Stream Processing?
Connect to a driver
immediately after
rideshare request
Instant notification
upon package
delivery
Real-time notification
once a new patch is
available
Automatic alert once
fraudulent activity has
been detected
Transportation Retail Technology Banking
Customers demand immediacy in every aspect of their
lives through real-time applications
These applications require reacting to events that happen
in your business immediately
The same charge
was rapidly made
multiple times
A valuable
customer is
beginning the
checkout process
A new patch for
bug fixes was
released
Events occur everywhere across an organization
A rider searches
for an available
rideshare
It requires filtering, joining, and aggregating streams
of events into something more useful
Customer Data
Transaction Data
Payment
Data
Geo Location Data
Security
Services Data
Stream data using Confluent’s connector portfolio
and easily build real-time data pipelines with ksqlDB
Modern, cloud-based data systems
Legacy data systems
Oracle
Database
ksqlDB
Mainframes
Applications
Cloud-native / SaaS apps
Azure Synapse
Analytics
Expensive,
custom-built
integrations
Expensive,
custom-built
integrations
Expensive,
custom-built
integrations
Source
Connectors
Expensive,
custom-built
integrations
Expensive,
custom-built
integrations
Sink
Connectors
What is ksqlDB?
Why ksqlDB?
DB
CONNECTOR
CONNECTOR
APP
APP
DB
STREAM
PROCESSING
CONNECTOR APP
DB
2
3
4
1
With all these moving parts, stream processing
architectures can become quite complex
ksqlDB simplifies the underlying architecture to make it easier to build
applications
DB
APP
APP
DB
PULL
PUSH
CONNECTORS
STREAM PROCESSING
MATERIALIZED VIEWS
ksqlDB
1
2
APP
Compute Storage
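CREATE TABLE activePromotions AS
  SELECT rideId,
         -- qualifyPromotion is assumed to be a user-defined scalar function;
         -- a GROUP BY query needs an aggregate, so keep the latest result per ride.
         LATEST_BY_OFFSET(qualifyPromotion(distanceToDst)) AS promotion
  FROM locations
  GROUP BY rideId
  EMIT CHANGES;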
ksqlDB Kafka
Build a complete real-time application
with just a few SQL statements
Easily Build Real-
Time Applications
ksqlDB is a streaming database
purpose-built for developing real-
time applications that leverage
stream processing, enabling you to
build a complete real-time
application with just a few SQL
statements
Aggregate Joins
Filters
User-Defined
Functions
Push & Pull
Query Support
Embedded
Connectors
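The difference between push and pull queries over a materialized view can be sketched in plain Python. This is only a toy model under stated assumptions, not how ksqlDB is implemented: the `MaterializedView` class and its `subscribe`, `apply`, and `lookup` methods are hypothetical names introduced for illustration.

```python
class MaterializedView:
    """Toy table that keeps the latest value per key and notifies subscribers."""

    def __init__(self):
        self._state = {}        # current value for each key
        self._subscribers = []  # callbacks registered by push queries

    def subscribe(self, callback):
        """Push query: the callback receives every future change."""
        self._subscribers.append(callback)

    def apply(self, key, value):
        """Apply one incoming event and push the change to all subscribers."""
        self._state[key] = value
        for callback in self._subscribers:
            callback(key, value)

    def lookup(self, key):
        """Pull query: return the current value for a key, then finish."""
        return self._state.get(key)
```

A push query corresponds to `subscribe`, which keeps delivering changes as they happen, while a pull query corresponds to a single `lookup` against the current state.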
Filters
CREATE STREAM high_readings AS
  SELECT sensor,
         reading
  FROM readings
  WHERE reading > 41
  EMIT CHANGES;
Joins
CREATE STREAM enriched_readings AS
  SELECT reading, area, brand_name
  FROM readings
  INNER JOIN brands b
  ON b.sensor = readings.sensor
  EMIT CHANGES;
Aggregate
CREATE TABLE avg_readings AS
  SELECT sensor,
         AVG(reading) AS avg_reading
  FROM readings
  GROUP BY sensor
  EMIT CHANGES;
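To make the semantics of the three statements above concrete, here is a minimal pure-Python sketch of a filter, an inner join against a lookup table, and a running aggregate. It is an illustration only and does not use ksqlDB or Kafka; the sample events, the `brands` contents, and the function bodies are assumptions, with function names borrowed from the stream and table names on the slide.

```python
from collections import defaultdict

# Hypothetical sample events standing in for the `readings` stream.
readings = [
    {"sensor": "s1", "reading": 45},
    {"sensor": "s2", "reading": 30},
    {"sensor": "s1", "reading": 55},
]

# Hypothetical lookup data standing in for the `brands` table, keyed by sensor.
brands = {
    "s1": {"area": "north", "brand_name": "Acme"},
    "s2": {"area": "south", "brand_name": "Globex"},
}


def high_readings(events, threshold=41):
    """Filter: keep only events whose reading exceeds the threshold."""
    for event in events:
        if event["reading"] > threshold:
            yield event


def enriched_readings(events, table):
    """Inner join: attach table columns to each event that has a matching key."""
    for event in events:
        row = table.get(event["sensor"])
        if row is not None:
            yield {"reading": event["reading"], **row}


def avg_readings(events):
    """Aggregate: emit the updated per-sensor average after every event."""
    totals = defaultdict(lambda: [0, 0])  # sensor -> [sum, count]
    for event in events:
        acc = totals[event["sensor"]]
        acc[0] += event["reading"]
        acc[1] += 1
        yield event["sensor"], acc[0] / acc[1]
```

Each function is a generator, so results are produced one event at a time, loosely mirroring how `EMIT CHANGES` keeps emitting updated rows as new events arrive.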
Cluster Linking
Building a seamless bridge between on-prem and cloud
has become mission-critical...
On-prem
Datacenters &
Private Clouds
Public Clouds
across AWS, Azure,
& GCP
As companies adopt
hybrid architectures,
building a bridge
between on-prem and
cloud environments
has become critical to
reliably connecting
and sharing data
across the entire
business
Cloud
Provider 1
Cloud
Provider 2
● More brittle
interconnections to
individually set up and
manage
● Complex new cloud
networking and security
considerations
● New compliance and data
sovereignty challenges
On-premises
...but hybrid cloud today is challenging
Companies need a simpler way to link their hybrid
environments and share data in real-time
Cloud
Provider 1
Cloud
Provider 2
On-premises
Hybrid architectures
require a highly
available, consistent,
and secure real-time
bridge between on-
prem and cloud
environments to provide
teams with real-time,
self-service access to
data wherever it resides
Accelerate the enterprise journey to cloud with Cluster
Linking
Cluster Linking accelerates the
enterprise journey to cloud by
securely, reliably, and effortlessly
creating a real-time bridge between
cloud and on-prem environments
• Bridge to cloud: Seamlessly build
hybrid & multi-cloud architectures
that are secure and reliable
without additional systems to manage
• Cluster migrations: Facilitate
smooth migrations with no data
loss and minimal downtime
• Source-initiated links: Securely
share data between Confluent
Platform and Confluent Cloud
without opening on-prem firewalls
to the cloud
Cluster Linking enables self-service
access to data in real-time across the
business with globally connected
clusters that provide perfectly and
reliably mirrored data
• Self-service access: Provide
access to real-time data across
different regions, clouds,
environments, and teams
• Offset preservation: Keep your
data in sync with native offset
preservation without additional
tooling
• Ease of use: Create a link from one
cluster to another with a single
command or API call
ETL/Batch
process
Data
request
Data
warehouses
Data
Stewards
DBs
SaaS
DB
Apps
Data consumer
Data consumer
Data consumer
Data consumer
Slow
Provide self-service access to real-time data across
all environments
Real-time
Self-
service
Fast
...
...
...
Cluster Linking reduces TCO and
operational burdens with seamless
and cost-effective data replication
across Kafka clusters everywhere they
need to reside
• Decreased costs: Eliminate the
need to manage and maintain
additional infrastructure required
by Connect-based solutions (e.g.,
MM2)
• Simplified Architecture: Remove
architectural complexity and
technical debt
• Reduced manual processes:
Minimize operational burdens and
risks with a single, globally-
consistent connection that’s easier
to monitor and secure
From: Connect-based solutions like MirrorMaker 2
To: Built-in geo-replication with Cluster Linking
Reduce TCO and operational burdens for data replication
across Kafka clusters
Complex architecture
Offsets are not preserved
Cluster 2
Cluster 1
MirrorMaker 2
MirrorMaker 2
Simplified architecture
Globally consistent offsets
Cluster 2
Cluster 1
Effortlessly mirror from one cluster to another
Mirror Topics
byte-for-byte copy
same offsets
same configs
same name
read-only
Source
Topics
Consumer Group
Offsets
Filter by name
ACLs
Filter by name
Destination
Mirror Topics
Consumer Group
Offsets
(optional)
ACLs
(optional)
Start in Minutes
one CLI or API call
to link or mirror
Reliable and
Performant
using Kafka’s inter-
broker protocol
AK 2.4+ CP 5.4+ and CC CP 7.0+ and CC
Cluster Link
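Why preserved offsets matter can be shown with a small Python model. This is a sketch under stated assumptions, not Cluster Linking or MirrorMaker 2 code: the partition contents and the functions `mirror_copy`, `reproduce_copy`, and `resume` are hypothetical, and the second function only models the general consume-and-produce approach in which the destination assigns fresh offsets.

```python
# Hypothetical source partition as (offset, value) pairs. Offsets start at 100
# because older records are assumed to have been removed by retention.
source_partition = [(100, "a"), (101, "b"), (102, "c"), (103, "d")]


def mirror_copy(partition):
    """Model a mirror topic: records keep the offsets they had at the source."""
    return list(partition)


def reproduce_copy(partition):
    """Model consume-and-produce replication: the destination assigns new offsets."""
    return [(new_offset, value) for new_offset, (_, value) in enumerate(partition)]


def resume(partition, committed_offset):
    """Return the values a consumer reads when it resumes at committed_offset."""
    return [value for offset, value in partition if offset >= committed_offset]
```

In this toy model, a consumer that committed offset 102 on the source resumes at the same record on the mirrored copy, while on the re-produced copy the same offset no longer points at that record, so some form of offset translation would be needed.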
What strategies benefit from multiple clusters?
Hybrid Cloud & Multi-
Cloud
Stream data between your datacenter
and cloud. Each cluster serves clients in
its environment.
Team Sandboxing
Give each team its own cluster to
separate concerns. Every team controls
its own destiny.
Disaster Recovery
Create a failover cluster in a separate
location that is ready to go when
disaster strikes.
Edge Computing
Put clusters at the edge in order to
minimize latency, lower network cost,
or buffer data when network
connection is unreliable.
Global Kafka
Footprint
Stream events between continents for
a global data strategy.
Cluster Migration
To modernize a cluster, or to
move to cloud.
03. 15 Min Break
04. Hybrid Workshop Introduction
Why this workshop?
What are we trying to achieve?
Business Use Case & Technical Architecture
Stream Processing
Joins streams in real-time
to control the stock
Stream of shipments
that arrive
Stream of purchases from online and/or
physical stores (Example RDBMS or
Mainframe)
Real-time
Inventory
KSQL
High Level Business Use Case: Real-time Retail
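The retail use case above, where a stream of shipments and a stream of purchases are combined into a real-time inventory, can be sketched in a few lines of Python. This is a simplified illustration and not the workshop's ksqlDB code: the event layout, the item names, and the `inventory_updates` function are assumptions made for the example.

```python
from collections import defaultdict


def inventory_updates(events):
    """Yield (item, stock) after each event: shipments add stock, purchases remove it."""
    stock = defaultdict(int)
    for kind, item, quantity in events:
        if kind == "shipment":
            stock[item] += quantity
        elif kind == "purchase":
            stock[item] -= quantity
        else:
            raise ValueError(f"unknown event kind: {kind}")
        yield item, stock[item]


# Hypothetical interleaved events from the two streams, in arrival order.
events = [
    ("shipment", "umbrella", 10),
    ("purchase", "umbrella", 3),
    ("shipment", "boots", 5),
    ("purchase", "umbrella", 2),
]
```

Iterating over `inventory_updates(events)` gives the stock level after every event, which is the kind of continuously updated table the stream processing step is meant to maintain.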
Private Cloud
Deploy on-premises in your
datacenter with Confluent
Enterprise
Public Cloud
Migrate to or adopt cloud
at your own pace with fully-
managed Confluent Cloud
Hybrid Cloud
Build a persistent bridge between
datacenter and cloud with
Confluent Cluster Linking
Private Cloud
Industry’s Only Solution for Hybrid Kafka
Workshop Architecture
Your Personalized Environment
On Premise x N
dc1
Confluent Cloud
dc2
dc3 dc4
HQ
Aggregate
Cluster
Workshop Architecture
One shared Confluent Cloud
Dedicated Cluster
One shared Confluent Platform
on-prem (name: HQ,
headquarters)
N x On-prem Confluent
Platform data center (name:
dcN)
Each attendee has his or her
own environment.
Let’s Begin
Please select one of the environments that have been
shared with you
Password:
Feb9pass123!
05. Hybrid Workshop Hands On - Part 1
06. 15 Min Break
07. Hybrid Workshop Hands On - Part 2
Get started for FREE
No credit card needed
$400 free credits
confluent.io/get-started
AWS Immersion Day Mapfre   -   Confluent

AWS Immersion Day Mapfre - Confluent

  • 1.
    AWS Immersion Day Mapfre- Confluent Elena Molina Partner Technical Trainer Salvatore Alessandro Enterprise Solutions Engineer
  • 2.
    Introduction 09:30 - 09:45 01Who are we? What is Confluent cloud? Introduction to Cloud Features 09:45 - 10:30 02 Fully Manage Connectors, KsqlDB, Cluster Linking. Break 10:30 - 10:45 03 15 Min - Coffee Break Hybrid Workshop Introduction 10:45 - 11:00 04 Infrastructure, architecture and use case introduction. Hybrid Workshop Hands On - Part 1 11:00 - 12:15 05 Hands-on: Build a bridge between on prem and the cloud using Cluster linking. 2 Agenda Break 12:15 - 12:30 06 15 Min - Coffee Break Hybrid Workshop Hands On - Part 2 12:30 - 13:45 07 Hands-on: Create a streaming app using KsqlDB and build a bridge between the cloud and on prem using Cluster Linking. Multi-regional Disaster Recovery on AWS 13:45 - 14:15 08 Multi-regional Disaster Recovery with Confluent Cluster Linking on AWS Lunch and Networking 14:15 - 16:00 09
  • 3.
    3 Confluent Mapfre Team Muchmore than a platform Enterprise Account Manager Marcos Yanez Enterprise Solutions Engineer Salvo Alessandro PS & Education Gonzalo Garcia Customer Success Manager Asier Fernández Partner Technical Trainer Elena Molina
  • 4.
  • 5.
    Loyalty Rewards Curbside Pickup TrendingNow Popular on Netflix Top Picks for Joshua Created by the founders of Confluent while at LinkedIn Apache Kafka has ushered in the data streaming era… >70% of the Fortune 500 >100,000+ Organizations >41,000 Kafka Meetup Attendees >200 Global Meetup Groups >750 Kafka Improvement Proposals (KIPs) >12,000 Jiras for Apache Kafka >32,000 Stack Overflow Questions Real-time Trades Ride ETA Personalized Recommendations
  • 9.
    The need fora cloud-native, data streaming platform Connecting all your apps, systems and data into a central nervous system
  • 10.
    PUTTING KAFKA INTHE CLOUD… ISN’T JUST PUTTING KAFKA IN THE CLOUD.
  • 11.
    Managing infrastructure Development Resources Security & Governance Global Availability Tryingto get here on your own with Open Source Kafka has significant challenges…
  • 12.
    Why Confluent isthe world’s most trusted data streaming platform Focus & Expertise Only company focused on data in motion: ● Founded in 2014 by the Original Creators of Apache Kafka ● Over 80% of Kafka commits are by Confluent employees ● Advised on thousands of real-world Kafka deployments across a wide range of patterns & industries Building and supporting a world class product: ● >9 million engineering hours spent building building our product ● We internally manage >15,000 clusters (and counting) in Confluent Cloud ● Over 1 million cumulative hours of Kafka expertise within Confluent support & services Execution at Scale
  • 13.
  • 14.
    Confluent makes datastreaming easy Open Source Real-time Data Integration Stream Processing Enterprise Security & Governance …100s more features Kora Engine Multi-cloud SaaS & Private Cloud Open Source Apache Kafka Kafka completely re- architected to be Cloud-native A Complete, enterprise-grade Data-in-Motion Platform Fully managed service and software, available Everywhere
  • 15.
    Cloud-Native Elastic, resilient and performant, poweredby the Kora Engine Kora Architecture NETWORK COMPUTE AZ AZ AZ Cells Cells Cells OBJECT STORAGE CUSTOMERS Multi-Cloud Networking & Routing Tier Metadata Durability Audits METRICS & OBSERVABILITY CONNECT PROCESSING GOVERNANCE Data Balancing Health Checks Real-time feedback data Other Confluent Cloud Services GLOBAL CONTROL PLANE
  • 16.
    Kora: the Cloud-NativeEngine for Apache Kafka Serverless ● Elastic scaling up & down from 0 to GBps ● Auto capacity mgmt, load balancing, and upgrades Infinite Storage ● Store data cost- effectively at any scale without growing compute Resilience ● Multi-AZ and multi-region replication ● Durability self-validation High Availability ● 99.99% SLA ● Multi-region / AZ availability across cloud providers ● Patches deployed in Confluent Cloud before Apache Kafka Network Flexibility ● Public, VPC, and Private Link ● Seamlessly link across clouds and on-prem with Cluster Linking
  • 17.
    Confluent is somuch more than Apache Kafka Complete: Go beyond Kafka with all the essential tools for a complete data streaming platform Enterprise-grade Security RBAC | Audit Logs | Encryption | BYOK | Private Networking Stream Governance Schema Registry & Validation | Stream Lineage | Stream Catalog | Stream Sharing Complete Engagement Model Data in Motion Blueprint Business Case Justification TCO | ROI | Risk Management & Monitoring Cloud UI | Metrics API | Control Center | Health+ Flexible DevOps Automation Admin REST APIs | Terraform APIs | Confluent for K8s | Ansible Playbooks Efficient Operations at Scale Production-stage Prerequisites Partnership for Business Success Multi-language Development Non-Java Clients | REST Proxy | MQTT Proxy Stream Processing & Integration Connectors | Flink | ksqlDB | Stream Designer Unrestricted Developer Productivity High Availability 99.99% SLA | Multi-AZ Clusters | Multi-Region Clusters Infinite Storage Infinite Storage | Tiered Storage Elastic Scalability Expand | Shrink | Self-Balancing Clusters Cloud Native: The 10x Apache Kafka® service: elastic, resilient and performant, powered by the Kora Engine Everywhere: Connect your data in real time with a platform that spans from on-prem to cloud and across clouds Hybrid and Multicloud Cluster Linking | Replicator Self-managed software Kubernetes | VMs | Bare Metal Fully managed cloud service AWS | Azure | GCP Committer-driven Expertise Training Partners Professional Services Enterprise Support OPERATOR DEVELOPER ARCHITECT EXECUTIVE
  • 18.
    Complete Go above & beyondKafka with all the essential tools for a complete data streaming platform Process Stream Connect Govern Share Secure
  • 19.
    Confluent is somuch more than Apache Kafka Complete: Go beyond Kafka with all the essential tools for a complete data streaming platform Enterprise-grade Security RBAC | Audit Logs | Encryption | BYOK | Private Networking Stream Governance Schema Registry & Validation | Stream Lineage | Stream Catalog | Stream Sharing Complete Engagement Model Data in Motion Blueprint Business Case Justification TCO | ROI | Risk Management & Monitoring Cloud UI | Metrics API | Control Center | Health+ Flexible DevOps Automation Admin REST APIs | Terraform APIs | Confluent for K8s | Ansible Playbooks Efficient Operations at Scale Production-stage Prerequisites Partnership for Business Success Multi-language Development Non-Java Clients | REST Proxy | MQTT Proxy Stream Processing & Integration Connectors | Flink | ksqlDB | Stream Designer Unrestricted Developer Productivity High Availability 99.99% SLA | Multi-AZ Clusters | Multi-Region Clusters Infinite Storage Infinite Storage | Tiered Storage Elastic Scalability Expand | Shrink | Self-Balancing Clusters Cloud Native: The 10x Apache Kafka® service: elastic, resilient and performant, powered by the Kora Engine Everywhere: Connect your data in real time with a platform that spans from on-prem to cloud and across clouds Hybrid and Multicloud Cluster Linking | Replicator Self-managed software Kubernetes | VMs | Bare Metal Fully managed cloud service AWS | Azure | GCP Committer-driven Expertise Training Partners Professional Services Enterprise Support OPERATOR DEVELOPER ARCHITECT EXECUTIVE
  • 20.
    Everywhere Connect your data inreal time with a platform that spans from on-prem to cloud and across clouds
  • 21.
    Any Cloud. Any Region. Everywhere You Want To Be
  • 22.
    Build the right data architecture for your business
    SELF-MANAGED SOFTWARE: Confluent Platform. The Enterprise Distribution of Apache Kafka. Deploy on-premises or in your private cloud. VM
    FULLY MANAGED CLOUD SERVICE: Confluent Cloud. Cloud-native Data Streaming Platform built by the founders of Apache Kafka. Available on the leading public clouds
  • 23.
  • 24.
    02. Introduction to Cloud Features
  • 25.
  • 26.
    Confluent Loves Your Existing Systems … 200+ connectors
    Other Systems | Kafka Connect | Confluent | Kafka Connect | Other Systems
    https://www.confluent.io/hub/
  • 27.
    Using fully managed connectors is the fastest, most efficient way to break data silos
    Custom-built connector
    ● Costly to allocate resources to design, build, test, and maintain non-differentiated data integration components
    ● Delays time-to-value, taking up to 3-6+ engineering months to develop
    ● Perpetual management and maintenance increases tech debt and risk of downtime
    Self-managed connector
    ● Pre-built but requires manual installation / config efforts to set up and deploy connectors
    ● Perpetual management and maintenance of connectors that leads to ongoing tech debt
    ● Risk of downtime and business disruption due to connector / Connect cluster related issues
    Fully managed connector
    ● Streamlined configurations and on-demand provisioning of your connectors
    ● Eliminates operational overhead and management complexity with seamless scaling and load balancing
    ● Reduced risk of downtime with Confluent Cloud’s 99.95% SLA for all your mission critical use cases
    Accelerated time-to-value • Increased developer productivity • Reduced operational burden
  • 28.
    Fully Managed Connectors
    Confluent Cloud’s portfolio of 70+ fully managed connectors enables you to boost developer productivity, eliminate operational burden, and accelerate time to value on your data in motion journey.
    ● Eliminate operational burdens of self-managing connectors and reduce total cost of ownership
    ● Operate your business in real time by modernizing your data systems
    ● Accelerate your entire pipeline development process with Stream Designer, SMTs, and data preview
  • 29.
    Only Confluent offers 70+ expert-built, fully managed connectors across the entire stack
    [Comparison chart. Columns: Other Kafka hosted services | Apache Kafka - Kafka Connect | Confluent Fully Managed Connectors. Axis: Ease of use. Legend: You Manage | Provider Managed. Row groups: Connectors | Connect Workers]
    Rows compared: Connector configurations* | Connector development | Connector testing | Connector updates | Connector support | Connect cluster scaling | Connect worker configs | Connect internal topics | Schema registry | Monitoring and security | Load balancing | Connect plugin installation
    Provider managed features
    *Streamlined configurations with ability for granular controls if needed
  • 30.
    Connect your entire business with just a few clicks
    70+ fully managed connectors: Amazon S3 | Amazon Redshift | Amazon DynamoDB | Google Cloud Spanner | AWS Lambda | Amazon SQS | Amazon Kinesis | Azure Service Bus | Azure Event Hubs | Azure Synapse Analytics | Azure Blob Storage | Azure Functions | Azure Data Lake | Google BigTable
    200+ pre-built connectors
  • 31.
    Confluent HUB
    Easily browse connectors by:
    • Source vs Sinks
    • Confluent vs Partner supported
    • Commercial vs Free
    • Available in Confluent Cloud
    confluent.io/hub
  • 32.
    Confluent Hub - Connector Page
    • Source or Sink?
    • Free or Commercial?
    • Supported by Confluent or partners
    • Can download plugin
    • Link to documentation
    • License type
    • Link to source code (if open source)
    confluent.io/hub
  • 33.
    Confluent Cloud Fully Managed Connectors
    Easily browse connectors by:
    • Source vs Sinks
    • Confluent vs Partner supported
    • Commercial vs Free
    • Available in Confluent Cloud
  • 34.
  • 35.
    Every organization has its own unique data architecture, which requires additional flexibility
    ● Home-grown systems and custom applications need custom-built connectors to break data silos
    ● Pre-built connectors in the Kafka ecosystem need additional modifications to fit your specific work context
    ● Lack of managed connector options for the long tail of less popular data systems and apps
  • 36.
    Custom Connectors
    Break any data silo without needing to manage Kafka Connect infrastructure by bringing your own connector plugins to Confluent Cloud
    ● Quickly connect to any data system using your own Kafka Connect plugins without code changes
    ● Ensure high availability & performance using logs and metrics to monitor the health of your connectors and workers
    ● Eliminate operational burden of provisioning and perpetually managing low-level connector infrastructure
  • 37.
    Bring your own connectors and let Confluent provision and manage connector infrastructure
    Users: Responsible for connectors
    ● Connector plugins (BYO): Custom-built (original) connectors | Modified connectors (i.e. custom SMTs) | Partner-built connectors | Community-built connectors
    ● Connector management (i.e. upgrades, patching, support)
    Confluent Cloud: Responsible for Connect infrastructure
    ● Connector infrastructure resources: Connect workers | Connect clusters | Connect logs | Worker health metrics
    ● Infrastructure management & support
  • 39.
  • 40.
    What is Stream Processing?
  • 41.
    Customers demand immediacy in every aspect of their lives through real-time applications
    Transportation: Connect to a driver immediately after rideshare request
    Retail: Instant notification upon package delivery
    Technology: Real-time notification once a new patch is available
    Banking: Automatic alert once fraudulent activity has been detected
  • 42.
    These applications require reacting to events that happen in your business immediately
    Events occur everywhere across an organization:
    ● The same charge was rapidly made multiple times
    ● A valuable customer is beginning the checkout process
    ● A new patch for bug fixes was released
    ● A rider searches for an available rideshare
  • 43.
    It requires filtering, joining, and aggregating streams of events into something more useful
    Customer Data | Transaction Data | Payment Data | Geo Location Data | Security Services Data
  • 44.
    Stream data using Confluent’s connector portfolio and easily build real-time data pipelines with ksqlDB
    Legacy data systems: Oracle Database | Mainframes | Applications
    Modern, cloud-based data systems: Cloud-native / SaaS apps | Azure Synapse Analytics
    [Diagram labels: Source Connectors | ksqlDB | Sink Connectors | Expensive, custom-built integrations]
  • 45.
  • 46.
    With all these moving parts, stream processing architectures can become quite complex
    [Diagram labels, numbered 1 to 4: DB | CONNECTOR | APP | STREAM PROCESSING]
  • 47.
    ksqlDB simplifies the underlying architecture to make it ea applications
    [Diagram labels, numbered 1 to 2: DB | APP | PULL | PUSH | ksqlDB: CONNECTORS, STREAM PROCESSING, MATERIALIZED VIEWS]
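    The PULL and PUSH labels on this slide can be read as two ways of reading a materialized view. The Python sketch below is only a toy in-memory analogy, not ksqlDB code: the class name `MaterializedView`, its methods, and the sample ride data are all hypothetical. A pull returns the current value once; a push subscription receives every later change.

```python
from typing import Callable, Dict, List, Optional, Tuple


class MaterializedView:
    """Toy in-memory stand-in for a continuously updated table (hypothetical, not a ksqlDB API)."""

    def __init__(self) -> None:
        self._state: Dict[str, str] = {}
        self._subscribers: List[Callable[[str, str], None]] = []

    def subscribe(self, callback: Callable[[str, str], None]) -> None:
        """Push style: the callback is invoked on every later change."""
        self._subscribers.append(callback)

    def apply(self, key: str, value: str) -> None:
        """Update the view with a new event and notify push subscribers."""
        self._state[key] = value
        for callback in self._subscribers:
            callback(key, value)

    def pull(self, key: str) -> Optional[str]:
        """Pull style: return the current value for a key, once."""
        return self._state.get(key)


view = MaterializedView()
pushed: List[Tuple[str, str]] = []
view.subscribe(lambda key, value: pushed.append((key, value)))

# Hypothetical events: the promotion for one ride changes over time
view.apply("ride-1", "10% off")
view.apply("ride-1", "20% off")

current = view.pull("ride-1")  # latest value only
```

    The push subscriber sees both updates in order, while the pull lookup sees only the latest state.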
  • 48.
    Easily Build Real-Time Applications
    ksqlDB is a streaming database purpose-built for developing real-time applications that leverage stream processing, enabling you to build a complete real-time application with just a few SQL statements
    Compute: ksqlDB | Storage: Kafka
    CREATE TABLE activePromotions AS
      SELECT rideId, qualifyPromotion(distanceToDst) AS promotion
      FROM locations
      GROUP BY rideId
      EMIT CHANGES
    Aggregate | Joins | Filters | User-Defined Functions | Push & Pull Query Support | Embedded Connectors
  • 49.
    Filters
    CREATE STREAM high_readings AS
      SELECT sensor, reading
      FROM readings
      WHERE reading > 41
      EMIT CHANGES;
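    As a rough illustration of what the filter query computes, the Python sketch below applies the same predicate (reading > 41) to a small in-memory list. The sample events and the function name `high_readings` are made up for illustration; ksqlDB evaluates the query continuously against a Kafka topic rather than a finite list.

```python
from typing import Dict, Iterable, Iterator


def high_readings(readings: Iterable[Dict], threshold: float = 41) -> Iterator[Dict]:
    """Yield only events whose reading exceeds the threshold, projecting sensor and reading."""
    for event in readings:
        if event["reading"] > threshold:
            yield {"sensor": event["sensor"], "reading": event["reading"]}


# Hypothetical sample events
sample_readings = [
    {"sensor": "sensor-1", "reading": 45, "area": "north"},
    {"sensor": "sensor-2", "reading": 41, "area": "south"},
    {"sensor": "sensor-3", "reading": 52, "area": "east"},
]

filtered = list(high_readings(sample_readings))
```

    The reading of exactly 41 is dropped because the comparison is strict.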
  • 50.
    Joins
    CREATE STREAM enriched_readings AS
      SELECT reading, area, brand_name
      FROM readings
      INNER JOIN brands b ON b.sensor = readings.sensor
      EMIT CHANGES;
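    The join query above can be pictured as a lookup of each reading against a table of brands keyed by sensor. The sketch below assumes, for illustration only, that `reading` and `area` come from the readings stream and `brand_name` from the brands table; the slide does not state the schemas, and the sample data is invented.

```python
from typing import Dict, Iterable, Iterator


def enrich_readings(readings: Iterable[Dict], brands: Dict[str, str]) -> Iterator[Dict]:
    """Inner-join each reading with the brand registered for its sensor."""
    for event in readings:
        brand_name = brands.get(event["sensor"])
        if brand_name is None:
            continue  # inner join: readings without a matching brand are dropped
        yield {
            "reading": event["reading"],
            "area": event["area"],
            "brand_name": brand_name,
        }


# Hypothetical lookup table (sensor -> brand) and sample events
brands = {"sensor-1": "Acme"}
sample_readings = [
    {"sensor": "sensor-1", "reading": 45, "area": "north"},
    {"sensor": "sensor-9", "reading": 50, "area": "south"},
]

enriched = list(enrich_readings(sample_readings, brands))
```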
  • 51.
    Aggregate
    CREATE TABLE avg_readings AS
      SELECT sensor, AVG(reading) AS location
      FROM readings
      GROUP BY sensor
      EMIT CHANGES;
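    To make the aggregate concrete, here is a minimal Python sketch of a running average per sensor that emits an updated row after every input event, loosely mirroring GROUP BY with EMIT CHANGES. The function name and sample data are hypothetical, and details such as windowing and state storage are ignored.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def running_average(readings: Iterable[Dict]) -> Iterator[Tuple[str, float]]:
    """Emit (sensor, average so far) after each event, one change per input."""
    totals: Dict[str, List[float]] = defaultdict(lambda: [0.0, 0])  # sensor -> [sum, count]
    for event in readings:
        acc = totals[event["sensor"]]
        acc[0] += event["reading"]
        acc[1] += 1
        yield event["sensor"], acc[0] / acc[1]


# Hypothetical sample events
sample_readings = [
    {"sensor": "sensor-1", "reading": 40},
    {"sensor": "sensor-2", "reading": 30},
    {"sensor": "sensor-1", "reading": 50},
]

changes = list(running_average(sample_readings))
latest = dict(changes)  # later changes overwrite earlier ones, like a table
```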
  • 52.
  • 53.
    Building a seamless bridge between on-prem and cloud has become mission-critical...
    On-prem Datacenters & Private Clouds | Public Clouds across AWS, Azure, & GCP
    As companies adopt hybrid architectures, building a bridge between on-prem and cloud environments has become critical to reliably connecting and sharing data across the entire business
  • 54.
    ...but hybrid cloud today is challenging
    Cloud Provider 1 | Cloud Provider 2 | On-premises
    ● More brittle interconnections to individually set up and manage
    ● Complex new cloud networking and security considerations
    ● New compliance and data sovereignty challenges
  • 55.
    Companies need a simpler way to link their hybrid environments and share data in real-time
    Cloud Provider 1 | Cloud Provider 2 | On-premises
    Hybrid architectures require a highly available, consistent, and secure real-time bridge between on-prem and cloud environments to provide teams with real-time, self-service access to data wherever it resides
  • 56.
    Accelerate the enterprise journey to cloud with Cluster Linking
    Cluster Linking accelerates the enterprise journey to cloud by securely, reliably, and effortlessly creating a real-time bridge between cloud and on-prem environments
    • Bridge to cloud: Seamlessly build hybrid & multi-cloud architectures that are secure and reliable without additional systems to manage
    • Cluster migrations: Facilitate smooth migrations with no data loss and minimal downtime
    • Source-initiated links: Securely share data between Confluent Platform and Confluent Cloud without opening on-prem firewalls to the cloud
  • 57.
    Provide self-service access to real-time data across all environments
    Cluster Linking enables self-service access to data in real-time across the business with globally connected clusters that provide perfectly and reliably mirrored data
    • Self-service access: Provide access to real-time data across different regions, clouds, environments, and teams
    • Offset preservation: Keep your data in sync with native offset preservation without additional tooling
    • Ease of use: Create a link from one cluster to another with a single command or API call
    [Diagram. Slow: ETL/Batch process | Data request | Data warehouses | Data Stewards. Fast: Real-time | Self-service. Other labels: DBs | SaaS | DB | Apps | Data consumer]
  • 58.
    Reduce TCO and operational burdens for data replication across Kafka clusters
    Cluster Linking reduces TCO and operational burdens with seamless and cost-effective data replication across Kafka clusters everywhere they need to reside
    • Decreased costs: Eliminate the need to manage and maintain additional infrastructure required by Connect-based solutions (e.g., MM2)
    • Simplified Architecture: Remove architectural complexity and technical debt
    • Reduced manual processes: Minimize operational burdens and risks with a single, globally-consistent connection that’s easier to monitor and secure
    From: Connect-based solutions like MirrorMaker 2 (Complex architecture | Offsets are not preserved)
    To: Built-in geo-replication with Cluster Linking (Simplified architecture | Globally consistent offsets)
  • 59.
    Effortlessly mirror from one cluster to another
    Mirror Topics: byte-for-byte copy | same offsets | same configs | same name | read-only
    Source: Topics | Consumer Group Offsets (Filter by name) | ACLs (Filter by name)
    Destination: Mirror Topics | Consumer Group Offsets (optional) | ACLs (optional)
    Start in Minutes: one CLI or API call to link or mirror
    Reliable and Performant: using Kafka’s inter-broker protocol
    Cluster Link: AK 2.4+ | CP 5.4+ and CC | CP 7.0+ and CC
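    The mirror topic properties listed on this slide (same name, same offsets, byte-for-byte copy, read-only) can be illustrated with a toy model. The sketch below is a single-partition, in-memory analogy written for this transcript; the classes `Topic` and `MirrorTopic` and the topic name are hypothetical and do not correspond to any Confluent API.

```python
from typing import List


class Topic:
    """Toy single-partition log: a record's list index is its offset."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.log: List[bytes] = []

    def produce(self, record: bytes) -> int:
        self.log.append(record)
        return len(self.log) - 1  # offset assigned to the record


class MirrorTopic:
    """Read-only copy that keeps the source's name, bytes, and offsets."""

    def __init__(self, source: Topic) -> None:
        self.name = source.name  # same name as the source topic
        self._source = source
        self.log: List[bytes] = []

    def sync(self) -> None:
        """Append records not yet mirrored, preserving their positions."""
        self.log.extend(self._source.log[len(self.log):])

    def produce(self, record: bytes) -> int:
        raise PermissionError("mirror topics are read-only")


source = Topic("orders")  # hypothetical topic name
for payload in (b"a", b"b", b"c"):
    source.produce(payload)

mirror = MirrorTopic(source)
mirror.sync()

try:
    mirror.produce(b"d")
    write_rejected = False
except PermissionError:
    write_rejected = True
```

    Because offsets match on both sides in this model, a consumer that stopped at offset 2 on the source could resume at offset 2 on the mirror without translation, which is the point of the "same offsets" property.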
  • 60.
    What strategies benefit from multiple clusters?
    Hybrid Cloud & Multi-Cloud: Stream data between your datacenter and cloud. Each cluster serves clients in its environment.
    Team Sandboxing: Give each team its own cluster to separate concerns. Every team controls its own destiny.
    Disaster Recovery: Create a failover cluster in a separate location that is ready to go when disaster strikes.
    Edge Computing: Put clusters at the edge in order to minimize latency, lower network cost, or buffer data when network connection is unreliable.
    Global Kafka Footprint: Stream events between continents for a global data strategy.
    Cluster Migration: To modernize a cluster, or to move to cloud.
  • 61.
    03. 15 Min Break
  • 62.
    04. Hybrid Workshop Introduction
  • 63.
    Why this workshop? What are we trying to achieve?
    Business Use Case & Technical Architecture
  • 64.
    High Level Business Use Case: Real-time Retail
    Stream of shipments that arrive
    Stream of purchases from online and/or physical stores (Example RDBMS or Mainframe)
    Stream Processing (KSQL): Joins streams in real-time to control the stock
    Real-time Inventory
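    The use case on this slide combines a stream of shipments with a stream of purchases to keep a running stock level. The Python sketch below shows the arithmetic only; the event fields (`type`, `item`, `quantity`) and sample values are assumptions made for illustration and are not the workshop's actual schema or KSQL statements.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, Tuple


def track_inventory(events: Iterable[Dict]) -> Iterator[Tuple[str, int]]:
    """Emit (item, stock so far): shipments add to stock, purchases subtract."""
    stock: Dict[str, int] = defaultdict(int)
    for event in events:
        if event["type"] == "shipment":
            stock[event["item"]] += event["quantity"]
        else:  # treated as a purchase
            stock[event["item"]] -= event["quantity"]
        yield event["item"], stock[event["item"]]


# Hypothetical interleaving of the two streams
sample_events = [
    {"type": "shipment", "item": "shoes", "quantity": 10},
    {"type": "purchase", "item": "shoes", "quantity": 3},
    {"type": "shipment", "item": "hats", "quantity": 5},
    {"type": "purchase", "item": "shoes", "quantity": 2},
]

stock_changes = list(track_inventory(sample_events))
current_stock = dict(stock_changes)  # latest level per item
```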
  • 65.
    Industry’s Only Solution for Hybrid Kafka
    Private Cloud: Deploy on-premises in your datacenter with Confluent Enterprise
    Public Cloud: Migrate to or adopt cloud at your own pace with fully-managed Confluent Cloud
    Hybrid Cloud: Build a persistent bridge between datacenter and cloud with Confluent Cluster Linking
  • 66.
  • 67.
  • 68.
  • 69.
    On Premise xN dc1 Confluent Cloud dc2 dc3 dc4 HQ Aggregate Cluster
  • 70.
    Workshop Architecture
    ● One shared Confluent Cloud Dedicated Cluster
    ● One shared on-prem Confluent Platform (name: HQ, for headquarters)
    ● N x on-prem Confluent Platform data centers (name: dcN)
    Each attendee has their own environment.
  • 71.
  • 72.
    Please select one of the environments that have been shared with you
    Password: Feb9pass123!
  • 73.
    05. Hybrid Workshop Hands On - Part 1
  • 74.
    06. 15 Min Break
  • 75.
    07. Hybrid Workshop Hands On - Part 2
  • 76.
    Get started for FREE
    No credit card needed
    $400 free credits
    confluent.io/get-started