Event-driven architecture (EDA) will be the heart of the MAPFRE ecosystem. To stay competitive, companies today depend more and more on real-time data analysis, which gives them faster insights and faster response times. Running a business on real-time data means being aware of the situation, and detecting and responding to what is happening in the world right now.
AWS Immersion Day Mapfre - Confluent
1. AWS Immersion Day: Mapfre - Confluent
Elena Molina, Partner Technical Trainer
Salvatore Alessandro, Enterprise Solutions Engineer
2. Agenda
01 | 09:30 - 09:45 | Introduction | Who are we? What is Confluent Cloud?
02 | 09:45 - 10:30 | Introduction to Cloud Features | Fully Managed Connectors, ksqlDB, Cluster Linking.
03 | 10:30 - 10:45 | Break | 15 Min - Coffee Break
04 | 10:45 - 11:00 | Hybrid Workshop Introduction | Infrastructure, architecture and use case introduction.
05 | 11:00 - 12:15 | Hybrid Workshop Hands On - Part 1 | Hands-on: Build a bridge between on prem and the cloud using Cluster Linking.
06 | 12:15 - 12:30 | Break | 15 Min - Coffee Break
07 | 12:30 - 13:45 | Hybrid Workshop Hands On - Part 2 | Hands-on: Create a streaming app using ksqlDB and build a bridge between the cloud and on prem using Cluster Linking.
08 | 13:45 - 14:15 | Multi-regional Disaster Recovery on AWS | Multi-regional Disaster Recovery with Confluent Cluster Linking on AWS
09 | 14:15 - 16:00 | Lunch and Networking
3. Confluent Mapfre Team
Much more than a platform
● Enterprise Account Manager: Marcos Yanez
● Enterprise Solutions Engineer: Salvo Alessandro
● PS & Education: Gonzalo Garcia
● Customer Success Manager: Asier Fernández
● Partner Technical Trainer: Elena Molina
5. Apache Kafka has ushered in the data streaming era…
Created by the founders of Confluent while at LinkedIn
Real-time experiences shown: Loyalty Rewards | Curbside Pickup | Trending Now | Popular on Netflix | Top Picks for Joshua | Real-time Trades | Ride ETA | Personalized Recommendations
● >70% of the Fortune 500
● >100,000 Organizations
● >41,000 Kafka Meetup Attendees
● >200 Global Meetup Groups
● >750 Kafka Improvement Proposals (KIPs)
● >12,000 Jiras for Apache Kafka
● >32,000 Stack Overflow Questions
9. The need for a cloud-native, data streaming platform
Connecting all your apps, systems and data into a central nervous system
10. PUTTING KAFKA IN THE CLOUD…
ISN’T JUST PUTTING KAFKA IN THE CLOUD.
12. Why Confluent is the world’s most trusted data streaming platform
Focus & Expertise
Only company focused on data in motion:
● Founded in 2014 by the original creators of Apache Kafka
● Over 80% of Kafka commits are by Confluent employees
● Advised on thousands of real-world Kafka deployments across a wide range of patterns & industries
Execution at Scale
Building and supporting a world class product:
● >9 million engineering hours spent building our product
● We internally manage >15,000 clusters (and counting) in Confluent Cloud
● Over 1 million cumulative hours of Kafka expertise within Confluent support & services
14. Confluent makes data streaming easy
● Open Source: Apache Kafka
● Kora Engine: Kafka completely re-architected to be Cloud-native
● A Complete, enterprise-grade Data-in-Motion Platform: Real-time Data Integration | Stream Processing | Enterprise Security & Governance | …100s more features
● Multi-cloud SaaS & Private Cloud: Fully managed service and software, available Everywhere
15. Cloud-Native
Elastic, resilient and performant, powered by the Kora Engine
[Figure: Kora Architecture. Labels: Customers; Multi-Cloud Networking & Routing Tier; Network; Compute (Cells in each of three AZs); Object Storage; Metadata; Durability Audits; Data Balancing; Health Checks; Metrics & Observability (real-time feedback data); Other Confluent Cloud Services (Connect, Processing, Governance); Global Control Plane]
16. Kora: the Cloud-Native Engine for Apache Kafka
Serverless
● Elastic scaling up & down from 0
to GBps
● Auto capacity mgmt, load
balancing, and upgrades
Infinite Storage
● Store data cost-effectively at any
scale without growing compute
Resilience
● Multi-AZ and multi-region
replication
● Durability self-validation
High Availability
● 99.99% SLA
● Multi-region / AZ availability across
cloud providers
● Patches deployed in Confluent
Cloud before Apache Kafka
Network Flexibility
● Public, VPC, and Private Link
● Seamlessly link across clouds
and on-prem with Cluster
Linking
17. Confluent is so much more than Apache Kafka
Cloud Native: The 10x Apache Kafka® service: elastic, resilient and performant, powered by the Kora Engine
● High Availability: 99.99% SLA | Multi-AZ Clusters | Multi-Region Clusters
● Infinite Storage: Infinite Storage | Tiered Storage
● Elastic Scalability: Expand | Shrink | Self-Balancing Clusters
Complete: Go beyond Kafka with all the essential tools for a complete data streaming platform
● DEVELOPER: Unrestricted Developer Productivity
  Multi-language Development: Non-Java Clients | REST Proxy | MQTT Proxy
  Stream Processing & Integration: Connectors | Flink | ksqlDB | Stream Designer
● OPERATOR: Efficient Operations at Scale
  Management & Monitoring: Cloud UI | Metrics API | Control Center | Health+
  Flexible DevOps Automation: Admin REST APIs | Terraform APIs | Confluent for K8s | Ansible Playbooks
● ARCHITECT: Production-stage Prerequisites
  Enterprise-grade Security: RBAC | Audit Logs | Encryption | BYOK | Private Networking
  Stream Governance: Schema Registry & Validation | Stream Lineage | Stream Catalog | Stream Sharing
● EXECUTIVE: Partnership for Business Success
  Complete Engagement Model: Data in Motion Blueprint
  Business Case Justification: TCO | ROI | Risk
Everywhere: Connect your data in real time with a platform that spans from on-prem to cloud and across clouds
● Hybrid and Multicloud: Cluster Linking | Replicator
● Self-managed software: Kubernetes | VMs | Bare Metal
● Fully managed cloud service: AWS | Azure | GCP
Committer-driven Expertise: Training Partners | Professional Services | Enterprise Support
18. Complete
Go above & beyond Kafka with all the essential tools for a complete data streaming platform
Process | Stream | Connect | Govern | Share | Secure
22. Build the right data architecture for your business
SELF-MANAGED SOFTWARE: Confluent Platform
The Enterprise Distribution of Apache Kafka
Deploy on-premises or in your private cloud
FULLY MANAGED CLOUD SERVICE: Confluent Cloud
Cloud-native Data Streaming Platform built by the founders of Apache Kafka
Available on the leading public clouds
26. Confluent Loves Your Existing Systems
[Diagram: Other Systems, Kafka Connect, Confluent, Kafka Connect, Other Systems]
… 200+ connectors
https://www.confluent.io/hub/
27. Using fully managed connectors is the fastest, most efficient way to break data silos
Custom-built connector
● Costly to allocate resources to design, build, test, and maintain non-differentiated data integration components
● Delays time-to-value, taking up to 3-6+ engineering months to develop
● Perpetual management and maintenance increases tech debt and risk of downtime
Self-managed connector
● Pre-built but requires manual installation / config efforts to set up and deploy connectors
● Perpetual management and maintenance of connectors that leads to ongoing tech debt
● Risk of downtime and business disruption due to connector / Connect cluster related issues
Fully managed connector
● Streamlined configurations and on-demand provisioning of your connectors
● Eliminates operational overhead and management complexity with seamless scaling and load balancing
● Reduced risk of downtime with Confluent Cloud’s 99.95% SLA for all your mission critical use cases
Accelerated time-to-value • Increased developer productivity • Reduced operational burden
28. Fully Managed Connectors
Confluent Cloud’s portfolio of 70+ fully managed connectors enables you to boost developer productivity, eliminate operational burden, and accelerate time to value on your data in motion journey.
● Eliminate operational burdens of self-managing connectors and reduce total cost of ownership
● Operate your business in real time by modernizing your data systems
● Accelerate your entire pipeline development process with Stream Designer, SMTs, and data preview
29. Only Confluent offers 70+ expert-built, fully managed connectors across the entire stack
[Figure: responsibility matrix comparing three options in order of ease of use: Apache Kafka - Kafka Connect, Other Kafka hosted services, and Confluent Fully Managed Connectors. Each cell is marked either "You Manage" or "Provider Managed". Rows are grouped under Connectors and Connect Workers.]
Rows: Connector configurations* | Connector development | Connector testing | Connector updates | Connector support | Connect cluster scaling | Connect worker configs | Connect internal topics | Schema registry | Monitoring and security | Load balancing | Connect plugin installation
*Streamlined configurations with ability for granular controls if needed
30. Connect your entire business with just a few clicks
70+ fully managed connectors, including: Amazon S3 | Amazon Redshift | Amazon DynamoDB | Google Cloud Spanner | AWS Lambda | Amazon SQS | Amazon Kinesis | Azure Service Bus | Azure Event Hubs | Azure Synapse Analytics | Azure Blob Storage | Azure Functions | Azure Data Lake | Google BigTable
200+ pre-built connectors
31. Confluent Hub
Easily browse connectors by:
• Source vs Sinks
• Confluent vs Partner supported
• Commercial vs Free
• Available in Confluent Cloud
confluent.io/hub
32. Confluent Hub - Connector Page
• Source or Sink?
• Free or Commercial?
• Supported by Confluent or partners
• Can download plugin
• Link to documentation
• License type
• Link to source code (if open source)
confluent.io/hub
33. Confluent Cloud Fully Managed Connectors
Easily browse connectors by:
• Source vs Sinks
• Confluent vs Partner supported
• Commercial vs Free
• Available in Confluent Cloud
35. Every organization has its own unique data architecture, which requires additional flexibility
● Home-grown systems and custom applications need custom-built connectors to break data silos
● Pre-built connectors in the Kafka ecosystem need additional modifications to fit your specific work context
● Lack of managed connector options for the long tail of less popular data systems and apps
36. Custom Connectors
Break any data silo without needing to manage Kafka Connect infrastructure by bringing your own connector plugins to Confluent Cloud
● Quickly connect to any data system using your own Kafka Connect plugins without code changes
● Ensure high availability & performance using logs and metrics to monitor the health of your connectors and workers
● Eliminate operational burden of provisioning and perpetually managing low-level connector infrastructure
37. Bring your own connectors and let Confluent provision and manage connector infrastructure
Users: responsible for connectors
● Connector plugins (BYO): custom-built (original) connectors | modified connectors (e.g. custom SMTs) | partner-built connectors | community-built connectors
● Connector management (e.g. upgrades, patching, support)
Confluent Cloud: responsible for Connect infrastructure
● Connector infrastructure resources: Connect workers | Connect clusters | Connect logs | worker health metrics
● Infrastructure management & support
41. Customers demand immediacy in every aspect of their lives through real-time applications
● Transportation: Connect to a driver immediately after rideshare request
● Retail: Instant notification upon package delivery
● Technology: Real-time notification once a new patch is available
● Banking: Automatic alert once fraudulent activity has been detected
42. These applications require reacting to events that happen in your business immediately
Events occur everywhere across an organization
● A rider searches for an available rideshare
● A valuable customer is beginning the checkout process
● A new patch for bug fixes was released
● The same charge was rapidly made multiple times
43. It requires filtering, joining, and aggregating streams of events into something more useful
Customer Data
Transaction Data
Payment Data
Geo Location Data
Security
Services Data
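The three operations named on this slide can be sketched in a few lines of plain Python. This is only an illustration of the idea, not ksqlDB or any Confluent API: the event fields, the sample data, and the `repeated_charges` helper are all assumptions made up for the example. It filters out declined payments, joins each payment with customer data, and aggregates a windowed count to flag the "same charge made multiple times" event from the previous slide.

```python
from collections import defaultdict

# Hypothetical sample data; field names are illustrative only.
customers = {
    "c1": {"name": "Ana"},
    "c2": {"name": "Luis"},
}
payments = [
    {"customer_id": "c1", "amount": 40.0, "ts": 0, "status": "ok"},
    {"customer_id": "c1", "amount": 40.0, "ts": 2, "status": "ok"},
    {"customer_id": "c2", "amount": 15.0, "ts": 3, "status": "declined"},
    {"customer_id": "c1", "amount": 40.0, "ts": 4, "status": "ok"},
    {"customer_id": "c2", "amount": 15.0, "ts": 50, "status": "ok"},
]


def repeated_charges(payments, customers, window=10, threshold=3):
    """Yield an alert when the same charge recurs `threshold` times within `window` seconds.

    Events are assumed to arrive in timestamp order.
    """
    recent = defaultdict(list)  # (customer_id, amount) -> timestamps inside the window
    for event in payments:
        # Filter: ignore payments that did not go through.
        if event["status"] != "ok":
            continue
        # Join: enrich the payment with the matching customer record.
        profile = customers.get(event["customer_id"], {})
        # Aggregate: count identical charges within the time window.
        key = (event["customer_id"], event["amount"])
        recent[key] = [t for t in recent[key] if event["ts"] - t < window]
        recent[key].append(event["ts"])
        if len(recent[key]) >= threshold:
            yield {
                "customer": profile.get("name"),
                "amount": event["amount"],
                "count": len(recent[key]),
            }


alerts = list(repeated_charges(payments, customers))
```

With the sample data, the only alert is raised on the third 40.0 charge for customer c1.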
44. Stream data using Confluent’s connector portfolio and easily build real-time data pipelines with ksqlDB
[Diagram: legacy data systems (Oracle Database, mainframes, applications) connect through Source Connectors to ksqlDB, and Sink Connectors deliver to modern, cloud-based data systems (cloud-native / SaaS apps, Azure Synapse Analytics). Each connector path is contrasted with "Expensive, custom-built integrations".]
47. ksqlDB simplifies the underlying architecture to make it ea
applications
[Diagram labels: DB | APP | CONNECTORS | STREAM PROCESSING | MATERIALIZED VIEWS | ksqlDB | PUSH | PULL]
48. Easily Build Real-Time Applications
ksqlDB is a streaming database purpose-built for developing real-time applications that leverage stream processing, enabling you to build a complete real-time application with just a few SQL statements
[Diagram: ksqlDB (Compute) on top of Kafka (Storage)]
-- Keep one row per ride with the promotion for its most recent location.
-- qualifyPromotion is a user-defined function. LATEST_BY_OFFSET is added here
-- because every selected column outside the GROUP BY key must be aggregated.
CREATE TABLE activePromotions AS
  SELECT rideId,
         qualifyPromotion(LATEST_BY_OFFSET(distanceToDst)) AS promotion
  FROM locations
  GROUP BY rideId
  EMIT CHANGES;
Aggregates | Joins | Filters | User-Defined Functions | Push & Pull Query Support | Embedded Connectors
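To make the idea behind the statement on this slide concrete, here is a minimal pure-Python sketch of a table that is kept up to date as location events arrive, with a push-style subscription and a pull-style lookup. It does not use ksqlDB; the `ActivePromotions` class, the `qualify_promotion` rule, and the sample events are hypothetical stand-ins chosen for illustration.

```python
def qualify_promotion(distance_to_dst):
    # Hypothetical rule standing in for the slide's user-defined function.
    return "discount" if distance_to_dst > 10 else "none"


class ActivePromotions:
    """Table keyed by ride id, updated incrementally as location events arrive."""

    def __init__(self):
        self._rows = {}
        self._subscribers = []

    def subscribe(self, callback):
        # Push-style query: the caller is notified of every change.
        self._subscribers.append(callback)

    def on_location(self, ride_id, distance_to_dst):
        # Each new event overwrites the row for its key (latest value wins).
        promotion = qualify_promotion(distance_to_dst)
        self._rows[ride_id] = promotion
        for callback in self._subscribers:
            callback(ride_id, promotion)

    def lookup(self, ride_id):
        # Pull-style query: return the current value for one key.
        return self._rows.get(ride_id)


table = ActivePromotions()
changes = []
table.subscribe(lambda ride_id, promotion: changes.append((ride_id, promotion)))

for ride_id, distance in [("r1", 12.0), ("r2", 3.0), ("r1", 8.0)]:
    table.on_location(ride_id, distance)
```

The subscriber sees all three changes in order, while a lookup for `r1` returns only the latest state.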
53. Building a seamless bridge between on-prem and cloud has become mission-critical...
As companies adopt hybrid architectures, building a bridge between on-prem and cloud environments has become critical to reliably connecting and sharing data across the entire business
[Diagram: On-prem Datacenters & Private Clouds linked to Public Clouds across AWS, Azure, & GCP]
54. ...but hybrid cloud today is challenging
[Diagram: On-premises | Cloud Provider 1 | Cloud Provider 2]
● More brittle interconnections to individually set up and manage
● Complex new cloud networking and security considerations
● New compliance and data sovereignty challenges
55. Companies need a simpler way to link their hybrid environments and share data in real-time
[Diagram: On-premises | Cloud Provider 1 | Cloud Provider 2]
Hybrid architectures require a highly available, consistent, and secure real-time bridge between on-prem and cloud environments to provide teams with real-time, self-service access to data wherever it resides
56. Accelerate the enterprise journey to cloud with Cluster
Linking
Cluster Linking accelerates the
enterprise journey to cloud by
securely, reliably, and effortlessly
creating a real-time bridge between
cloud and on-prem environments
• Bridge to cloud: Seamlessly build
hybrid & multi-cloud architectures
that are secure and reliable
without additional systems to manage
• Cluster migrations: Facilitate
smooth migrations with no data
loss and minimal downtime
• Source-initiated links: Securely
share data between Confluent
Platform and Confluent Cloud
without opening on-prem firewalls
to the cloud
57. Provide self-service access to real-time data across all environments
Cluster Linking enables self-service access to data in real-time across the business with globally connected clusters that provide perfectly and reliably mirrored data
• Self-service access: Provide access to real-time data across different regions, clouds, environments, and teams
• Offset preservation: Keep your data in sync with native offset preservation without additional tooling
• Ease of use: Create a link from one cluster to another with a single command or API call
[Diagram: Slow: data consumers send data requests to data stewards, with ETL/batch processes over DBs, SaaS, apps, and data warehouses. Fast: real-time, self-service access for data consumers.]
58. Reduce TCO and operational burdens for data replication across Kafka clusters
Cluster Linking reduces TCO and operational burdens with seamless and cost-effective data replication across Kafka clusters everywhere they need to reside
• Decreased costs: Eliminate the need to manage and maintain additional infrastructure required by Connect-based solutions (e.g., MM2)
• Simplified Architecture: Remove architectural complexity and technical debt
• Reduced manual processes: Minimize operational burdens and risks with a single, globally-consistent connection that’s easier to monitor and secure
From: Connect-based solutions like MirrorMaker 2
[Diagram: Cluster 1 and Cluster 2 connected through two MirrorMaker 2 deployments. Complex architecture; offsets are not preserved.]
To: Built-in geo-replication with Cluster Linking
[Diagram: Cluster 1 and Cluster 2 linked directly. Simplified architecture; globally consistent offsets.]
59. Effortlessly mirror from one cluster to another
[Diagram: Source and Destination clusters joined by a Cluster Link]
● Source (AK 2.4+, CP 5.4+ and CC): Topics | Consumer Group Offsets (filter by name) | ACLs (filter by name)
● Destination (CP 7.0+ and CC): Mirror Topics | Consumer Group Offsets (optional) | ACLs (optional)
● Mirror Topics: byte-for-byte copy | same offsets | same configs | same name | read-only
● Start in Minutes: one CLI or API call to link or mirror
● Reliable and Performant: using Kafka’s inter-broker protocol
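Why "same offsets" matters can be shown with a small simulation. This is a conceptual sketch in plain Python, not Cluster Linking or MirrorMaker code: the logs are dictionaries, and the `mirror` and `reproduce` helpers are hypothetical names for two copying strategies. One copy keeps each record at its source offset; the other consumes and re-produces records, so the destination assigns fresh offsets.

```python
# Source log after retention removed offsets 0 and 1 (hypothetical sample data).
source = {2: "order-c", 3: "order-d", 4: "order-e"}


def mirror(source_log):
    """Mirror-style copy: every record keeps the offset it had on the source."""
    return dict(source_log)


def reproduce(source_log):
    """Consume-and-produce copy: the destination assigns new offsets starting at 0."""
    ordered = sorted(source_log.items())
    return {new_offset: record for new_offset, (_, record) in enumerate(ordered)}


# A consumer group's committed position on the source: the next record to read.
committed = 3

mirrored = mirror(source)
reproduced = reproduce(source)

# On the mirrored copy the committed offset points at the same record.
same_record = mirrored[committed] == source[committed]
# On the re-produced copy that offset no longer exists, so it needs translation.
needs_translation = reproduced.get(committed) != source[committed]
```

A consumer that fails over to the mirrored copy can resume from its committed offset unchanged, which is the property the slide calls out.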
60. What strategies benefit from multiple clusters?
● Hybrid Cloud & Multi-Cloud: Stream data between your datacenter and cloud. Each cluster serves clients in its environment.
● Team Sandboxing: Give each team its own cluster to separate concerns. Every team controls its own destiny.
● Disaster Recovery: Create a failover cluster in a separate location that is ready to go when disaster strikes.
● Edge Computing: Put clusters at the edge in order to minimize latency, lower network cost, or buffer data when network connection is unreliable.
● Global Kafka Footprint: Stream events between continents for a global data strategy.
● Cluster Migration: To modernize a cluster, or to move to cloud.
63. Why this workshop?
What are we trying to achieve?
Business Use Case & Technical Architecture
64. High Level Business Use Case: Real-time Retail
● Stream of purchases from online and/or physical stores (Example RDBMS or Mainframe)
● Stream of shipments that arrive
● Stream Processing (KSQL): Joins streams in real-time to control the stock
● Result: Real-time Inventory
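The retail use case can be sketched as two event streams merged into a running stock level. This is a plain-Python illustration under stated assumptions, not the workshop's ksqlDB code: the tuple layout, the sample events, and the `tagged` and `inventory_updates` helpers are invented for the example.

```python
import heapq
from collections import defaultdict

# Hypothetical events as (timestamp, sku, quantity), each stream ordered by time.
shipments = [(1, "sku-1", 10), (3, "sku-2", 5)]
purchases = [(2, "sku-1", 3), (4, "sku-1", 4)]


def tagged(stream, sign):
    """Turn quantities into signed stock changes: +1 for shipments, -1 for purchases."""
    return ((ts, sku, sign * qty) for ts, sku, qty in stream)


def inventory_updates(shipments, purchases):
    """Merge both streams by timestamp and yield the stock level after each event."""
    stock = defaultdict(int)
    merged = heapq.merge(tagged(shipments, +1), tagged(purchases, -1))
    for ts, sku, change in merged:
        stock[sku] += change
        yield ts, sku, stock[sku]


updates = list(inventory_updates(shipments, purchases))
```

Each element of `updates` is the inventory for one product immediately after a shipment or purchase, which is the "real-time inventory" view from the slide.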
65. Industry’s Only Solution for Hybrid Kafka
● Private Cloud: Deploy on-premises in your datacenter with Confluent Enterprise
● Public Cloud: Migrate to or adopt cloud at your own pace with fully-managed Confluent Cloud
● Hybrid Cloud: Build a persistent bridge between datacenter and cloud with Confluent Cluster Linking
69. On Premise x N
[Diagram labels: dc1 | dc2 | dc3 | dc4 | HQ | Aggregate Cluster | Confluent Cloud]
70. Workshop Architecture
● One shared Confluent Cloud Dedicated Cluster
● One shared Confluent Platform on-prem (name: HQ, for headquarters)
● N x on-prem Confluent Platform data centers (name: dcN)
● Each attendee has his or her own environment.