Off-Label Data Mesh
A Prescription for Healthier Data
Adam Bellemare
Staff Technologist
Who am I?
● Staff Technologist at Confluent, Technical Strategy Group
● From Toronto, Canada
Data Mesh for Analytics
OG Data Mesh
For Analytical
Data Problems
(Batch)
… but also for Operations!
OG Data Mesh
For Analytical
Data Problems
(Batch)
Off-Label
Analytics &
Operations!
(Batch and
Streaming)
A Typical Successful Company
Operations Analytics
A Typical Successful Company
1) Brittle ETLs
2) Ad-hoc
Point-to-Point
Couplings
Hard to access
data across the
whole company!
Ad-Hoc Data Communication Emerges
Relational DB
App
Ad-Hoc Data Communication Emerges
Relational DB
Cron Job Data Lake
App
Ad-Hoc Data Communication Emerges
Relational DB
Cron Job Data Lake
App App
Ad-Hoc Data Communication Emerges
Relational DB
Cron Job Data Lake
Read-Only
Replica
App App
Ad-Hoc Data Communication Emerges
Relational DB
Cron Job Data Lake
Read-Only
Replica
App App
App
Ad-Hoc Data Communication Emerges
Relational DB
Read-Only
Replica
Cron Job
Ad-Hoc
Copy
Data Lake
App App
App
App
Ad-Hoc Data Communication Emerges
Relational DB
Read-Only
Replica
Cron Job
Ad-Hoc
Copy
Ad-Hoc
Copy of
a Copy
Data Lake
App App
App
App
App
Need reliable access
to core business data
(e.g., entities and business objects)
Services require access to common business data
Orders Items
Payments
Most
Services’ data
requirements
live
somewhere
in here
We’ve done something
similar before
Microservices (Synchronous)
APIs provide dedicated business functions
Clients couple on REST APIs
SLAs, change management, responsibilities
Common platform
(Kubernetes, Docker, Service Catalog, etc.)
Database
Application
Services can end up looking like weird databases
getOrder(order_id)
getOrder(user_id)
getAllOrders()
getAllPendingOrders()
getAllCanadianOrders()
What if I just need the data?
Reuse Existing Strategies
Microservices | Data Mesh
APIs Serve Business Functions | Data Products Serve Data
Couple on REST APIs | Couple on the Data Contract
SLAs, change management, responsibilities | SLAs, change management, responsibilities
Common platform and controls | Common platform and controls
The Four Principles of Data Mesh
Principle 1: Domain Ownership
Objective: Data is owned by those that truly understand it
Pattern: Data belongs to the team
who understands it best
Centralized
Data Ownership
Decentralized
Data Ownership
Anti-pattern: Centralized team owns all data
Principle 2: Data as a First-Class Product
Objective: Make shared data discoverable, addressable, trustworthy, secure, so
other teams can make good use of it.
● Data is treated as a true product, not a by-product.
Domain
Data Product
Data Product, a “Microservice for the Data World”
● Data product is a node on the data mesh, situated within a domain.
● Produces—and possibly consumes—high-quality data within the mesh.
Infra
Code
Data
Creates, manipulates,
serves, etc. that data
Powers the data (e.g., storage) and the
code (e.g., run, deploy, monitor)
“Items About to Expire”
Data Product
Data and metadata,
including history
Principle 3: Self-Serve Data Platform
Provide discovery, access, and self-service compute and publish tools
Objective: Make it easy to both create and use the data products
Principle 4: Federated Governance
Objective: Standards of Interoperability, Policies, and Support
● Global standards, data product support. “Paved Roads”.
What is decided locally?
What is decided globally?
(implemented and enforced
by platform)
Domain Domain Domain
Domain
Self-Serve Data Platform
Must balance between Decentralization and Centralization. No silver bullet!
Data Fabric Data Mesh
Access Virtually
via Fabric Layer
Access Data
Directly
Self-Serve &
Governance
Responsibility
Model
Decentralized
Management
Primarily
Technical Well-Formed
Data Sources
Centralized
Management
Social &
Technical
Not Mutually
Exclusive!
Let’s take a look
at a classic example
30-Min
Job
Source
DB
Data
Engineer
Data Lake
App
App
Owner
1200 1230 1300 1330
…
Extract, Transform, and Load (ETL) to Data Lake
Data Ownership is Split Across Multiple Teams
30-Min
Job
Source
DB
Data
Engineer
App
Owner
Daily
Job
Daily
Report
The Boss
Data Scientist
Data Lake
App
How would we approach
this pattern using Data Mesh?
Think:
Data on the Inside
vs.
Data on the Outside
Data on the Inside
Relational DB
App
Data on the Outside
Relational DB
Ad-Hoc
Copy
Cron Job
Read-Only
Replica
Data Lake
App App
App
App
Ad-Hoc
Copy of
a Copy
App
Data on the Inside
Source
DB
App
Dev
App
Data on the Outside
Data
Lake
App
Cloud
SaaS
In Data Mesh, Ownership is Moved Left
30-Min
Job
Source
DB
Data
Engineer
App
Owner
Data Lake
App
App
Owner
Create a Data Product
Source
DB
App
App
Owner
30-Min
Job
Data Lake
Data
Product
App
Owner
Data Product
Owner
Negotiate the Social Changes
Source
DB
App
App
Owner
Data
Product
Data Product
Owner
Current Data
Product User
Prospective Data
Product User
Data
Product
Owner & Domain Schema
Format (or API) Location
Service Level (SLA) Restrictions
Data Product Metadata
Data
Product
Owner & Domain
Adam (Engineering)
Schema
<Parquet Schema>
Format (or API)
Iceberg Table
Location
S3://bucketname/…
Service Level (SLA)
Tier 2
Restrictions
Top Secret
Data Product Metadata
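The metadata fields on this slide can be sketched as a small record type. This is only an illustration, not the platform's actual catalog format: the class name `DataProductMetadata`, the helper `missing_fields`, and the variable `example_product` are hypothetical, while the field values are copied from the slide (the location is elided there and stays elided here).

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class DataProductMetadata:
    """One catalog entry carrying the fields named on the slide (hypothetical type)."""
    owner: str
    domain: str
    schema: str          # the slide only shows the placeholder "<Parquet Schema>"
    format: str
    location: str
    service_level: str
    restrictions: str


# Values taken from the slide; nothing here is filled in beyond what the slide shows.
example_product = DataProductMetadata(
    owner="Adam",
    domain="Engineering",
    schema="<Parquet Schema>",
    format="Iceberg Table",
    location="S3://bucketname/…",
    service_level="Tier 2",
    restrictions="Top Secret",
)


def missing_fields(entry):
    """Return the names of blank fields, so a catalog could reject incomplete entries."""
    return [name for name, value in asdict(entry).items() if not value.strip()]
```

Even the spreadsheet-style catalog suggested under "Start Small, Keep it Simple" could apply a completeness check like `missing_fields` before a data product is listed.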
Data Lake
Data
Product
Owner & Domain
Adam (Engineering)
Schema
<Parquet Schema>
Format (or API)
Iceberg Table
Location
S3://bucketname/…
Service Level (SLA)
Tier 2
Restrictions
Top Secret
End-user functionality via self-service platform
Data Lake
Request Access Contact Owner
Can Again Draw on Known Practices
Microservices | Data Mesh
1st Class Language and Framework Support | Same
Self-Service Portals & Tools | Same
Code Generators (APIs, Clients, Servers) | Same
Cataloging and Discovery | Same
Domain Domain Domain
Domain
Self-Serve Data Platform
Self-Service Platform - Depends on your needs
Protobuf
How to Start
Start Small, Keep it Simple
• Spreadsheet of data products
• Ticket system
• Focus on the data
Iterate: Review, Revise, and Improve
Pave the Roads
• Make it easy
• Prototype and Trial Changes
• Share Successes
Data Fabric on top of Data Products?
Domain Domain
Domain
Data Mesh Self-Serve Data Platform
Data Fabric Access Layer
(API, Governance, Permissions, Lineage)
The Role of Event Streams
Batch Generated Data Powers Slow Processes
Reports Analytics Artificial
Intelligence
Data Lakes, Lakehouses, Warehouses, and Marts
Growing Operational Data Requirements
Act on Data in real-time
(Tactical, Operational)
Decouple services
Combine Separate
Data Sources
Remodel Data
Publish data to event streams
0 1 2 3 4 5 6 7
Kafka Topic
Producer
Application
Publish
Consumers read at their own rate
0 1 2 3 4 5 6 7
Kafka Topic
Producer
Application
Publish
Consumer
Application
Read
Consumers can reread the data as needed
0 1 2 3 4 5 6 7
Kafka Topic
Producer
Application
Publish
Consumer
Application
Reread
Topic as
needed
Consumer
Application
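The three behaviours on these slides (publish, read at your own rate, reread as needed) can be sketched with a toy in-memory log. This is not the Kafka client API: `Topic`, `Consumer`, `poll`, and `seek_to_beginning` are hypothetical stand-ins that only model an append-only log with one offset per consumer.

```python
class Topic:
    """A toy stand-in for a single-partition topic: an append-only list of events."""

    def __init__(self):
        self._log = []

    def publish(self, event):
        """Append an event and return the offset it was written at."""
        self._log.append(event)
        return len(self._log) - 1

    def read(self, offset, max_events):
        """Return up to max_events starting at offset; reading never removes data."""
        return self._log[offset:offset + max_events]


class Consumer:
    """Each consumer tracks its own offset, independently of every other consumer."""

    def __init__(self, topic):
        self.topic = topic
        self.offset = 0

    def poll(self, max_events=1):
        events = self.topic.read(self.offset, max_events)
        self.offset += len(events)
        return events

    def seek_to_beginning(self):
        """Rewind so that the whole topic can be read again."""
        self.offset = 0


topic = Topic()
for n in range(8):                      # offsets 0..7, as drawn on the slides
    topic.publish(f"event-{n}")

fast = Consumer(topic)
slow = Consumer(topic)
fast_batch = fast.poll(max_events=8)    # the fast consumer reads everything
slow_batch = slow.poll(max_events=3)    # the slow consumer has only reached offset 3

fast.seek_to_beginning()
replayed = fast.poll(max_events=8)      # rereading yields the same events again
```

Because reading does not consume the log, a new consumer application can be added later and still start from offset 0.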
Combine Event Streams with Data Mesh
Data Mesh
For Analytical
Data Problems
Off-Label Uses
Operations!
(Batch and
Streaming)
Let’s go back to our earlier example
Data
Product
Parquet Files to S3 (e.g., Apache Iceberg Table)
Daily
Job
30-Min
Job
Orders
Service
DB
Daily
Order
Report
Parquet Files
Data Lake
The
Boss
Port
Too slow!
What about real-time?
Data
Product
Can use a Kafka Topic with a Schema
Port
Parquet Files
0 1 2 3 4 5 6 7
Port
Kafka Topic
order_id total items time
1 $19.99 [...] 186…
2 $12.99 [...] 187…
3 $24.99 [...] 188…
4 $38.99 [...] 189…
{
"order_id": Long,
"total": Double,
"items": List[Item],
"time": Long
}
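A minimal sketch of what "a Kafka Topic with a Schema" buys a consumer: every event can be checked against the declared fields before it is used. The mapping of `Long` to `int`, `Double` to `float`, and `List[Item]` to a plain `list` is an assumption, as are the names `ORDER_SCHEMA` and `schema_violations`; a real deployment would more likely rely on a schema registry and a serialization format than on hand-written checks.

```python
# Field names come from the slide; the Python types are an assumed mapping.
ORDER_SCHEMA = {"order_id": int, "total": float, "items": list, "time": int}


def schema_violations(event):
    """Return a list of human-readable problems; an empty list means the event conforms."""
    problems = []
    for field, expected in ORDER_SCHEMA.items():
        if field not in event:
            problems.append(f"missing field: {field}")
            continue
        value = event[field]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            problems.append(f"wrong type for {field}: expected {expected.__name__}")
    for field in event:
        if field not in ORDER_SCHEMA:
            problems.append(f"unexpected field: {field}")
    return problems


# Placeholder values: the slide elides the items and the timestamp, so these are made up.
conforming = {"order_id": 1, "total": 19.99, "items": [], "time": 0}
malformed = {"order_id": "1", "total": 19.99, "items": []}

conforming_problems = schema_violations(conforming)
malformed_problems = schema_violations(malformed)
```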
Data
Product
Each event represents an Order
{
"order_id": 1,
"total": 19.99,
"items": [...],
"time": 186…
}
{
"order_id": 2,
"total": 12.99,
"items": [...],
"time": 187…
}
0 1 2 3 4 5 6 7
Data
Product
Giving consumers a choice for data access
Port
Parquet Files
0 1 2 3 4 5 6 7
Port
Kafka Topic
Prospective Data
Product Users
Select the option that works best for you
Port
Parquet Files
0 1 2 3 4 5 6 7
Port
Kafka Topic
Batch-Computed
Analytics
Streaming
Operational App
Streaming
Analytics
Build the Table from the Stream
0 1 2 3 4 5 6 7
Topic Parquet-Backed
Table
Modeling Event-Driven Data Products
@AdamBellemare | developer.confluent.io
Fact Events vs. Delta Events
Deltas Facts
Order Fact Event (a DTO)
{
"order_id": 1,
"items": [ 521, 923 ],
"total": 19.99,
"timestamp": 186…
}
item_added_to_order
{
"order_id": 1,
"item_id": 521,
"quantity": 1
}
discount_code_applied
{
"order_id": 1,
"discount_code": "SAVE-20-2022",
"discount_percent": "20"
}
Facts Model State – Deltas Model Change
Delta
Events
Stream-Table Duality with Fact Events
Key (order_id) Value
03:03 1 items: [100], total: $10.00
08:33 2 items: [200], total: $20.00
13:21 1 items: [100, 200], total: $30.00
13:22 3 items: [450, 451], total: $68.50
19:54 1 Null
Time Value
Upsert each Record in Sequence
Current Materialized State
Event Stream Table
(Key)
order_id
Stream-Table Duality with Fact Events
Key (order_id) Value
1 items: [100], total: $10.00
03:03 1 items: [100], total: $10.00
08:33 2 items: [200], total: $20.00
13:21 1 items: [100, 200], total: $30.00
13:22 3 items: [450, 451], total: $68.50
19:54 1 Null
Time Value
Read
Upsert each Record in Sequence
Current Materialized State
Event Stream Table
(Key)
order_id
Stream-Table Duality with Fact Events
Key (order_id) Value
1 items: [100], total: $10.00
2 items: [200], total: $20.00
03:03 1 items: [100], total: $10.00
08:33 2 items: [200], total: $20.00
13:21 1 items: [100, 200], total: $30.00
13:22 3 items: [450, 451], total: $68.50
19:54 1 Null
Time Value
Read
Upsert each Record in Sequence
Current Materialized State
Event Stream Table
(Key)
order_id
Stream-Table Duality with Fact Events
Key (order_id) Value
1 items: [100, 200], total: $30.00
2 items: [200], total: $20.00
03:03 1 items: [100], total: $10.00
08:33 2 items: [200], total: $20.00
13:21 1 items: [100, 200], total: $30.00
13:22 3 items: [450, 451], total: $68.50
19:54 1 Null
Time Value
Read
Upsert each Record in Sequence
Current Materialized State
Event Stream Table
(Key)
order_id
Stream-Table Duality with Fact Events
Key (order_id) Value
1 items: [100, 200], total: $30.00
2 items: [200], total: $20.00
3 items: [450, 451], total: $68.50
03:03 1 items: [100], total: $10.00
08:33 2 items: [200], total: $20.00
13:21 1 items: [100, 200], total: $30.00
13:22 3 items: [450, 451], total: $68.50
19:54 1 Null
Time Value
Read
Upsert each Record in Sequence
Current Materialized State
Event Stream Table
(Key)
order_id
Stream-Table Duality with Fact Events
Key (order_id) Value
1 Null (deleted)
2 items: [200], total: $20.00
3 items: [450, 451], total: $68.50
03:03 1 items: [100], total: $10.00
08:33 2 items: [200], total: $20.00
13:21 1 items: [100, 200], total: $30.00
13:22 3 items: [450, 451], total: $68.50
19:54 1 Null
Time Value
Read
Upsert each Record in Sequence
Current Materialized State
Event Stream Table
(Key)
order_id
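The upsert sequence walked through on these slides can be sketched in a few lines. The event values are copied from the slides; the tuple layout, the use of `None` for the slide's "Null" record, and the function name `materialize` are assumptions made for illustration.

```python
# Fact events from the slides as (time, order_id, value).
# None stands for the "Null" record, which deletes the key.
events = [
    ("03:03", 1, {"items": [100], "total": 10.00}),
    ("08:33", 2, {"items": [200], "total": 20.00}),
    ("13:21", 1, {"items": [100, 200], "total": 30.00}),
    ("13:22", 3, {"items": [450, 451], "total": 68.50}),
    ("19:54", 1, None),
]


def materialize(fact_events):
    """Upsert each record in sequence; the result is the current materialized state."""
    table = {}
    for _time, key, value in fact_events:
        if value is None:
            table.pop(key, None)    # a null value removes the row for that key
        else:
            table[key] = value      # insert a new row or overwrite the existing one
    return table


table = materialize(events)             # state after all five events
midway = materialize(events[:3])        # state after the 13:21 event
```

Each fact carries the full state for its key, so a consumer needs no knowledge of how the order was built up; it only has to keep the latest value per key.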
Deltas Are Used in Event Sourcing
(Data on the Inside)
Customer Product Quantity
Robert
Pants 1
T-Shirts 1
Hats 15
11:37 2 Pants added to order
11:39 1 T-Shirt added to order
11:41 1 Pants removed from order
11:42 15 Hats added to order
11:42 Apply Discount Code
Time Value
Read
Apply in sequence to build state
Current Consumer State
Stream Table
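For contrast with fact events, here is a rough sketch of building state from the delta events on this slide. The tuple encoding of each delta and the function name `build_order` are hypothetical; the slide gives only the event descriptions and the resulting quantities.

```python
# Delta events from the slide as (time, kind, product, quantity); the encoding is assumed.
deltas = [
    ("11:37", "added", "Pants", 2),
    ("11:39", "added", "T-Shirts", 1),
    ("11:41", "removed", "Pants", 1),
    ("11:42", "added", "Hats", 15),
    ("11:42", "discount_applied", None, None),
]


def build_order(delta_events):
    """Apply every delta in sequence; skipping or reordering events changes the result."""
    quantities = {}
    discount_applied = False
    for _time, kind, product, quantity in delta_events:
        if kind == "added":
            quantities[product] = quantities.get(product, 0) + quantity
        elif kind == "removed":
            remaining = quantities.get(product, 0) - quantity
            if remaining > 0:
                quantities[product] = remaining
            else:
                quantities.pop(product, None)
        elif kind == "discount_applied":
            discount_applied = True
        else:
            raise ValueError(f"unknown delta kind: {kind}")
    return quantities, discount_applied


quantities, discount_applied = build_order(deltas)
```

Unlike the fact-event case, the consumer has to understand every kind of delta and how each one changes the order, which is the building logic that ends up duplicated when deltas are shared outside the owning service.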
Deltas Not Suitable for Building External State
(Data on the Outside)
11:37 2 Pants added to order
11:39 1 T-Shirt added to order
11:41 1 Pants removed from order
11:42 15 Hats added to order
11:42 Apply Discount Code
Time Value
Read
Duplicate and Tightly Coupled Order-Building Logic
Independent Consumer
Services
Apply in sequence to build state
Apply in sequence
to build state
Stream
Event-Streams for First-Class Data Access
Domain
Domain
Domain
Domain
Inventory
Orders
Shipments
Finance
Operational Streams to Power Data Lakes
Data Lake
Event Streams Power Realtime & Batch
Processing
All Data
(current and historic)
Streaming
Operational App
Streaming
Analytics
Kafka
Connector
Kafka
Connector
Batch-Computed
Analytics
Traditional Request/Response
Operational App
Millisecond
end-to-end latency
Both operational and
analytical workloads!
0 1 2 3 4 5 6 7
Thank You
LinkedIn: AdamBellemare
Twitter: @AdamBellemare
Off-Label Data Mesh: A Prescription for Healthier Data

  • 1. Off-Label Data Mesh A Prescription for Healthier Data Adam Bellemare Staff Technologist
  • 2. Who am I? ● Staff Technologist at Confluent, Technical Strategy Group ● From Toronto, Canada
  • 3. Data Mesh for Analytics OG Data Mesh For Analytical Data Problems (Batch) 3
  • 4. … but also for Operations! OG Data Mesh For Analytical Data Problems (Batch) Off-Label Analytics & Operations! (Batch and Streaming) 4
  • 5. A Typical Successful Company Operations Analytics
  • 6. A Typical Successful Company 1) Brittle ETLs 2) Ad-hoc Point-to-Point Couplings Hard to access data across the whole company!
  • 7. Ad-Hoc Data Communication Emerges Relational DB App
  • 8. Ad-Hoc Data Communication Emerges Relational DB Cron Job Data Lake App
  • 9. Ad-Hoc Data Communication Emerges Relational DB Cron Job Data Lake App App
  • 10. Ad-Hoc Data Communication Emerges Relational DB Cron Job Data Lake Read-Only Replica App App
  • 11. Ad-Hoc Data Communication Emerges Relational DB Cron Job Data Lake Read-Only Replica App App App
  • 12. Ad-Hoc Data Communication Emerges Relational DB Read-Only Replica Cron Job Ad-Hoc Copy Data Lake App App App App
  • 13. Ad-Hoc Data Communication Emerges Relational DB Read-Only Replica Cron Job Ad-Hoc Copy Ad-Hoc Copy of a Copy Data Lake App App App App App
  • 14. Need reliable access to core business data (e.g., entities and business objects)
  • 15. Services require access to common business data Orders Items Payments Most Services’ data requirements live somewhere in here
  • 16. We’ve done something similar before
  • 17. Microservices (Synchronous) APIs provide dedicated business functions Clients couple on REST APIs SLAs, change management, responsibilities Common platform (Kubernetes, Docker, Service Catalog, etc.)
  • 18. Database Application Services can end up looking like weird databases getOrder(order_id) getOrder(user_id) getAllOrders() getAllPendingOrders() getAllCanadianOrders()
  • 19. What if I just need the data?
  • 20. Reuse Existing Strategies APIs Serve Business Functions Data Products serve Data Microservices Data Mesh
  • 21. Reuse Existing Strategies APIs Serve Business Functions Data Products serve Data Couple on REST APIs Couple on the Data Contract Microservices Data Mesh
  • 22. Reuse Existing Strategies APIs Serve Business Functions Data Products serve Data Couple on REST APIs Couple on the Data Contract SLAs, change management, responsibilities SLAs, change management, responsibilities Microservices Data Mesh
  • 23. Reuse Existing Strategies APIs Serve Business Functions Data Products serve Data Couple on REST APIs Couple on the Data Contract SLAs, change management, responsibilities SLAs, change management, responsibilities Common platform and controls Common platform and controls Microservices Data Mesh
  • 24. The Four Principles of Data Mesh
  • 25. Principle 1: Domain Ownership Objective: Data is owned by those that truly understand it Pattern: Data belongs to the team who understands it best Centralized Data Ownership Decentralized Data Ownership Anti-pattern: Centralized team owns all data
  • 26. Principle 2: Data as a First-Class Product Objective: Make shared data discoverable, addressable, trustworthy, secure, so other teams can make good use of it. ● Data is treated as a true product, not a by-product.
  • 27. Data Product, a “Microservice for the Data World” ● Data product is a node on the data mesh, situated within a domain. ● Produces, and possibly consumes, high-quality data within the mesh. Diagram: a Domain containing the “Items About to Expire” Data Product, made of Code (creates, manipulates, serves, etc. that data), Infra (powers the data, e.g., storage, and the code, e.g., run, deploy, monitor), and Data (data and metadata, including history).
  • 28. Principle 3: Self-Serve Data Platform Provide discovery, access, and self-service compute and publish tools Objective: Make it easy to both create and use the data products
  • 29. Principle 4: Federated Governance Objective: Standards of Interoperability, Policies, and Support ● Global standards, data product support. “Paved Roads”. What is decided locally? What is decided globally? (implemented and enforced by platform) Domain Domain Domain Domain Self-Serve Data Platform Must balance between Decentralization and Centralization. No silver bullet!
  • 30. Data Fabric Data Mesh Access Virtually via Fabric Layer Access Data Directly Self-Serve & Governance Responsibility Model Decentralized Management Primarily Technical Well-Formed Data Sources Centralized Management Social & Technical Not Mutually Exclusive!
  • 31. Let’s take a look at a classic example
  • 32. 30-Min Job Source DB Data Engineer Data Lake App App Owner 1200 1230 1300 1330 … Extract, Transform, and Load (ETL) to Data Lake
  • 33. Data Ownership is Split Across Multiple Teams 30-Min Job Source DB Data Engineer App Owner Daily Job Daily Report The Boss Data Scientist Data Lake App
  • 34. How would we approach this pattern using Data Mesh?
  • 35. Think: Data on the Inside vs. Data on the Outside
  • 36. Data on the Inside Relational DB App
  • 37. Data on the Outside Relational DB Ad-Hoc Copy Cron Job Read-Only Replica Data Lake App App App App Ad-Hoc Copy of a Copy App
  • 38. Data on the Inside Source DB App Dev App Data on the Outside Data Lake App Cloud SaaS
  • 39. In Data Mesh, Ownership is Moved Left 30-Min Job Source DB Data Engineer App Owner Data Lake App App Owner
  • 40. Create a Data Product Source DB App App Owner 30-Min Job Data Lake Data Product App Owner Data Product Owner
  • 41. Negotiate the Social Changes Source DB App App Owner Data Product Data Product Owner Current Data Product User Prospective Data Product User
  • 42. Data Product Owner & Domain Schema Format (or API) Location Service Level (SLA) Restrictions Data Product Metadata
  • 43. Data Product Owner & Domain Adam (Engineering) Schema <Parquet Schema> Format (or API) Iceberg Table Location S3://bucketname/… Service Level (SLA) Tier 2 Restrictions Top Secret Data Product Metadata Data Lake
  • 44. Data Product Owner & Domain Adam (Engineering) Schema <Parquet Schema> Format (or API) Iceberg Table Location S3://bucketname/… Service Level (SLA) Tier 2 Restrictions Top Secret End-user functionality via self-service platform Data Lake Request Access Contact Owner
  • 45. Can Again Draw on Known Practices 1st Class Language and Framework Support Same Self-Service Portals & Tools Same Code Generators (APIs, Clients, Servers) Same Cataloging and Discovery Same Microservices Data Mesh
  • 46. Domain Domain Domain Domain Self-Serve Data Platform Self-Service Platform - Depends on your needs Protobuf
  • 47. How to Start Start Small, Keep it Simple • Spreadsheet of data products • Ticket system • Focus on the data
  • 48. Iterate: Review, Revise, and Improve Pave the Roads • Make it easy • Prototype and Trial Changes • Share Successes
  • 49. Data Fabric on top of Data Products? Domain Domain Domain Data Mesh Self-Serve Data Platform Data Fabric Access Layer (API, Governance, Permissions, Lineage)
  • 50. The Role of Event Streams
  • 51. Batch Generated Data Powers Slow Processes Reports Analytics Artificial Intelligence Data Lakes, Lakehouses, Warehouses, and Marts
  • 52. Growing Operational Data Requirements Act on Data in real-time (Tactical, Operational)
  • 53. Growing Operational Data Requirements Act on Data in real-time (Tactical, Operational) Decouple services
  • 54. Growing Operational Data Requirements Act on Data in real-time (Tactical, Operational) Decouple services Combine Separate Data Sources
  • 55. Growing Operational Data Requirements Act on Data in real-time (Tactical, Operational) Decouple services Combine Separate Data Sources Remodel Data
  • 56. Publish data to event streams 0 1 2 3 4 5 6 7 Kafka Topic Producer Application Publish
  • 57. Consumers read at their own rate 0 1 2 3 4 5 6 7 Kafka Topic Producer Application Publish Consumer Application Read
  • 58. Consumers read at their own rate 0 1 2 3 4 5 6 7 Kafka Topic Producer Application Publish Consumer Application Read
  • 59. Consumers can reread the data as needed 0 1 2 3 4 5 6 7 Kafka Topic Producer Application Publish Consumer Application Reread Topic as needed Consumer Application
  • 60. Combine Event Streams with Data Mesh Data Mesh For Analytical Data Problems Off-Label Uses Operations! (Batch and Streaming)
  • 61. Let’s go back to our earlier example
  • 62. Data Product Parquet Files to S3 (e.g., Apache Iceberg Table) Daily Job 30-Min Job Orders Service DB Daily Order Report Parquet Files Data Lake The Boss Port Too slow! What about real-time?
  • 63. Data Product Can use a Kafka Topic with a Schema Port Parquet Files 0 1 2 3 4 5 6 7 Port Kafka Topic
  • 64. Data Product Can use a Kafka Topic with a Schema Port Parquet Files 0 1 2 3 4 5 6 7 Port Kafka Topic order_id total items time 1 $19.99 [...] 186… 2 $12.99 [...] 187… 3 $24.99 [...] 188… 4 $38.99 [...] 189…
  • 65. Data Product Can use a Kafka Topic with a Schema Port Parquet Files 0 1 2 3 4 5 6 7 Port Kafka Topic order_id total items time 1 $19.99 [...] 186… 2 $12.99 [...] 187… 3 $24.99 [...] 188… 4 $38.99 [...] 189… { "order_id": Long, "total": Double, "items": List[Item], "time": Long }
  • 66. Data Product Each event represents an Order { "order_id": 1, "total": 19.99, "items": [...], "time": 186… } { "order_id": 2, "total": 12.99, "items": [...], "time": 187… } 0 1 2 3 4 5 6 7
  • 67. Data Product Giving consumers a choice for data access Port Parquet Files 0 1 2 3 4 5 6 7 Port Kafka Topic Prospective Data Product Users
  • 68. Select the option that works best for you Port Parquet Files 0 1 2 3 4 5 6 7 Port Kafka Topic Batch-Computed Analytics Streaming Operational App Streaming Analytics
  • 69. Build the Table from the Stream 0 1 2 3 4 5 6 7 Topic Parquet-Backed Table
  • 70. Modeling Event-Driven Data Products
  • 71. @AdamBellemare | developer.confluent.io Fact Events vs. Delta Events Deltas Facts
  • 72. Order Fact Event (a DTO) { "order_id": 1, "items": [ 521, 923 ], "total": 19.99, "timestamp": 186… } item_added_to_order { "order_id": 1, "item_id": 521, "quantity": 1 } discount_code_applied { "order_id": 1, "discount_code": "SAVE-20-2022", "discount_percent": "20" } Facts Model State – Deltas Model Change Delta Events
  • 73. Stream-Table Duality with Fact Events Key (order_id) Value 03:03 1 items: [100], total: $10.00 08:33 2 items: [200], total: $20.00 13:21 1 items: [100, 200], total: $30.00 13:22 3 items: [450, 451], total: $68.50 19:54 1 Null Time Value Upsert each Record in Sequence Current Materialized State Event Stream Table (Key) order_id
  • 74. Stream-Table Duality with Fact Events Key (order_id) Value 1 items: [100], total: $10.00 03:03 1 items: [100], total: $10.00 08:33 2 items: [200], total: $20.00 13:21 1 items: [100, 200], total: $30.00 13:22 3 items: [450, 451], total: $68.50 19:54 1 Null Time Value Read Upsert each Record in Sequence Current Materialized State Event Stream Table (Key) order_id
  • 75. Stream-Table Duality with Fact Events Key (order_id) Value 1 items: [100], total: $10.00 2 items: [200], total: $20.00 03:03 1 items: [100], total: $10.00 08:33 2 items: [200], total: $20.00 13:21 1 items: [100, 200], total: $30.00 13:22 3 items: [450, 451], total: $68.50 19:54 1 Null Time Value Read Upsert each Record in Sequence Current Materialized State Event Stream Table (Key) order_id
  • 76. Stream-Table Duality with Fact Events Key (order_id) Value 1 items: [100, 200], total: $30.00 2 items: [200], total: $20.00 03:03 1 items: [100], total: $10.00 08:33 2 items: [200], total: $20.00 13:21 1 items: [100, 200], total: $30.00 13:22 3 items: [450, 451], total: $68.50 19:54 1 Null Time Value Read Upsert each Record in Sequence Current Materialized State Event Stream Table (Key) order_id
  • 77. Stream-Table Duality with Fact Events Key (order_id) Value 1 items: [100, 200], total: $30.00 2 items: [200], total: $20.00 3 items: [450, 451], total: $68.50 03:03 1 items: [100], total: $10.00 08:33 2 items: [200], total: $20.00 13:21 1 items: [100, 200], total: $30.00 13:22 3 items: [450, 451], total: $68.50 19:54 1 Null Time Value Read Upsert each Record in Sequence Current Materialized State Event Stream Table (Key) order_id
  • 78. Stream-Table Duality with Fact Events Key (order_id) Value 1 Null (deleted) 2 items: [200], total: $20.00 3 items: [450, 451], total: $68.50 03:03 1 items: [100], total: $10.00 08:33 2 items: [200], total: $20.00 13:21 1 items: [100, 200], total: $30.00 13:22 3 items: [450, 451], total: $68.50 19:54 1 Null Time Value Read Upsert each Record in Sequence Current Materialized State Event Stream Table (Key) order_id
  • 79. Deltas Are Used in Event Sourcing (Data on the Inside) Customer Product Quantity Robert Pants 1 T-Shirts 1 Hats 15 11:37 2 Pants added to order 11:39 1 T-Shirt added to order 11:41 1 Pants removed from order 11:42 15 Hats added to order 11:42 Apply Discount Code Time Value Read Apply in sequence to build state Current Consumer State Stream Table
  • 80. Deltas Not Suitable for Building External State (Data on the Outside) 11:37 2 Pants added to order 11:39 1 T-Shirt added to order 11:41 1 Pants removed from order 11:42 15 Hats added to order 11:42 Apply Discount Code Time Value Read Duplicate and Tightly Coupled order Building Logic Independent Consumer Services Apply in sequence to build state Apply in sequence to build state Stream
  • 81. Event-Streams for First-Class Data Access Domain Domain Domain Domain
  • 83. Event Streams Power Realtime & Batch Processing All Data (current and historic) Streaming Operational App Streaming Analytics Kafka Connector Kafka Connector Batch-Computed Analytics Traditional R/R Operational App Millisecond end-to-end latency Both operational and analytical workloads! 0 1 2 3 4 5 6 7
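
The stream-table duality walkthrough in slides 73 to 78 (upsert each fact event in sequence to build the current materialized state) can be sketched in a few lines of Python. This is a minimal illustration, not the deck's implementation: the `EVENTS` list and the `materialize` function are hypothetical names, the event values are copied from the slides, and a plain in-memory list stands in for a Kafka topic. It also assumes a null value is a tombstone that removes the key from the table, whereas slide 78 displays the deleted row as "Null (deleted)".

```python
# Fact events from slides 73-78, in stream order: (time, order_id, value).
# A value of None is treated as a tombstone (assumption: the key is removed).
EVENTS = [
    ("03:03", 1, {"items": [100], "total": 10.00}),
    ("08:33", 2, {"items": [200], "total": 20.00}),
    ("13:21", 1, {"items": [100, 200], "total": 30.00}),
    ("13:22", 3, {"items": [450, 451], "total": 68.50}),
    ("19:54", 1, None),
]


def materialize(events):
    """Upsert each record in sequence and return the resulting table."""
    table = {}
    for _time, order_id, value in events:
        if value is None:
            # Tombstone: drop the key if it is present.
            table.pop(order_id, None)
        else:
            # Fact event: the latest value fully replaces the previous one.
            table[order_id] = value
    return table


if __name__ == "__main__":
    for order_id, value in sorted(materialize(EVENTS).items()):
        print(order_id, value)
```

Because each fact event carries the full state for its key, a consumer only needs the key and the latest value, with no order-building logic of its own. Calling `materialize(EVENTS[:3])` reproduces the intermediate table on slide 76.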