What are the elements of a modern distributed application architecture? What are the fundamentals and programming patterns of event processing? What’s a data mesh? Is it the best way to propagate state across distributed systems? Discover the answers to these questions and more from distributed systems expert Maheedhar Gunturu.
2. Outline
1. Modern Application Architectures
i. Microservices
ii. Serverless
2. Fundamentals of Event Processing
i. Basic Terminology
ii. Advancements in Event Processing Platforms
iii. Connectors
iv. Benefits
3. Patterns in Event Processing
4. State Propagation in Distributed Systems
i. Data mesh architectures
4. Application architectures are evolving!
[Timeline figure spanning the 1980s to the 2020s. Labels: EAI, EII, ETL, ESB, SOA (WS-*), CEP, iPaaS, Service Mesh, Data Mesh]
5. Modern Real-time Applications Architecture
[Architecture diagram. Labels: business logic, purpose-built databases, events, APIs, queues/messages]
6. Benefits of Serverless
■ No servers to manage
■ Automatically scales
■ Only pay for consumption
■ Event-driven architectures
■ Write less code
7. “If your application is cloud-native, large-scale, or distributed, and doesn’t include a messaging component, that’s probably a bug.”
Tim Bray, general-purpose internet-software geek
9. What is an Event?
■ An event is a change in state, or an update that carries entropy, with a timestamp associated with it. An event can either carry the state itself or act as an identifier for it.
Event-based architectures have three key components:
● event producers
● event routers
● event consumers
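The three components above can be sketched in a few lines of Python. This is a minimal in-memory illustration, not any particular platform's API: the `Event` and `EventRouter` names, the topic strings, and the payload fields are all assumptions made for the example. An event with a payload carries state; an event without one acts only as an identifier.

```python
import time
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass(frozen=True)
class Event:
    topic: str                      # routing key (hypothetical naming)
    key: str                        # identifier of the entity that changed
    payload: Optional[dict] = None  # None means an identifier-only event
    timestamp: float = field(default_factory=time.time)


class EventRouter:
    """Toy in-memory router: fans each event out to its topic's consumers."""

    def __init__(self) -> None:
        self._consumers = defaultdict(list)

    def subscribe(self, topic: str, consumer: Callable[[Event], None]) -> None:
        self._consumers[topic].append(consumer)

    def publish(self, event: Event) -> int:
        """Deliver the event and return how many consumers received it."""
        consumers = self._consumers.get(event.topic, [])
        for consumer in consumers:
            consumer(event)
        return len(consumers)


# Consumer: here simply a list that collects whatever it receives.
received = []
router = EventRouter()
router.subscribe("orders", received.append)

# Producer: one state-carrying event and one identifier-only event.
router.publish(Event("orders", "order-42", {"status": "PAID"}))
router.publish(Event("orders", "order-43"))

# An event on a topic nobody subscribed to reaches no consumer.
undelivered = router.publish(Event("shipments", "ship-7"))
```

The producer never references the consumer directly; both only know the router and a topic name, which is the decoupling the slide describes.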
10. What is an Event Stream?
■ An event stream is a continuous series of events that represents the changing behaviour of a system.
Event stream processing usually combines information from many streams and/or data sources in real time to derive actionable insights.
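As a rough illustration of combining streams, the sketch below merges two time-ordered streams and derives a running backlog of unshipped items. The `StreamEvent` type, the "order" and "shipment" kinds, and the sample numbers are hypothetical; a real platform would do this over unbounded streams rather than small lists.

```python
import heapq
from typing import Iterable, Iterator, NamedTuple, Tuple


class StreamEvent(NamedTuple):
    timestamp: int   # logical time, for example seconds
    kind: str        # "order" or "shipment" (hypothetical kinds)
    quantity: int


def merge_by_time(*streams: Iterable[StreamEvent]) -> Iterator[StreamEvent]:
    """Lazily merge streams that are each already sorted by timestamp."""
    return heapq.merge(*streams, key=lambda e: e.timestamp)


def backlog_over_time(events: Iterable[StreamEvent]) -> Iterator[Tuple[int, int]]:
    """Yield (timestamp, items ordered but not yet shipped) after each event."""
    backlog = 0
    for event in events:
        if event.kind == "order":
            backlog += event.quantity
        else:
            backlog -= event.quantity
        yield event.timestamp, backlog


orders = [StreamEvent(1, "order", 5), StreamEvent(4, "order", 3)]
shipments = [StreamEvent(2, "shipment", 2), StreamEvent(6, "shipment", 6)]

timeline = list(backlog_over_time(merge_by_time(orders, shipments)))
```

Neither input stream alone reveals the backlog; the insight only appears once the two are interleaved in time order.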
15. Global Replication
[Diagram labels: Data Center 1, Data Center 2, Data Center 3]
● Data is replicated asynchronously
● Kafka uses MirrorMaker to replicate the data
● Pulsar has built-in cross-data-center replication that is already used in production
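What "asynchronously" means here can be shown with a toy model. The sketch below is not how MirrorMaker or Pulsar replication is implemented; the `DataCenter` class and its methods are hypothetical. It only illustrates that a write is acknowledged locally first and reaches the other data centers later.

```python
from collections import deque


class DataCenter:
    """Toy cluster: acknowledges writes locally and ships them to peers later."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.log = []            # messages stored in this data center
        self.peers = []          # remote data centers to replicate to
        self._outbox = deque()   # messages waiting to be replicated

    def publish(self, message: str) -> int:
        """Append locally and return the offset without waiting for peers."""
        self.log.append(message)
        for peer in self.peers:
            self._outbox.append((peer, message))
        return len(self.log) - 1

    def replicate_pending(self) -> int:
        """Stand-in for a background replicator; returns messages shipped."""
        shipped = 0
        while self._outbox:
            peer, message = self._outbox.popleft()
            peer.log.append(message)
            shipped += 1
        return shipped


dc1, dc2, dc3 = DataCenter("dc1"), DataCenter("dc2"), DataCenter("dc3")
dc1.peers = [dc2, dc3]

offset = dc1.publish("order-42 created")
lag_before = (len(dc2.log), len(dc3.log))   # peers have not seen the write yet

shipped = dc1.replicate_pending()
lag_after = (len(dc2.log), len(dc3.log))    # peers have caught up
```

The window between `publish` returning and `replicate_pending` running is the replication lag that asynchronous replication trades for lower write latency.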
16. Clients
■ Client API
● Language bindings for Java, Go, Python, C++, Node.js and C#.
● Community client drivers for Rust, Erlang, Haskell, WebSockets, etc.
■ SQL
● Trino (Presto SQL)
■ Kafka on Pulsar
● Client wrapper for Kafka API compatibility
■ Apache Flink native integration
● Exactly-once guarantees.
■ Pulsar adaptor for Apache Spark Streaming
● Receives raw data from Pulsar.
17. Pulsar Connectors
■ Support for multiple protocols and multi-tenancy
■ Function Mesh for Kubernetes
■ Connector management
■ Preserves data schema
■ Integrated observability
■ Processing guarantees, load balancing
18. Benefits of Event Stream Processing Platforms
■ Centralized infrastructure
■ Intermediate layer for buffering
■ Absorbs the impedance mismatch between applications
■ Data export and import capabilities
■ Publish CDC (Change Data Capture) streams
■ Ability to recreate state
■ Streaming data transformations
■ Integrate with various applications
■ Event-based custom streaming workflows
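Two of the benefits above, CDC streams and the ability to recreate state, fit together: if every change is retained as an event, replaying the log rebuilds the state. A minimal sketch follows, assuming a hypothetical change format with `op`, `key`, and `value` fields.

```python
from typing import Optional


def apply_change(state: dict, change: dict) -> None:
    """Apply one CDC-style change record to the state in place."""
    if change["op"] == "delete":
        state.pop(change["key"], None)
    else:  # "insert" and "update" are both treated as an upsert
        state[change["key"]] = change["value"]


def rebuild_state(change_log: list, upto: Optional[int] = None) -> dict:
    """Replay the first `upto` changes (all of them if None) from empty state."""
    state = {}
    for change in change_log[:upto]:
        apply_change(state, change)
    return state


change_log = [
    {"op": "insert", "key": "sku-1", "value": {"stock": 10}},
    {"op": "insert", "key": "sku-2", "value": {"stock": 4}},
    {"op": "update", "key": "sku-1", "value": {"stock": 7}},
    {"op": "delete", "key": "sku-2"},
]

current = rebuild_state(change_log)            # state after every change
as_of_second = rebuild_state(change_log, 2)    # state after the first two changes
```

Because the log is the source of truth, a new consumer can derive its own view of the data, or an old view can be reconstructed as of any earlier point.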
22. What is Data Mesh?
[Diagram labels: Event Streaming, DDD, Microservices, Data Warehouses; Domain: Inventory, Orders, Shipments, ...; Data Product; Data Mesh]
23. Design Principles of a Data Mesh
1. Data is a product
i. Each organization owns its data end-to-end
ii. It is responsible for building, operating and serving its data
iii. It publishes CDC logs across bounded contexts
iv. It resolves any issues with its data and makes the data easy to discover
2. Federated data governance
i. Provides data lineage
ii. Validates data quality
iii. Data is shared securely and appropriate access controls are enforced
iv. Auditing and reporting are enabled
3. Platform should be self-service
i. Accessible via purpose-built infrastructure tooling
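Two of the governance duties listed above, validating data quality and providing lineage, can be sketched as follows. The `DataProduct` class, its fields, and the example products are hypothetical; this is one possible shape under those assumptions, not a standard data mesh API.

```python
from dataclasses import dataclass


@dataclass
class DataProduct:
    name: str
    owner: str                  # the team that owns the data end-to-end
    required_fields: dict       # field name -> expected Python type
    upstream: tuple = ()        # names of the products this one is derived from

    def validate(self, record: dict) -> list:
        """Return a list of data-quality problems (empty if the record is clean)."""
        problems = []
        for field_name, field_type in self.required_fields.items():
            if field_name not in record:
                problems.append(f"missing field: {field_name}")
            elif not isinstance(record[field_name], field_type):
                problems.append(f"wrong type for field: {field_name}")
        return problems


def lineage(name: str, products: dict) -> list:
    """All transitive upstream products of `name`, nearest first."""
    seen = []
    pending = list(products[name].upstream)
    while pending:
        current = pending.pop(0)
        if current not in seen:
            seen.append(current)
            pending.extend(products[current].upstream)
    return seen


products = {
    p.name: p
    for p in [
        DataProduct("orders", "orders-team", {"order_id": str, "total": float}),
        DataProduct("inventory", "inventory-team", {"sku": str, "stock": int}),
        DataProduct("shipments", "logistics-team", {"order_id": str},
                    upstream=("orders", "inventory")),
        DataProduct("delivery-dashboard", "analytics-team", {"order_id": str},
                    upstream=("shipments",)),
    ]
}

clean = products["orders"].validate({"order_id": "o-1", "total": 19.5})
dirty = products["orders"].validate({"order_id": 1})
dashboard_lineage = lineage("delivery-dashboard", products)
```

Keeping the owner, the expected fields, and the upstream links on the product itself is what lets a federated governance layer check quality and trace lineage without owning the data.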
25. Implement a Data Mesh: Cheat Sheet
■ Centralize data in motion
● Introduce a central event streaming platform
■ Nominate data owners
● Assign firm owners for all key datasets in the organization
● Make ownership information broadly accessible
■ Data on demand
● Events are either stored in messaging systems or can be republished by data products on demand
■ Handle schema change
● Owners publish schema information to the mesh
● Define a process for schema change approval
■ Secure event streams
● Access to event streams is permissioned by a central body
■ Connect from any database
● Sink connectors are made available for all supported database types to ease the provisioning of new output data ports in the mesh
■ Central user interface
● Discovery and registration of event streams
● Search data schemas
● Data lineage views
● Request access to event streams
● Preview event streams
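Several cheat-sheet items (registering streams with an owner, searching schemas, approving schema changes, and permissioning access through a central body) can be pictured as one small catalog. The sketch below is a hypothetical in-memory model; the `MeshCatalog` class, its method names, and the "additive changes only" compatibility rule are assumptions chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class StreamEntry:
    owner: str
    schema: dict                                   # field name -> type name
    readers: set = field(default_factory=set)      # principals with access
    pending: set = field(default_factory=set)      # open access requests


class MeshCatalog:
    def __init__(self, approvers) -> None:
        self.approvers = set(approvers)            # the central permissioning body
        self.streams = {}

    def register(self, name: str, owner: str, schema: dict) -> None:
        self.streams[name] = StreamEntry(owner, dict(schema))

    def search_schemas(self, field_name: str) -> list:
        """Names of all streams whose schema contains the given field."""
        return sorted(n for n, e in self.streams.items() if field_name in e.schema)

    def propose_schema(self, name: str, new_schema: dict) -> bool:
        """Accept only additive changes: existing fields keep name and type."""
        current = self.streams[name].schema
        compatible = all(new_schema.get(f) == t for f, t in current.items())
        if compatible:
            self.streams[name].schema = dict(new_schema)
        return compatible

    def request_access(self, name: str, reader: str) -> None:
        self.streams[name].pending.add(reader)

    def approve(self, name: str, reader: str, approver: str) -> bool:
        entry = self.streams[name]
        if approver not in self.approvers or reader not in entry.pending:
            return False
        entry.pending.discard(reader)
        entry.readers.add(reader)
        return True


catalog = MeshCatalog(approvers={"governance-team"})
catalog.register("orders.v1", "orders-team", {"order_id": "string", "total": "double"})
catalog.register("shipments.v1", "logistics-team", {"order_id": "string", "carrier": "string"})

found = catalog.search_schemas("order_id")
added = catalog.propose_schema(
    "orders.v1", {"order_id": "string", "total": "double", "currency": "string"})
broken = catalog.propose_schema("orders.v1", {"order_id": "string"})

catalog.request_access("orders.v1", "analytics")
denied = catalog.approve("orders.v1", "analytics", approver="orders-team")
granted = catalog.approve("orders.v1", "analytics", approver="governance-team")
```

Other compatibility policies are equally plausible; the point is that schema changes and access grants go through an explicit, checkable step rather than being negotiated ad hoc between teams.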