SlideShare a Scribd company logo
Engineering a Robust and
High performance EDA
with Redpanda
Christina Lin
2
About me
2
Christina Lin
Developer Advocate, Redpanda
SOA
WebSphere
DB2
Sybase
Oracle
MQ
J2EE
EJB
DevOps
Microservice
EIP
K8s
Agile Integration
Data
Mesh
Active MQ
Live data stack
Resilience - handle failures and scale gracefully
Elasticity – infrastructure that can scale dynamically
Decentralization - data ownership, empowering individual teams
Performance - low latency and high throughput
Autonomy – self service, define quality, and access
Nimble - efficient data movement
Distributed -distributed data processing for cloud native
Agility – quickly respond to change in data
3
Robust
Event Driven Architecture
4
Services
Microservices
Databases
IoT Devices
Applications
System B
Team A
Department C
Team D
Group E
Services
Databases
IoT Devices
Applications
System B
Team A
Department C
Team D
Group E
Microservices
Producer
Consumer
Event Driven Architecture
5
Orders
Health records
Restock Signal
CDC Event
Streaming
Table/
Materialize
view
Data Store
Payroll
Payment
Shipment Signal
Inventory
The Contracts
6
Microservices
Microservices
Databases
/ CDC
Microservices
Data
Lake/Data
warehouse
Microservices
Schema Registry
7
Producer
Data structure encoding
- Avro, Protobuf and JSON
Data structure
- {name:type}
Serialize
Download the
schema (version)
Consumer
Schema Registry
Deserialize
Value
(Binary)
Schema
ID
Key
(Binary)
Value
(Binary)
Schema Registry
8
Server-side validation
Value
(Binary)
Sche
ma ID
Key
(Binar
y)
Value
(Binary)
Schema Registry
Check if schema id is
valid
Schema Registry
Producer
• Backward
• Forward
• Full
compatibility
• None
Schema Registry
Version 1
Version 2
Version 3
Schema Registry in Redpanda
9
Service Registry
Service Registry
Restful Endpoint
Restful Endpoint
_schemas
_schemas
Schema Registry
• Assign a default value to the fields that you might remove in the future
• Do not rename an existing field—add an alias instead
• When using schema evolution, always provide a default value
• Never delete a required field
When not to use Schema registry
• You’re certain the schema won’t change in the future
• If hardware resources are limited and low latency is critical, it may impact
performance (e.g., for IoT)
• You want to serialize the data with an unsupported serialization scheme
10
Event validation & DLQ
11
DLQ
Consumer
Correction/
Remedy
Validator
DLQ
Correction/
Remedy
Validator
DLQ
Correction/
Remedy
In broker validation – how it works
12
Replicate
across clusters
customer
partition 1
Load to
cache
Validate
against
schema
Transform
Write back to
disk with DMA
Customer validated
partition 1
Example repo: https://github.com/redpanda-data/redpanda-labs/tree/main/data-transforms/to_avro
In broker validation & transformation
• Firsthand processing, quick filtering
• Simple rerouting determine on ingested data
• Masking, schema validation
• Stateless, functional processing
When not to use in broker transformation?
• When it requires external data dependencies
• Windowing, complex processing, with multiple streams of input
• When it requires to keep the state of the processes
13
14
High Performance
15
Turning the knobs
Producer
Producer
Producer
Producer
Producer
Producer
Consumer
Consumer
Consumer
Consumer
Consumer
Consumer
Consumer
16
The Broker
Infrastructure Storage – XFS,NVMe
Network bandwidth
Memory
CPU
Location (Multi-AZ)
OS
Disk I/O
read_iops/bandwidth
write_iops/bandwidth
Broker
# Brokers
# Replicas
# Partitions
Log segment size
17
Partitions
Partitions
Producer
Consumer
Consumer
Consumer
Group A
• Round Robin
• Hashing Key Partition
• Custom Partitioner
Overhead
• File handler
• Follower, heartbeat
• Large Metadata
quadratic (N2)
Idempotency
• Order guarantee in partition only
Higher latency
• Producer batch
Consumer rebalance
• RangeAssignor (SW)
• RoundRobinAssignor(SW)
• StickyAssignor(SW)
• CooperativeStickyAssignor
• Static (No Rebalance)
18
Producer
Producer
fsync
Acknowledgment
from the leader Ack=all
Ack=1
Majority of replicas
acknowledge
write_caching_default=true
flush_bytes, flush_ms
Ack=0
Doesn’t wait for
acknowledgments
and doesn’t retry
sending messages
Producer
batch.size
linger.ms
compression
19
Consumer
Consumer
fetch.min.bytes
max.poll.records
fetch.max.bytes
fetch.max.wait.ms
High Throughput
• There is no on size fits all, there are many factor when it comes to
performances.
• More partition will allow more parallel processing, hence higher throughput,
but it comes with cost.(Avoid over-partitioning or under-partitioning.)
• Experiment with acks settings, Enable write caching,
• Explore how the producer batches messages. Increasing the value
of batch.size and linger.ms can increase throughput by making the
producer add more messages into one batch
• Explore consumer fetch frequency and message size.
• Start with a baseline configuration and gradually make changes, measuring
the impact of each change on performance.
20
21
Robust for Stateful Processes
Beyond just streams of events
22
Databases
/ CDC
Microservices
Databases
/ CDC
Databases
/ CDC
Processor
Beyond just streams of events
23
Databases
/ CDC
Microservices
Databases
/ CDC
Databases
/ CDC
Processor
Limited disk space
24
Event Sourcing
S3 Rehydrate
State Snapshot
25
Microservices
Databases
/ CDC
Databases
/ CDC
Databases
/ CDC
Processor
Snapshot
Summary
■ Use schema to insure data shape for consumer
■ When designing, think about compatibility
■ Validation to ensures consumer always get the correct format.
■ In broker transform are great for simple, functions, stateless processes
■ Provision appropriate partition to your topics
■ Depends on your use case, for producer, always set the right Ack, and buffer
■ For stateful streams processing, use snapshot for fault tolerance
26
On demand example
27
Batch
Every 10 mins
CSV
CSV
Batch
pipeline
Batch Processing
Batch
pipeline
Right away!
Stream
CSV
Keep in touch!
Christina Lin
Developer Advocate
Redpanda
Christina@redpanda.com

More Related Content

Similar to Event-Driven Architecture Masterclass: Engineering a Robust, High-performance EDA

ADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
ADV Slides: When and How Data Lakes Fit into a Modern Data ArchitectureADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
ADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
DATAVERSITY
 
RedHat MRG and Infinispan for Large Scale Integration
RedHat MRG and Infinispan for Large Scale IntegrationRedHat MRG and Infinispan for Large Scale Integration
RedHat MRG and Infinispan for Large Scale Integration
prajods
 
ADV Slides: The Evolution of the Data Platform and What It Means to Enterpris...
ADV Slides: The Evolution of the Data Platform and What It Means to Enterpris...ADV Slides: The Evolution of the Data Platform and What It Means to Enterpris...
ADV Slides: The Evolution of the Data Platform and What It Means to Enterpris...
DATAVERSITY
 
Microsoft SQL Server - Reduce Your Cost and Improve your Agility Presentation
Microsoft SQL Server - Reduce Your Cost and Improve your Agility PresentationMicrosoft SQL Server - Reduce Your Cost and Improve your Agility Presentation
Microsoft SQL Server - Reduce Your Cost and Improve your Agility PresentationMicrosoft Private Cloud
 
Marketing Automation at Scale: How Marketo Solved Key Data Management Challen...
Marketing Automation at Scale: How Marketo Solved Key Data Management Challen...Marketing Automation at Scale: How Marketo Solved Key Data Management Challen...
Marketing Automation at Scale: How Marketo Solved Key Data Management Challen...
Continuent
 
Designing a Feedback Loop for Event-Driven Data Sharing With Teresa Wang | Cu...
Designing a Feedback Loop for Event-Driven Data Sharing With Teresa Wang | Cu...Designing a Feedback Loop for Event-Driven Data Sharing With Teresa Wang | Cu...
Designing a Feedback Loop for Event-Driven Data Sharing With Teresa Wang | Cu...
HostedbyConfluent
 
Otimizações de Projetos de Big Data, Dw e AI no Microsoft Azure
Otimizações de Projetos de Big Data, Dw e AI no Microsoft AzureOtimizações de Projetos de Big Data, Dw e AI no Microsoft Azure
Otimizações de Projetos de Big Data, Dw e AI no Microsoft Azure
Luan Moreno Medeiros Maciel
 
Best Practices in the Cloud for Data Management (US)
Best Practices in the Cloud for Data Management (US)Best Practices in the Cloud for Data Management (US)
Best Practices in the Cloud for Data Management (US)
Denodo
 
Big Data on Cloud Native Platform
Big Data on Cloud Native PlatformBig Data on Cloud Native Platform
Big Data on Cloud Native Platform
Sunil Govindan
 
Big Data on Cloud Native Platform
Big Data on Cloud Native PlatformBig Data on Cloud Native Platform
Big Data on Cloud Native Platform
Sunil Govindan
 
Building Analytic Apps for SaaS: “Analytics as a Service”
Building Analytic Apps for SaaS: “Analytics as a Service”Building Analytic Apps for SaaS: “Analytics as a Service”
Building Analytic Apps for SaaS: “Analytics as a Service”
Amazon Web Services
 
Strategies For Migrating From SQL to NoSQL — The Apache Kafka Way
Strategies For Migrating From SQL to NoSQL — The Apache Kafka WayStrategies For Migrating From SQL to NoSQL — The Apache Kafka Way
Strategies For Migrating From SQL to NoSQL — The Apache Kafka Way
ScyllaDB
 
Which Change Data Capture Strategy is Right for You?
Which Change Data Capture Strategy is Right for You?Which Change Data Capture Strategy is Right for You?
Which Change Data Capture Strategy is Right for You?
Precisely
 
DM Radio Webinar: Adopting a Streaming-Enabled Architecture
DM Radio Webinar: Adopting a Streaming-Enabled ArchitectureDM Radio Webinar: Adopting a Streaming-Enabled Architecture
DM Radio Webinar: Adopting a Streaming-Enabled Architecture
DATAVERSITY
 
ADV Slides: Platforming Your Data for Success – Databases, Hadoop, Managed Ha...
ADV Slides: Platforming Your Data for Success – Databases, Hadoop, Managed Ha...ADV Slides: Platforming Your Data for Success – Databases, Hadoop, Managed Ha...
ADV Slides: Platforming Your Data for Success – Databases, Hadoop, Managed Ha...
DATAVERSITY
 
Database@Home : Data Driven Apps - Data-driven Microservices Architecture wit...
Database@Home : Data Driven Apps - Data-driven Microservices Architecture wit...Database@Home : Data Driven Apps - Data-driven Microservices Architecture wit...
Database@Home : Data Driven Apps - Data-driven Microservices Architecture wit...
Tammy Bednar
 
Data Warehouse or Data Lake, Which Do I Choose?
Data Warehouse or Data Lake, Which Do I Choose?Data Warehouse or Data Lake, Which Do I Choose?
Data Warehouse or Data Lake, Which Do I Choose?
DATAVERSITY
 
Five ways database modernization simplifies your data life
Five ways database modernization simplifies your data lifeFive ways database modernization simplifies your data life
Five ways database modernization simplifies your data life
SingleStore
 
Creating a Modern Data Architecture for Digital Transformation
Creating a Modern Data Architecture for Digital TransformationCreating a Modern Data Architecture for Digital Transformation
Creating a Modern Data Architecture for Digital Transformation
MongoDB
 
Big Data Berlin v8.0 Stream Processing with Apache Apex
Big Data Berlin v8.0 Stream Processing with Apache Apex Big Data Berlin v8.0 Stream Processing with Apache Apex
Big Data Berlin v8.0 Stream Processing with Apache Apex
Apache Apex
 

Similar to Event-Driven Architecture Masterclass: Engineering a Robust, High-performance EDA (20)

ADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
ADV Slides: When and How Data Lakes Fit into a Modern Data ArchitectureADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
ADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
 
RedHat MRG and Infinispan for Large Scale Integration
RedHat MRG and Infinispan for Large Scale IntegrationRedHat MRG and Infinispan for Large Scale Integration
RedHat MRG and Infinispan for Large Scale Integration
 
ADV Slides: The Evolution of the Data Platform and What It Means to Enterpris...
ADV Slides: The Evolution of the Data Platform and What It Means to Enterpris...ADV Slides: The Evolution of the Data Platform and What It Means to Enterpris...
ADV Slides: The Evolution of the Data Platform and What It Means to Enterpris...
 
Microsoft SQL Server - Reduce Your Cost and Improve your Agility Presentation
Microsoft SQL Server - Reduce Your Cost and Improve your Agility PresentationMicrosoft SQL Server - Reduce Your Cost and Improve your Agility Presentation
Microsoft SQL Server - Reduce Your Cost and Improve your Agility Presentation
 
Marketing Automation at Scale: How Marketo Solved Key Data Management Challen...
Marketing Automation at Scale: How Marketo Solved Key Data Management Challen...Marketing Automation at Scale: How Marketo Solved Key Data Management Challen...
Marketing Automation at Scale: How Marketo Solved Key Data Management Challen...
 
Designing a Feedback Loop for Event-Driven Data Sharing With Teresa Wang | Cu...
Designing a Feedback Loop for Event-Driven Data Sharing With Teresa Wang | Cu...Designing a Feedback Loop for Event-Driven Data Sharing With Teresa Wang | Cu...
Designing a Feedback Loop for Event-Driven Data Sharing With Teresa Wang | Cu...
 
Otimizações de Projetos de Big Data, Dw e AI no Microsoft Azure
Otimizações de Projetos de Big Data, Dw e AI no Microsoft AzureOtimizações de Projetos de Big Data, Dw e AI no Microsoft Azure
Otimizações de Projetos de Big Data, Dw e AI no Microsoft Azure
 
Best Practices in the Cloud for Data Management (US)
Best Practices in the Cloud for Data Management (US)Best Practices in the Cloud for Data Management (US)
Best Practices in the Cloud for Data Management (US)
 
Big Data on Cloud Native Platform
Big Data on Cloud Native PlatformBig Data on Cloud Native Platform
Big Data on Cloud Native Platform
 
Big Data on Cloud Native Platform
Big Data on Cloud Native PlatformBig Data on Cloud Native Platform
Big Data on Cloud Native Platform
 
Building Analytic Apps for SaaS: “Analytics as a Service”
Building Analytic Apps for SaaS: “Analytics as a Service”Building Analytic Apps for SaaS: “Analytics as a Service”
Building Analytic Apps for SaaS: “Analytics as a Service”
 
Strategies For Migrating From SQL to NoSQL — The Apache Kafka Way
Strategies For Migrating From SQL to NoSQL — The Apache Kafka WayStrategies For Migrating From SQL to NoSQL — The Apache Kafka Way
Strategies For Migrating From SQL to NoSQL — The Apache Kafka Way
 
Which Change Data Capture Strategy is Right for You?
Which Change Data Capture Strategy is Right for You?Which Change Data Capture Strategy is Right for You?
Which Change Data Capture Strategy is Right for You?
 
DM Radio Webinar: Adopting a Streaming-Enabled Architecture
DM Radio Webinar: Adopting a Streaming-Enabled ArchitectureDM Radio Webinar: Adopting a Streaming-Enabled Architecture
DM Radio Webinar: Adopting a Streaming-Enabled Architecture
 
ADV Slides: Platforming Your Data for Success – Databases, Hadoop, Managed Ha...
ADV Slides: Platforming Your Data for Success – Databases, Hadoop, Managed Ha...ADV Slides: Platforming Your Data for Success – Databases, Hadoop, Managed Ha...
ADV Slides: Platforming Your Data for Success – Databases, Hadoop, Managed Ha...
 
Database@Home : Data Driven Apps - Data-driven Microservices Architecture wit...
Database@Home : Data Driven Apps - Data-driven Microservices Architecture wit...Database@Home : Data Driven Apps - Data-driven Microservices Architecture wit...
Database@Home : Data Driven Apps - Data-driven Microservices Architecture wit...
 
Data Warehouse or Data Lake, Which Do I Choose?
Data Warehouse or Data Lake, Which Do I Choose?Data Warehouse or Data Lake, Which Do I Choose?
Data Warehouse or Data Lake, Which Do I Choose?
 
Five ways database modernization simplifies your data life
Five ways database modernization simplifies your data lifeFive ways database modernization simplifies your data life
Five ways database modernization simplifies your data life
 
Creating a Modern Data Architecture for Digital Transformation
Creating a Modern Data Architecture for Digital TransformationCreating a Modern Data Architecture for Digital Transformation
Creating a Modern Data Architecture for Digital Transformation
 
Big Data Berlin v8.0 Stream Processing with Apache Apex
Big Data Berlin v8.0 Stream Processing with Apache Apex Big Data Berlin v8.0 Stream Processing with Apache Apex
Big Data Berlin v8.0 Stream Processing with Apache Apex
 

More from ScyllaDB

Optimizing NoSQL Performance Through Observability
Optimizing NoSQL Performance Through ObservabilityOptimizing NoSQL Performance Through Observability
Optimizing NoSQL Performance Through Observability
ScyllaDB
 
Event-Driven Architecture Masterclass: Challenges in Stream Processing
Event-Driven Architecture Masterclass: Challenges in Stream ProcessingEvent-Driven Architecture Masterclass: Challenges in Stream Processing
Event-Driven Architecture Masterclass: Challenges in Stream Processing
ScyllaDB
 
Event-Driven Architecture Masterclass: Integrating Distributed Data Stores Ac...
Event-Driven Architecture Masterclass: Integrating Distributed Data Stores Ac...Event-Driven Architecture Masterclass: Integrating Distributed Data Stores Ac...
Event-Driven Architecture Masterclass: Integrating Distributed Data Stores Ac...
ScyllaDB
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
ScyllaDB
 
What Developers Need to Unlearn for High Performance NoSQL
What Developers Need to Unlearn for High Performance NoSQLWhat Developers Need to Unlearn for High Performance NoSQL
What Developers Need to Unlearn for High Performance NoSQL
ScyllaDB
 
Low Latency at Extreme Scale: Proven Practices & Pitfalls
Low Latency at Extreme Scale: Proven Practices & PitfallsLow Latency at Extreme Scale: Proven Practices & Pitfalls
Low Latency at Extreme Scale: Proven Practices & Pitfalls
ScyllaDB
 
Dissecting Real-World Database Performance Dilemmas
Dissecting Real-World Database Performance DilemmasDissecting Real-World Database Performance Dilemmas
Dissecting Real-World Database Performance Dilemmas
ScyllaDB
 
Beyond Linear Scaling: A New Path for Performance with ScyllaDB
Beyond Linear Scaling: A New Path for Performance with ScyllaDBBeyond Linear Scaling: A New Path for Performance with ScyllaDB
Beyond Linear Scaling: A New Path for Performance with ScyllaDB
ScyllaDB
 
Dissecting Real-World Database Performance Dilemmas
Dissecting Real-World Database Performance DilemmasDissecting Real-World Database Performance Dilemmas
Dissecting Real-World Database Performance Dilemmas
ScyllaDB
 
Database Performance at Scale Masterclass: Workload Characteristics by Felipe...
Database Performance at Scale Masterclass: Workload Characteristics by Felipe...Database Performance at Scale Masterclass: Workload Characteristics by Felipe...
Database Performance at Scale Masterclass: Workload Characteristics by Felipe...
ScyllaDB
 
Database Performance at Scale Masterclass: Database Internals by Pavel Emelya...
Database Performance at Scale Masterclass: Database Internals by Pavel Emelya...Database Performance at Scale Masterclass: Database Internals by Pavel Emelya...
Database Performance at Scale Masterclass: Database Internals by Pavel Emelya...
ScyllaDB
 
Database Performance at Scale Masterclass: Driver Strategies by Piotr Sarna
Database Performance at Scale Masterclass: Driver Strategies by Piotr SarnaDatabase Performance at Scale Masterclass: Driver Strategies by Piotr Sarna
Database Performance at Scale Masterclass: Driver Strategies by Piotr Sarna
ScyllaDB
 
Replacing Your Cache with ScyllaDB
Replacing Your Cache with ScyllaDBReplacing Your Cache with ScyllaDB
Replacing Your Cache with ScyllaDB
ScyllaDB
 
Powering Real-Time Apps with ScyllaDB_ Low Latency & Linear Scalability
Powering Real-Time Apps with ScyllaDB_ Low Latency & Linear ScalabilityPowering Real-Time Apps with ScyllaDB_ Low Latency & Linear Scalability
Powering Real-Time Apps with ScyllaDB_ Low Latency & Linear Scalability
ScyllaDB
 
7 Reasons Not to Put an External Cache in Front of Your Database.pptx
7 Reasons Not to Put an External Cache in Front of Your Database.pptx7 Reasons Not to Put an External Cache in Front of Your Database.pptx
7 Reasons Not to Put an External Cache in Front of Your Database.pptx
ScyllaDB
 
Getting the most out of ScyllaDB
Getting the most out of ScyllaDBGetting the most out of ScyllaDB
Getting the most out of ScyllaDB
ScyllaDB
 
NoSQL Database Migration Masterclass - Session 2: The Anatomy of a Migration
NoSQL Database Migration Masterclass - Session 2: The Anatomy of a MigrationNoSQL Database Migration Masterclass - Session 2: The Anatomy of a Migration
NoSQL Database Migration Masterclass - Session 2: The Anatomy of a Migration
ScyllaDB
 
NoSQL Database Migration Masterclass - Session 3: Migration Logistics
NoSQL Database Migration Masterclass - Session 3: Migration LogisticsNoSQL Database Migration Masterclass - Session 3: Migration Logistics
NoSQL Database Migration Masterclass - Session 3: Migration Logistics
ScyllaDB
 
NoSQL Data Migration Masterclass - Session 1 Migration Strategies and Challenges
NoSQL Data Migration Masterclass - Session 1 Migration Strategies and ChallengesNoSQL Data Migration Masterclass - Session 1 Migration Strategies and Challenges
NoSQL Data Migration Masterclass - Session 1 Migration Strategies and Challenges
ScyllaDB
 
ScyllaDB Virtual Workshop
ScyllaDB Virtual WorkshopScyllaDB Virtual Workshop
ScyllaDB Virtual Workshop
ScyllaDB
 

More from ScyllaDB (20)

Optimizing NoSQL Performance Through Observability
Optimizing NoSQL Performance Through ObservabilityOptimizing NoSQL Performance Through Observability
Optimizing NoSQL Performance Through Observability
 
Event-Driven Architecture Masterclass: Challenges in Stream Processing
Event-Driven Architecture Masterclass: Challenges in Stream ProcessingEvent-Driven Architecture Masterclass: Challenges in Stream Processing
Event-Driven Architecture Masterclass: Challenges in Stream Processing
 
Event-Driven Architecture Masterclass: Integrating Distributed Data Stores Ac...
Event-Driven Architecture Masterclass: Integrating Distributed Data Stores Ac...Event-Driven Architecture Masterclass: Integrating Distributed Data Stores Ac...
Event-Driven Architecture Masterclass: Integrating Distributed Data Stores Ac...
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
What Developers Need to Unlearn for High Performance NoSQL
What Developers Need to Unlearn for High Performance NoSQLWhat Developers Need to Unlearn for High Performance NoSQL
What Developers Need to Unlearn for High Performance NoSQL
 
Low Latency at Extreme Scale: Proven Practices & Pitfalls
Low Latency at Extreme Scale: Proven Practices & PitfallsLow Latency at Extreme Scale: Proven Practices & Pitfalls
Low Latency at Extreme Scale: Proven Practices & Pitfalls
 
Dissecting Real-World Database Performance Dilemmas
Dissecting Real-World Database Performance DilemmasDissecting Real-World Database Performance Dilemmas
Dissecting Real-World Database Performance Dilemmas
 
Beyond Linear Scaling: A New Path for Performance with ScyllaDB
Beyond Linear Scaling: A New Path for Performance with ScyllaDBBeyond Linear Scaling: A New Path for Performance with ScyllaDB
Beyond Linear Scaling: A New Path for Performance with ScyllaDB
 
Dissecting Real-World Database Performance Dilemmas
Dissecting Real-World Database Performance DilemmasDissecting Real-World Database Performance Dilemmas
Dissecting Real-World Database Performance Dilemmas
 
Database Performance at Scale Masterclass: Workload Characteristics by Felipe...
Database Performance at Scale Masterclass: Workload Characteristics by Felipe...Database Performance at Scale Masterclass: Workload Characteristics by Felipe...
Database Performance at Scale Masterclass: Workload Characteristics by Felipe...
 
Database Performance at Scale Masterclass: Database Internals by Pavel Emelya...
Database Performance at Scale Masterclass: Database Internals by Pavel Emelya...Database Performance at Scale Masterclass: Database Internals by Pavel Emelya...
Database Performance at Scale Masterclass: Database Internals by Pavel Emelya...
 
Database Performance at Scale Masterclass: Driver Strategies by Piotr Sarna
Database Performance at Scale Masterclass: Driver Strategies by Piotr SarnaDatabase Performance at Scale Masterclass: Driver Strategies by Piotr Sarna
Database Performance at Scale Masterclass: Driver Strategies by Piotr Sarna
 
Replacing Your Cache with ScyllaDB
Replacing Your Cache with ScyllaDBReplacing Your Cache with ScyllaDB
Replacing Your Cache with ScyllaDB
 
Powering Real-Time Apps with ScyllaDB_ Low Latency & Linear Scalability
Powering Real-Time Apps with ScyllaDB_ Low Latency & Linear ScalabilityPowering Real-Time Apps with ScyllaDB_ Low Latency & Linear Scalability
Powering Real-Time Apps with ScyllaDB_ Low Latency & Linear Scalability
 
7 Reasons Not to Put an External Cache in Front of Your Database.pptx
7 Reasons Not to Put an External Cache in Front of Your Database.pptx7 Reasons Not to Put an External Cache in Front of Your Database.pptx
7 Reasons Not to Put an External Cache in Front of Your Database.pptx
 
Getting the most out of ScyllaDB
Getting the most out of ScyllaDBGetting the most out of ScyllaDB
Getting the most out of ScyllaDB
 
NoSQL Database Migration Masterclass - Session 2: The Anatomy of a Migration
NoSQL Database Migration Masterclass - Session 2: The Anatomy of a MigrationNoSQL Database Migration Masterclass - Session 2: The Anatomy of a Migration
NoSQL Database Migration Masterclass - Session 2: The Anatomy of a Migration
 
NoSQL Database Migration Masterclass - Session 3: Migration Logistics
NoSQL Database Migration Masterclass - Session 3: Migration LogisticsNoSQL Database Migration Masterclass - Session 3: Migration Logistics
NoSQL Database Migration Masterclass - Session 3: Migration Logistics
 
NoSQL Data Migration Masterclass - Session 1 Migration Strategies and Challenges
NoSQL Data Migration Masterclass - Session 1 Migration Strategies and ChallengesNoSQL Data Migration Masterclass - Session 1 Migration Strategies and Challenges
NoSQL Data Migration Masterclass - Session 1 Migration Strategies and Challenges
 
ScyllaDB Virtual Workshop
ScyllaDB Virtual WorkshopScyllaDB Virtual Workshop
ScyllaDB Virtual Workshop
 

Recently uploaded

Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...
Product School
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
Laura Byrne
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance
 
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Ramesh Iyer
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
UiPathCommunity
 
How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...
Product School
 
ODC, Data Fabric and Architecture User Group
ODC, Data Fabric and Architecture User GroupODC, Data Fabric and Architecture User Group
ODC, Data Fabric and Architecture User Group
CatarinaPereira64715
 
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Thierry Lestable
 
JMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and GrafanaJMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and Grafana
RTTS
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
DianaGray10
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
BookNet Canada
 
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
Product School
 
Connector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a buttonConnector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a button
DianaGray10
 
Assuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyesAssuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyes
ThousandEyes
 
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Product School
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
Jemma Hussein Allen
 
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
Product School
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
Guy Korland
 
Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*
Frank van Harmelen
 
PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)
Ralf Eggert
 

Recently uploaded (20)

Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
 
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
 
How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...
 
ODC, Data Fabric and Architecture User Group
ODC, Data Fabric and Architecture User GroupODC, Data Fabric and Architecture User Group
ODC, Data Fabric and Architecture User Group
 
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
 
JMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and GrafanaJMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and Grafana
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
 
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
 
Connector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a buttonConnector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a button
 
Assuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyesAssuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyes
 
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
 
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
 
Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*
 
PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)
 

Event-Driven Architecture Masterclass: Engineering a Robust, High-performance EDA

  • 1. Engineering a Robust and High performance EDA with Redpanda Christina Lin
  • 2. 2 About me 2 Christina Lin Developer Advocate, Redpanda SOA WebSphere DB2 Sybase Oracle MQ J2EE EJB DevOps Microservice EIP K8s Agile Integration Data Mesh Active MQ Live data stack Resilience - handle failures and scale gracefully Elasticity – infrastructure that can scale dynamically Decentralization - data ownership, empowering individual teams Performance - low latency and high throughput Autonomy – self service, define quality, and access Nimble - efficient data movement Distributed -distributed data processing for cloud native Agility – quickly respond to change in data
  • 4. Event Driven Architecture 4 Services Microservices Databases IoT Devices Applications System B Team A Department C Team D Group E Services Databases IoT Devices Applications System B Team A Department C Team D Group E Microservices Producer Consumer
  • 5. Event Driven Architecture 5 Orders Health records Restock Signal CDC Event Streaming Table/ Materialize view Data Store Payroll Payment Shipment Signal Inventory
  • 7. Schema Registry 7 Producer Data structure encoding - Avro, Protobuf and JSON Data structure - {name:type} Serialize Download the schema (version) Consumer Schema Registry Deserialize Value (Binary) Schema ID Key (Binary) Value (Binary)
  • 8. Schema Registry 8 Server-side validation Value (Binary) Sche ma ID Key (Binar y) Value (Binary) Schema Registry Check if schema id is valid Schema Registry Producer • Backward • Forward • Full compatibility • None Schema Registry Version 1 Version 2 Version 3
  • 9. Schema Registry in Redpanda 9 Service Registry Service Registry Restful Endpoint Restful Endpoint _schemas _schemas
  • 10. Schema Registry • Assign a default value to the fields that you might remove in the future • Do not rename an existing field—add an alias instead • When using schema evolution, always provide a default value • Never delete a required field When not to use Schema registry • You’re certain the schema won’t change in the future • If hardware resources are limited and low latency is critical, it may impact performance (e.g., for IoT) • You want to serialize the data with an unsupported serialization scheme 10
  • 11. Event validation & DLQ 11 DLQ Consumer Correction/ Remedy Validator DLQ Correction/ Remedy Validator DLQ Correction/ Remedy
  • 12. In broker validation – how it works 12 Replicate across clusters customer partition 1 Load to cache Validate against schema Transform Write back to disk with DMA Customer validated partition 1 Example repo: https://github.com/redpanda-data/redpanda-labs/tree/main/data-transforms/to_avro
  • 13. In broker validation & transformation • Firsthand processing, quick filtering • Simple rerouting determine on ingested data • Masking, schema validation • Stateless, functional processing When not to use in broker transformation? • When it requires external data dependencies • Windowing, complex processing, with multiple streams of input • When it requires to keep the state of the processes 13
  • 16. 16 The Broker Infrastructure Storage – XFS,NVMe Network bandwidth Memory CPU Location (Multi-AZ) OS Disk I/O read_iops/bandwidth write_iops/bandwidth Broker # Brokers # Replicas # Partitions Log segment size
  • 17. 17 Partitions Partitions Producer Consumer Consumer Consumer Group A • Round Robin • Hashing Key Partition • Custom Partitioner Overhead • File handler • Follower, heartbeat • Large Metadata quadratic (N2) Idempotency • Order guarantee in partition only Higher latency • Producer batch Consumer rebalance • RangeAssignor (SW) • RoundRobinAssignor(SW) • StickyAssignor(SW) • CooperativeStickyAssignor • Static (No Rebalance)
  • 18. 18 Producer Producer fsync Acknowledgment from the leader Ack=all Ack=1 Majority of replicas acknowledge write_caching_default=true flush_bytes, flush_ms Ack=0 Doesn’t wait for acknowledgments and doesn’t retry sending messages Producer batch.size linger.ms compression
  • 20. High Throughput • There is no on size fits all, there are many factor when it comes to performances. • More partition will allow more parallel processing, hence higher throughput, but it comes with cost.(Avoid over-partitioning or under-partitioning.) • Experiment with acks settings, Enable write caching, • Explore how the producer batches messages. Increasing the value of batch.size and linger.ms can increase throughput by making the producer add more messages into one batch • Explore consumer fetch frequency and message size. • Start with a baseline configuration and gradually make changes, measuring the impact of each change on performance. 20
  • 22. Beyond just streams of events 22 Databases / CDC Microservices Databases / CDC Databases / CDC Processor
  • 23. Beyond just streams of events 23 Databases / CDC Microservices Databases / CDC Databases / CDC Processor
  • 24. Limited disk space 24 Event Sourcing S3 Rehydrate
  • 25. State Snapshot 25 Microservices Databases / CDC Databases / CDC Databases / CDC Processor Snapshot
  • 26. Summary ■ Use schema to insure data shape for consumer ■ When designing, think about compatibility ■ Validation to ensures consumer always get the correct format. ■ In broker transform are great for simple, functions, stateless processes ■ Provision appropriate partition to your topics ■ Depends on your use case, for producer, always set the right Ack, and buffer ■ For stateful streams processing, use snapshot for fault tolerance 26
  • 27. On demand example 27 Batch Every 10 mins CSV CSV Batch pipeline Batch Processing Batch pipeline Right away! Stream CSV
  • 28. Keep in touch! Christina Lin Developer Advocate Redpanda Christina@redpanda.com