The saying goes that there are only two hard things in Computer Science: cache invalidation, and naming things. Well, turns out the first one is solved actually ;)
Join us for this session to learn how to keep read views of your data in distributed caches close to your users, always kept in sync with your primary data stores via change data capture. You will learn how to
- Implement a low-latency data pipeline for cache updates based on Debezium, Apache Kafka, and Infinispan
- Create denormalized views of your data using Kafka Streams and make them accessible via plain key look-ups from a cache cluster close by
- Propagate updates between cache clusters using cross-site replication
We'll also touch on some advanced concepts, such as detecting and rejecting writes to the system of record which are derived from outdated cached state, and show in a demo how all the pieces come together, of course connected via Apache Kafka.
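The abstract does not say how writes derived from outdated cached state are detected. One common approach is an optimistic version check, sketched below in plain Java under that assumption. The class `VersionedStore` and its methods are hypothetical names for illustration, not part of Debezium, Kafka, or Infinispan.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical system of record that keeps a version number per row.
class VersionedStore {
    static final class Row {
        final String value;
        final long version;

        Row(String value, long version) {
            this.value = value;
            this.version = version;
        }
    }

    private final Map<Integer, Row> rows = new HashMap<>();

    // Unconditional write; increments the row version on every call.
    synchronized Row put(int key, String value) {
        Row current = rows.get(key);
        Row next = new Row(value, current == null ? 1 : current.version + 1);
        rows.put(key, next);
        return next;
    }

    synchronized Row get(int key) {
        return rows.get(key);
    }

    // Accepts the write only if the caller based it on the latest version.
    synchronized boolean updateIfCurrent(int key, long expectedVersion, String newValue) {
        Row current = rows.get(key);
        if (current == null || current.version != expectedVersion) {
            return false; // the caller's cached state is outdated
        }
        rows.put(key, new Row(newValue, current.version + 1));
        return true;
    }
}
```

A client would read the value and version from the cache, then pass that version along with its write; a `false` result tells it to refresh and retry.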
4. #Debezium @gunnarmorling
● Open source software engineer at Red Hat
○ Debezium
○ Quarkus
● Spec Lead for Bean Validation 2.0
● kcctl, ModiTect, MapStruct
● Java Champion
● @gunnarmorling
Gunnar Morling
11. #Debezium @gunnarmorling
App 1
Data
App 2
Data
1, Maria
2, Jenny
5, Peter
5, Peter
App 3
Data
Put
4, Mike
Remove
5, Peter
Get
2, Null
Infinispan Deployment
Local Cache
13. #Debezium @gunnarmorling
App 1
Data
App 2
Data
1, Maria
2, Jenny
1, Maria
2, Jenny
App 3
Data
1, Maria
2, Jenny
Put
3, Juan
3, Juan
Infinispan Deployment
Replicated Cache
14. #Debezium @gunnarmorling
App 1
Data
App 2
Data
1, Maria
2, Jenny
3, Juan
1, Maria
2, Jenny
3, Juan
App 3
Data
1, Maria
2, Jenny
3, Juan
Infinispan Deployment
Replicated Cache
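The replicated mode shown on these slides can be modelled in a few lines of plain Java: a put on any node ends up on every node, and reads are served locally. This is a toy model with hypothetical names (`ReplicatedCacheSketch`), not the Infinispan API, and it ignores networking, locking, and failure handling.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of replicated mode: every node keeps a full copy of the data.
class ReplicatedCacheSketch {
    private final List<Map<Integer, String>> nodes = new ArrayList<>();

    ReplicatedCacheSketch(int nodeCount) {
        for (int i = 0; i < nodeCount; i++) {
            nodes.add(new HashMap<>());
        }
    }

    // A put issued on any node is applied on all nodes.
    void put(int key, String value) {
        for (Map<Integer, String> node : nodes) {
            node.put(key, value);
        }
    }

    // Reads are answered from the local copy of the given node.
    String get(int nodeIndex, int key) {
        return nodes.get(nodeIndex).get(key);
    }
}
```

In a local cache, by contrast, the loop in `put` would touch only the node that received the call, so the other nodes would return null for the same key.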
16. #Debezium @gunnarmorling
App 1
Data
App 2
Data
1, Maria
2, Jenny
App 3
Data
3, Juan
Get 2
Jenny
Get 2
Jenny
Infinispan Deployment
Distributed Cache (One Owner)
18. #Debezium @gunnarmorling
App 1
Data
App 2
Data
1, Maria
2, Jenny
2, Jenny
3, Juan
App 3
Data
1, Maria
3, Juan
Infinispan Deployment
Distributed Cache (Two Owners)
19. #Debezium @gunnarmorling
App 1
Data
App 2
Data
1, Maria
2, Jenny
2, Jenny
3, Juan
App 3
Data
1, Maria
3, Juan
Put 4, Will
Put 4, Will
Infinispan Deployment
Distributed Cache (Two Owners)
20. #Debezium @gunnarmorling
App 1
Data
App 2
Data
1, Maria
2, Jenny
4, Will
2, Jenny
3, Juan
4, Will
App 3
Data
1, Maria
3, Juan
Infinispan Deployment
Distributed Cache (Two Owners)
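The distributed mode with a fixed number of owners can be sketched as follows. Owner selection here is a deliberately simplified assumption (key modulo node count, then the next nodes in order); Infinispan itself uses consistent hashing over segments. `DistributedCacheSketch` and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of distributed mode: each entry lives on numOwners nodes only.
class DistributedCacheSketch {
    private final List<Map<Integer, String>> nodes = new ArrayList<>();
    private final int numOwners;

    DistributedCacheSketch(int nodeCount, int numOwners) {
        if (numOwners < 1 || numOwners > nodeCount) {
            throw new IllegalArgumentException("numOwners must be between 1 and nodeCount");
        }
        this.numOwners = numOwners;
        for (int i = 0; i < nodeCount; i++) {
            nodes.add(new HashMap<>());
        }
    }

    // Simplified owner selection: a primary owner plus the following nodes.
    List<Integer> ownersOf(int key) {
        List<Integer> owners = new ArrayList<>();
        int primary = Math.floorMod(key, nodes.size());
        for (int i = 0; i < numOwners; i++) {
            owners.add((primary + i) % nodes.size());
        }
        return owners;
    }

    // A put is stored on the owners of the key, not on every node.
    void put(int key, String value) {
        for (int owner : ownersOf(key)) {
            nodes.get(owner).put(key, value);
        }
    }

    // Any node can serve a get by asking the primary owner
    // (a remote call in a real cluster).
    String get(int key) {
        return nodes.get(ownersOf(key).get(0)).get(key);
    }

    // Number of nodes that currently hold a copy of the key.
    int copiesOf(int key) {
        int copies = 0;
        for (Map<Integer, String> node : nodes) {
            if (node.containsKey(key)) {
                copies++;
            }
        }
        return copies;
    }
}
```

With three nodes and two owners, each entry is stored twice, so the cluster tolerates the loss of one node while holding less data per node than replicated mode.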
24. #Debezium @gunnarmorling
Infinispan Cross-Site Replication
AWS (LON)
GCP (NYC)
Load Balancer
APP
APP
Service
APP
APP
Service
Shared
State
Shared
State
Shared
State
Shared
State
Data
Data
Data NYC
Data LON
RELAY2
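As a rough illustration of the cross-site idea, the sketch below models two sites where local writes are queued and later shipped to a backup site. It covers asynchronous backup only and leaves out conflict resolution and the RELAY2-based transport entirely. `CrossSiteSketch`, `Site`, `backupTo`, and `flush` are hypothetical names, not Infinispan API.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

class CrossSiteSketch {
    // One site (e.g. LON or NYC) with its own cache and an outbound queue.
    static final class Site {
        final String name;
        final Map<Integer, String> cache = new HashMap<>();
        private final Queue<Map.Entry<Integer, String>> pending = new ArrayDeque<>();
        private Site backup;

        Site(String name) {
            this.name = name;
        }

        void backupTo(Site other) {
            this.backup = other;
        }

        // Local write; the update is queued for the backup site.
        void put(int key, String value) {
            cache.put(key, value);
            if (backup != null) {
                pending.add(Map.entry(key, value));
            }
        }

        // Ships queued updates to the backup site and returns how many were sent.
        int flush() {
            int shipped = 0;
            Map.Entry<Integer, String> update;
            while ((update = pending.poll()) != null) {
                backup.cache.put(update.getKey(), update.getValue());
                shipped++;
            }
            return shipped;
        }
    }
}
```

Until `flush` runs, the backup site serves older data, which is the trade-off of asynchronous replication between distant data centers.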
27. #Debezium @gunnarmorling
Debezium in a Nutshell
Open-Source Change Data Capture
● A CDC Platform
○ Based on transaction logs
○ Snapshotting, filtering, etc.
○ Outbox support
○ Web-based UI
● Fully open-source, very active community
● Large production deployments
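To connect this to cache updates: Debezium change events carry an `op` field (`c` for create, `u` for update, `d` for delete, `r` for a snapshot read) together with the `before` and `after` row state. The sketch below applies such events to a map standing in for the cache. The `ChangeEvent` class is a heavily flattened, hypothetical stand-in for the real event envelope, and Kafka is left out.

```java
import java.util.Map;

class ChangeEventApplier {
    // Hypothetical, flattened view of a change event: operation, key, new state.
    static final class ChangeEvent {
        final String op;
        final int key;
        final String after; // null for deletes

        ChangeEvent(String op, int key, String after) {
            this.op = op;
            this.key = key;
            this.after = after;
        }
    }

    // Applies one change event to the cache.
    static void apply(Map<Integer, String> cache, ChangeEvent event) {
        switch (event.op) {
            case "c": // create
            case "r": // snapshot read
            case "u": // update
                cache.put(event.key, event.after);
                break;
            case "d": // delete
                cache.remove(event.key);
                break;
            default:
                throw new IllegalArgumentException("Unknown operation: " + event.op);
        }
    }
}
```

Because every event fully describes the new state of a row, the cache never needs to be invalidated and reloaded; it is updated in place.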
28. #Debezium @gunnarmorling
Query-based Log-based
All data changes are captured -
No polling delay or overhead -
Transparent to writing applications and models -
Can capture deletes and old record state -
Simple Installation/Configuration -
Debezium
Log-based vs. Query-based CDC
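The difference between the two approaches can be shown with a toy table that also keeps an append-only log of every change. Polling sees only the current rows, so intermediate updates and deletes that happen between two polls are lost, while the log retains all of them. `CdcComparisonSketch` is a hypothetical model, not how a database or Debezium is implemented.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class CdcComparisonSketch {
    // Current table state, as a polling query would see it.
    final Map<Integer, String> table = new HashMap<>();
    // Append-only log of every change, as a log-based reader would see it.
    final List<String> log = new ArrayList<>();

    void upsert(int id, String name) {
        table.put(id, name);
        log.add("upsert " + id + " " + name);
    }

    void delete(int id) {
        table.remove(id);
        log.add("delete " + id);
    }

    // Query-based CDC: returns a copy of whatever rows exist right now.
    Map<Integer, String> poll() {
        return new HashMap<>(table);
    }
}
```

If a row is inserted, updated, and deleted between two polls, `poll()` returns nothing for it, whereas the log contains all three changes in order.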
33. #Debezium @gunnarmorling
● Can’t update filter list
● Long-running snapshots can’t be paused/resumed
● Can’t stream changes until snapshot completed
● Can’t re-snapshot selected tables
Detour: What’s New in Debezium?
Incremental Snapshotting
34. #Debezium @gunnarmorling
Incremental Snapshotting
● “DBLog: A Watermark Based Change-Data-Capture Framework”, by Andreas Andreakis and Ioannis Papapanagiotou
● Key idea: interleave snapshot events and events from TX log
https://arxiv.org/pdf/2010.12597v1.pdf
Detour: What’s New in Debezium?
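The interleaving idea can be sketched as a merge step over one snapshot chunk. The assumption, following the DBLog paper, is that a chunk is read between a low and a high watermark written to the log, and that any log event seen in that window for a key in the chunk supersedes the snapshot row. `WatermarkWindowSketch` and its types are hypothetical and omit the watermarks themselves, chunk selection, and signalling.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class WatermarkWindowSketch {
    // A change read from the transaction log between the two watermarks.
    static final class LogEvent {
        final int key;
        final String value;

        LogEvent(int key, String value) {
            this.key = key;
            this.value = value;
        }
    }

    // Merges one snapshot chunk with the log events seen while it was read.
    static List<String> merge(Map<Integer, String> chunk, List<LogEvent> windowEvents) {
        Map<Integer, String> remaining = new LinkedHashMap<>(chunk);
        List<String> output = new ArrayList<>();
        for (LogEvent event : windowEvents) {
            // The log event is newer than the snapshot row, so drop that row.
            remaining.remove(event.key);
            output.add("log:" + event.key + "=" + event.value);
        }
        // Snapshot rows not touched during the window are emitted afterwards.
        for (Map.Entry<Integer, String> row : remaining.entrySet()) {
            output.add("snapshot:" + row.getKey() + "=" + row.getValue());
        }
        return output;
    }
}
```

Since stale snapshot rows are dropped rather than emitted, streaming can continue while the snapshot runs, and a consumer never sees an old snapshot value overwrite a newer change.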
36. #Debezium @gunnarmorling
Support for pg_logical_emit_message()
Detour: What’s New in Debezium?
● Directly writing arbitrary messages to the WAL
● No need for an outbox table
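On the application side, writing such a message could look roughly like the JDBC sketch below. `pg_logical_emit_message(transactional, prefix, content)` is a PostgreSQL function; the class name, the prefix, and the payload format are assumptions for illustration, and running it requires a PostgreSQL connection and JDBC driver.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

class LogicalMessageEmitter {
    static final String SQL = "SELECT pg_logical_emit_message(?, ?, ?)";

    // Writes a message straight to the WAL as part of the current transaction.
    static void emit(Connection connection, String prefix, String content) throws SQLException {
        try (PreparedStatement statement = connection.prepareStatement(SQL)) {
            statement.setBoolean(1, true);   // transactional: tied to the surrounding transaction
            statement.setString(2, prefix);  // e.g. "outbox" (hypothetical prefix)
            statement.setString(3, content); // e.g. a JSON payload (hypothetical format)
            statement.execute();
        }
    }
}
```

With the first argument set to true, the message becomes visible to log readers only if the surrounding transaction commits, which matches the guarantee an outbox table would give.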
40. #Debezium @gunnarmorling
● Fast start-up, low memory consumption
● Developer joy
● Imperative and Reactive
● Best-of-breed libraries
● Run via HotSpot and GraalVM native binaries
Quarkus - Supersonic Subatomic Java
A Stack for Building Cloud-native Apps