7. • Modelled with DDD
• But... how to store them?
• Normalise into many tables?
• Put everything into a single key-value blob?
• Split into chunks?
8. • Strong consistency - at storage level
• A single master is a SPoF and a contention point
• ... master-master is complex and fragile
• Locks to linearise updates
9. • Unrelated entities still share resources (indices)
• Single master for writes + replicas for reads
• ... poor man's CQRS
• Caches
• ... and how to invalidate them?
• Overhead of (de)serialisation
10. • For logging, audits, read-only views...
• CDC (change data capture) - on top of existing models
• How to guarantee reliable delivery?
• ... and what about out-of-sequence delivery?
11. • No locks - no integrity
• ... but locks kill performance
• (De)serialisation overhead
• ORM hell, toxic magic of lazy loading
• Cache consistency hell
• Downstream changes - ugly CDC
12. • DDD aggregate roots
• in-memory when active
• recovered on demand
• Guaranteed durability and integrity
• Immediate downstream propagation of updates
13. • 1973 - Actor model
• 2007 - Pat Helland (Amazon), "Life beyond Distributed Transactions: an Apostate's Opinion"
• 2009 - Akka by Typesafe/Lightbend
• 2023+ used in gaming, fintech, logistics...
14. • Actor guards the aggregate root
• strongly consistent
• transactional boundary
• Mailbox
• updates applied one at a time
• implemented without locking
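One way to picture "one at a time, without locking" is a lock-free queue plus an atomic "scheduled" flag, so that at most one drain task per actor runs at any moment. The sketch below is a minimal illustration under that assumption, not how Akka or any other toolkit implements its mailbox; `MiniActor`, `tell` and `currentState` are hypothetical names.

```scala
import java.util.concurrent.{ConcurrentLinkedQueue, ExecutorService, Executors}
import java.util.concurrent.atomic.AtomicBoolean

// Hypothetical minimal actor: a lock-free mailbox plus a "scheduled" flag.
// The flag guarantees that at most one drain task is pending or running.
final class MiniActor[S, M](initial: S, handle: (S, M) => S, pool: ExecutorService) {
  private val mailbox   = new ConcurrentLinkedQueue[M]()
  private val scheduled = new AtomicBoolean(false)
  @volatile private var state: S = initial // written only by the active drain task

  // Any thread may send; the message is queued, never processed by the sender.
  def tell(msg: M): Unit = {
    mailbox.offer(msg)
    trySchedule()
  }

  def currentState: S = state

  // Submit a drain task only if there is work and none is already scheduled.
  private def trySchedule(): Unit =
    if (!mailbox.isEmpty && scheduled.compareAndSet(false, true))
      pool.execute(() => drain())

  private def drain(): Unit =
    try processAll()
    finally {
      scheduled.set(false)
      trySchedule() // pick up messages that arrived while the flag was still set
    }

  // Messages are applied strictly one after another.
  @annotation.tailrec
  private def processAll(): Unit = Option(mailbox.poll()) match {
    case Some(msg) =>
      state = handle(state, msg)
      processAll()
    case None => ()
  }
}

// Usage: val pool    = Executors.newFixedThreadPool(4)
//        val counter = new MiniActor[Int, Int](0, _ + _, pool)
//        counter.tell(1)
```

Senders never touch the state directly, so the handler needs no synchronisation of its own even when many threads call `tell` concurrently.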
16. • 1 Actor receives messages from its mailbox, one at a time
• 2 Message is validated against state
• 3 Valid message produces event
• 4 Event is written to durable storage
• 5 Event is applied to state
• 6 Side effects are triggered
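The six steps can be sketched for a single actor as follows. This is an illustration only: the bank-account domain, the names `decide`, `evolve` and `AccountActor`, and the in-memory buffers standing in for durable storage and for downstream consumers are all assumptions made for the example.

```scala
import scala.collection.mutable.ArrayBuffer

object PipelineSketch {
  // Hypothetical domain: a bank account.
  final case class Account(balance: Long)

  sealed trait Command
  final case class Deposit(amount: Long)  extends Command
  final case class Withdraw(amount: Long) extends Command

  sealed trait Event
  final case class Deposited(amount: Long) extends Event
  final case class Withdrawn(amount: Long) extends Event

  // Steps 2-3: validate the message against the state; a valid one yields events.
  def decide(state: Account, cmd: Command): Either[String, Seq[Event]] = cmd match {
    case Deposit(a) if a > 0                        => Right(Seq(Deposited(a)))
    case Withdraw(a) if a > 0 && a <= state.balance => Right(Seq(Withdrawn(a)))
    case other                                      => Left(s"rejected: $other")
  }

  // Step 5: applying an event to the state is a pure function.
  def evolve(state: Account, event: Event): Account = event match {
    case Deposited(a) => state.copy(balance = state.balance + a)
    case Withdrawn(a) => state.copy(balance = state.balance - a)
  }

  final class AccountActor {
    val journal       = ArrayBuffer.empty[Event]  // stand-in for durable storage
    val notifications = ArrayBuffer.empty[String] // stand-in for side effects
    private var state = Account(0)

    // Step 1: called once per message, never concurrently.
    def receive(cmd: Command): Either[String, Account] =
      decide(state, cmd).map { events =>
        journal ++= events                                    // step 4: persist first
        state = events.foldLeft(state)(evolve)                // step 5: then apply
        events.foreach(e => notifications += s"published $e") // step 6: side effects
        state
      }
  }
}
```

A rejected message leaves both the journal and the state untouched, because nothing is written before validation succeeds.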
17. • Sequence of immutable records
• Primary source of truth
• Append-only non-blocking writes
• SSTable-based storages: Cassandra/ScyllaDB, AWS DynamoDB, Google Bigtable, Azure Cosmos DB
18. • Actors get deactivated...
• ...and later need to be resurrected
• State recovered from the event journal
• Optionally, snapshots every N events
• ... for high-traffic and/or large states
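Recovery is essentially a fold over the journal, optionally starting from the most recent snapshot instead of from the empty state. The sketch below assumes a hypothetical counter aggregate, an in-memory `Vector` as the journal, and a snapshot that records how many events it already contains.

```scala
object RecoverySketch {
  final case class Counter(value: Long)
  final case class Added(n: Long)

  // seqNr = number of journal events already folded into `state`.
  final case class Snapshot(seqNr: Int, state: Counter)

  def evolve(state: Counter, event: Added): Counter = Counter(state.value + event.n)

  // Replay only the events written after the snapshot (or all of them if none exists).
  def recover(journal: Vector[Added], snapshot: Option[Snapshot]): Counter = {
    val start = snapshot.getOrElse(Snapshot(0, Counter(0)))
    journal.drop(start.seqNr).foldLeft(start.state)(evolve)
  }

  // Take a snapshot every `every` events (the "N" from the slide).
  def maybeSnapshot(seqNr: Int, state: Counter, every: Int): Option[Snapshot] =
    if (seqNr > 0 && seqNr % every == 0) Some(Snapshot(seqNr, state)) else None
}
```

With a snapshot in place, the replay cost depends on the number of events since the snapshot rather than on the full history, which is why it matters for high-traffic or large states.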
19. • Exact copy of the state - wherever you want it
• ...CQRS - as it is meant to be
• Full or partial
• Which state do you need?
• Latest - need it fast, past not relevant
• Exact - can wait, need all of the events
• Relational or quasi-relational database
• ...Postgres, MySQL, ... ClickHouse
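The "latest" versus "exact" distinction can be illustrated with two read-side updaters. This is one possible reading of the slide, not the author's implementation: the `BalanceChanged` event and both view types are hypothetical, and immutable maps stand in for tables in a relational database.

```scala
object ReadModelSketch {
  final case class BalanceChanged(accountId: String, seqNr: Long, balance: Long)

  type LatestView = Map[String, BalanceChanged]         // one row per account
  type ExactView  = Map[String, Vector[BalanceChanged]] // full history per account

  // "Latest": upsert the newest event, silently drop stale or duplicate ones.
  // Gaps are acceptable because only the most recent value matters.
  def applyLatest(view: LatestView, e: BalanceChanged): LatestView =
    view.get(e.accountId) match {
      case Some(row) if row.seqNr >= e.seqNr => view
      case _                                 => view.updated(e.accountId, e)
    }

  // "Exact": accept only the next expected event; otherwise report that the
  // view has to wait until the gap is filled.
  def applyExact(view: ExactView, e: BalanceChanged): Either[String, ExactView] = {
    val history = view.getOrElse(e.accountId, Vector.empty[BalanceChanged])
    if (e.seqNr == history.size + 1L) Right(view.updated(e.accountId, history :+ e))
    else Left(s"waiting for seqNr ${history.size + 1}, got ${e.seqNr}")
  }
}
```

The latest view converges even when events arrive out of order, while the exact view trades latency for completeness.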
20. • States are vastly different
• Journal performance: reads vs. writes
• Risk of split-brain
• Complexity of custom implementation
• Data lifecycle
• Lack of experienced engineers
• No single platform covering everything
21. • State size - hundreds of bytes to several MB
• Event rate per state - 1/hour to 1000/sec
• Downstream latency SLA - minutes to milliseconds
22. • Can happen in any multi-node cluster
• Two instances of a single stateful actor - two versions of reality
• Mitigation by automatic detection/recovery
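The slide does not say how detection works. One common technique, shown here purely as an assumption, is optimistic concurrency on the journal: every append carries the sequence number the writer believes to be current, so a second, stale instance of the same actor is rejected on its first write. `Journal`, `Record` and `append` are hypothetical names, and an in-memory `AtomicReference` stands in for the storage's conditional write.

```scala
import java.util.concurrent.atomic.AtomicReference

object SplitBrainSketch {
  final case class Record(seqNr: Long, writer: String, payload: String)

  final class Journal {
    private val log = new AtomicReference(Vector.empty[Record])

    // The append succeeds only if the caller's view of the journal is current.
    def append(expectedSeqNr: Long, writer: String, payload: String): Either[String, Long] = {
      val current = log.get()
      if (current.size.toLong != expectedSeqNr)
        Left(s"$writer is stale: expected $expectedSeqNr, journal is at ${current.size}")
      else if (log.compareAndSet(current, current :+ Record(expectedSeqNr + 1, writer, payload)))
        Right(expectedSeqNr + 1)
      else
        Left(s"$writer lost a concurrent append")
    }

    def records: Vector[Record] = log.get()
  }
}
```

An instance that receives a `Left` knows it no longer owns the aggregate and can stop itself, then recover from the journal if it is activated again.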
23. • Downstream latency is critical for CQRS
• ... p99 < 100 ms (better < 10 ms)
• Append-only storages do have their limits
• ...excellent at writes, poor at reads
• CDC with guarantees, single source of truth
• Solution:
• stream to indexed storage
• hybrid recovery with marker
24. • Look ma, no ORM!
• Good fit for functional programming
• (State, Message) => Seq[Event]
• (State, Event) => State
• (StateOld, StateNew, Event) => Seq[SideEffect]
• Projections/lenses
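The three signatures translate almost directly into Scala type aliases. The sketch below instantiates them for a hypothetical stock-keeping aggregate; the names `Decide`, `Evolve`, `React`, `step` and the low-stock threshold of 5 are assumptions made for the example.

```scala
object FunctionalSketch {
  // The three signatures from the slide, as type aliases.
  type Decide[S, M, E] = (S, M) => Seq[E]
  type Evolve[S, E]    = (S, E) => S
  type React[S, E, F]  = (S, S, E) => Seq[F]

  final case class Stock(onHand: Int)
  final case class Ship(qty: Int)       // message
  final case class Shipped(qty: Int)    // event
  final case class Reorder(why: String) // side effect, described as data

  val decide: Decide[Stock, Ship, Shipped] =
    (s, m) => if (m.qty > 0 && m.qty <= s.onHand) Seq(Shipped(m.qty)) else Seq.empty

  val evolve: Evolve[Stock, Shipped] =
    (s, e) => Stock(s.onHand - e.qty)

  // Needs both the old and the new state: fire only when the threshold is crossed.
  val react: React[Stock, Shipped, Reorder] =
    (before, after, _) =>
      if (before.onHand >= 5 && after.onHand < 5) Seq(Reorder("low stock")) else Seq.empty

  // Run one message through all three functions.
  def step(s: Stock, m: Ship): (Stock, Seq[Reorder]) =
    decide(s, m).foldLeft((s, Seq.empty[Reorder])) { case ((cur, effects), e) =>
      val next = evolve(cur, e)
      (next, effects ++ react(cur, next, e))
    }

  // A projection is just another pure function over the state.
  val inStock: Stock => Boolean = _.onHand > 0
}
```

Because all three functions are pure, they can be unit-tested without an actor system, a database or a mocking framework.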
25. • Already optimised
• Processing in memory
• Read side separated from primary states
• Scale by adding nodes to running cluster
• size of one state must fit one process...
• ...or proceed to splitting of aggregate roots
• Reducing codec overhead (JSON -> binary)
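To give a feel for the codec point, the sketch below encodes the same hypothetical event as JSON text and as three fixed-width longs. The hand-rolled JSON and the `DataOutputStream` layout are stand-ins chosen to keep the example dependency-free; a production system would more likely use a JSON library and a schema-based binary format.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, DataInputStream, DataOutputStream}
import java.nio.charset.StandardCharsets

object CodecSketch {
  final case class Deposited(accountId: Long, amount: Long, timestamp: Long)

  // Hand-rolled JSON, used only to compare payload sizes.
  def toJson(e: Deposited): Array[Byte] =
    s"""{"accountId":${e.accountId},"amount":${e.amount},"timestamp":${e.timestamp}}"""
      .getBytes(StandardCharsets.UTF_8)

  // Fixed layout: three big-endian 64-bit integers, 24 bytes in total.
  def toBinary(e: Deposited): Array[Byte] = {
    val bytes = new ByteArrayOutputStream(24)
    val out   = new DataOutputStream(bytes)
    out.writeLong(e.accountId)
    out.writeLong(e.amount)
    out.writeLong(e.timestamp)
    out.flush()
    bytes.toByteArray
  }

  // Reads the fields back in the order they were written.
  def fromBinary(data: Array[Byte]): Deposited = {
    val in        = new DataInputStream(new ByteArrayInputStream(data))
    val accountId = in.readLong()
    val amount    = in.readLong()
    val timestamp = in.readLong()
    Deposited(accountId, amount, timestamp)
  }
}
```

The binary form carries no field names and needs no text parsing, which is where the savings on every journal write and every recovery come from.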
26. • Scalable stateful high-load with guarantees
• ...fintech, logistics, gaming
• companies with multi-billion market caps
• Stack: Akka/Cassandra/Kafka (Java or Scala)
• Open-source Kafka Journal (Evolution)