Backends of the Future
• Information systems for business (of any kind)
• Reflects entities of the real world
• Keep and update state
• Data aggregation and reporting
• Bookkeepers managed business data
• Bookkeepers used tables
• Tables are good for keeping state!
• Meat brains love tables
• State is stored in a database, in tables
• Read-change-write on each request (sketched below)
• Caches for performance
• Works OK for low loads or 99.9% reads
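A minimal sketch of that read-change-write cycle, assuming a hypothetical grocery_lists table and plain JDBC against Postgres (error handling and the cache layer omitted):

object ReadChangeWrite {
  import java.sql.DriverManager

  def renameList(listId: String, newName: String): Unit = {
    val conn = DriverManager.getConnection("jdbc:postgresql://localhost/app", "app", "secret")
    try {
      conn.setAutoCommit(false)
      // 1. read the current row; FOR UPDATE takes a row lock so concurrent writers queue up
      val select = conn.prepareStatement("SELECT name FROM grocery_lists WHERE id = ? FOR UPDATE")
      select.setString(1, listId)
      val rows = select.executeQuery()
      if (rows.next()) {
        // 2. change the state in application memory
        val updatedName = newName.trim
        // 3. write it back and bump updated_at
        val update = conn.prepareStatement("UPDATE grocery_lists SET name = ?, updated_at = now() WHERE id = ?")
        update.setString(1, updatedName)
        update.setString(2, listId)
        update.executeUpdate()
      }
      conn.commit()
    } finally conn.close()
  }
}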
• States get more complex
• States are updated more often
• ...concurrently!
• There are more and more states
• Stream changes of states
• DDD (since 2003)
• Bounded contexts
• Aggregate roots
• Ubiquitous language
{
  "id": "1234",
  "name": "Grocery List",
  "items": [
    {
      "id": "5678",
      "name": "Apples",
      "completed": false
    },
    {
      "id": "9012",
      "name": "Milk",
      "completed": true
    },
    {
      "id": "3456",
      "name": "Bread",
      "completed": false
    }
  ],
  "created_at": "2023-04-25T10:00:00Z",
  "updated_at": "2023-04-25T13:30:00Z"
}
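Modelled in code, the JSON above becomes a small aggregate root; a minimal Scala sketch (field names mirror the JSON, the types are assumptions):

import java.time.Instant

// Aggregate root: the grocery list owns its items; nothing outside
// the aggregate references an item directly.
final case class Item(id: String, name: String, completed: Boolean)

final case class GroceryList(
  id: String,
  name: String,
  items: Vector[Item],
  createdAt: Instant,
  updatedAt: Instant
)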
• Modelled with DDD
• But... how to store them?
• Normalise to many tables?
• Put all in single blob key-value?
• Split in chunks?
• Strong consistency - at storage level
• Single master is SPoF and contention pain
• ...master-master is complex and fragile
• Locks to linearise updates
• Unrelated entities still share (indices)
• Master single write + replicas for read
• ... poor man's CQRS
• Caches
• ... and how to invalidate them?
• Overhead of (de)serialisation
• For logging, audits, read-only views...
• CDC - on top of existing models
• How to guarantee reliable delivery?
• ...out of sequence?
• No locks - no integrity
• ... but locks kill performance
• (De)serialisation overhead
• ORM hell, toxic magic of lazy loading
• Cache consistency hell
• Downstream changes - ugly CDC
• DDD aggregate roots
• in-memory when active
• recovered on demand
• Guaranteed durability and integrity
• Immediate downstream propagation of updates
• 1973 - Actor model
• 2007 - Pat Helland (Amazon), “Life beyond Distributed Transactions: an Apostate’s Opinion”
• 2009 - Akka by Typesafe/Lightbend
• 2023+ used in gaming, fintech, logistics...
• Actor guards the aggregate root
• strongly consistent
• transactional boundary
• Mailbox
• updates one at a time
• implemented without locking
• Stateful actors reside on multiple cluster nodes
• Sharding/routing map (see the sketch below)
• Event-sourced persistence
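A sketch of the sharding/routing side, assuming Akka Cluster Sharding (Typed); GroceryListEntity.Command and the GroceryListEntity(entityId) behaviour factory refer to the hypothetical entity sketched a little further below:

object GroceryListSharding {
  import akka.actor.typed.ActorSystem
  import akka.cluster.sharding.typed.scaladsl.{ClusterSharding, Entity, EntityTypeKey}

  // One entity type key per aggregate root type
  val TypeKey: EntityTypeKey[GroceryListEntity.Command] =
    EntityTypeKey[GroceryListEntity.Command]("GroceryList")

  // Register the entity: the cluster keeps at most one live actor per list id
  def init(system: ActorSystem[_]): Unit =
    ClusterSharding(system).init(
      Entity(TypeKey)(ctx => GroceryListEntity(ctx.entityId))
    )

  // Routing: messages for list "1234" reach that single instance,
  // whichever node its shard currently lives on
  def listRef(system: ActorSystem[_], listId: String) =
    ClusterSharding(system).entityRefFor(TypeKey, listId)
}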
• 1. Actor receives one message (one at a time, from its mailbox)
• 2. Message is validated against the current state
• 3. A valid message produces an event
• 4. The event is written to durable storage
• 5. The event is applied to the state
• 6. Side effects are triggered (see the sketch below)
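Those six steps map almost one-to-one onto an event-sourced actor; a sketch using Akka Persistence Typed, with hypothetical commands, events and a deliberately simplified state for the grocery-list example:

object GroceryListEntity {
  import akka.actor.typed.Behavior
  import akka.persistence.typed.PersistenceId
  import akka.persistence.typed.scaladsl.{Effect, EventSourcedBehavior}

  sealed trait Command
  final case class AddItem(id: String, name: String) extends Command

  sealed trait Event
  final case class ItemAdded(id: String, name: String) extends Event

  // simplified versus the full GroceryList model above
  final case class State(items: Map[String, String])

  def apply(listId: String): Behavior[Command] =
    EventSourcedBehavior[Command, Event, State](
      persistenceId = PersistenceId("GroceryList", listId),
      emptyState = State(Map.empty),
      // Steps 1-4: take one command from the mailbox, validate it against
      // the current state, emit an event, persist it to the journal.
      commandHandler = (state, cmd) =>
        cmd match {
          case AddItem(id, _) if state.items.contains(id) =>
            Effect.none // invalid: duplicate item, no event
          case AddItem(id, name) =>
            Effect.persist(ItemAdded(id, name))
              // Step 6: side effects run only after the event is durable
              .thenRun(_ => println(s"item $name added to $listId"))
        },
      // Step 5: the persisted event is applied to the in-memory state
      eventHandler = (state, evt) =>
        evt match {
          case ItemAdded(id, name) => state.copy(items = state.items + (id -> name))
        }
    )
}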
• Sequence of immutable records
• Primary source of truth
• Append-only non-blocking writes
• SSTable-based storages: Cassandra/ScyllaDB, AWS DynamoDB, Google Bigtable, Azure Cosmos DB
• Actors get deactivated...
• ...and then need to resurrect
• State recovered from the event journal
• Optionally, snapshots every N events (see the snippet below)
• ... for high-traffic and/or large states
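With Akka Persistence Typed, periodic snapshotting is a retention setting on the behaviour; a small snippet assuming the entity sketched earlier:

object Snapshotting {
  import akka.persistence.typed.scaladsl.{EventSourcedBehavior, RetentionCriteria}
  import GroceryListEntity.{Command, Event, State}

  // Snapshot every 100 events, keep the last two snapshots; recovery then
  // replays at most ~100 events on top of the newest snapshot.
  def withSnapshots(b: EventSourcedBehavior[Command, Event, State]): EventSourcedBehavior[Command, Event, State] =
    b.withRetention(RetentionCriteria.snapshotEvery(numberOfEvents = 100, keepNSnapshots = 2))
}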
• Exact copy of the state - wherever you want it
• ...CQRS - as it is meant to be
• Full or partial
• Which state do you need?
• Latest - need it fast, past not relevant
• Exact - can wait, need all of the events
• Relational or quasi-relational database
• ...Postgres, MySQL, ...ClickHouse (see the projection sketch below)
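A sketch of such a read-side projection, assuming the entity types from earlier; the grocery_items table and the delivery mechanism for events are assumptions, and the upsert keeps only the latest state:

import java.sql.Connection
import GroceryListEntity.{Event, ItemAdded}

// Read-side projection: each event updates a query-friendly table.
// How events arrive (Akka Projection, a Kafka consumer, ...) is out of scope here.
final class GroceryListProjection(conn: Connection) {
  def handle(listId: String, event: Event): Unit = event match {
    case ItemAdded(itemId, name) =>
      val stmt = conn.prepareStatement(
        """INSERT INTO grocery_items (list_id, item_id, name, completed)
          |VALUES (?, ?, ?, false)
          |ON CONFLICT (list_id, item_id) DO UPDATE SET name = EXCLUDED.name""".stripMargin)
      stmt.setString(1, listId)
      stmt.setString(2, itemId)
      stmt.setString(3, name)
      stmt.executeUpdate()
  }
}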
• States are vastly different
• Journal performance read vs. write
• Risk of split-brain
• Complexity of custom implementation
• Data lifecycle
• Lack of experienced engineers
• No single platform covering everything
• State size - hundreds of bytes to several MB
• State event flows - 1/hour to 1,000/sec
• Downstream latency SLA - minutes to milliseconds
• Can happen in any multi-node cluster
• Two instances of a single stateful actor - two versions of reality
• Mitigation by automatic detection/recovery
• Downstream latency is critical for CQRS
• ... p99 <100ms (better <10ms)
• Append-only storages do have their limits
• ...excellent at writes, poor at reads
• CDC with guarantees, single source of truth
• Solution:
• stream to an indexed storage
• hybrid recovery with a marker (sketched below)
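A conceptual sketch of that hybrid recovery (not the actual Kafka Journal API): events go to a fast append-only log, a background replicator copies them into an indexed store and records how far it got (the marker); recovery reads the bulk of history from the indexed store and only the short tail past the marker from the log.

object HybridRecovery {
  final case class Marker(replicatedUpToSeqNr: Long)

  trait IndexedStore[E] {
    def marker(id: String): Marker
    def readUpTo(id: String, seqNr: Long): Vector[E] // cheap, indexed by entity id
  }

  trait AppendLog[E] {
    def readFrom(id: String, seqNr: Long): Vector[E] // only the unreplicated tail
  }

  def recover[E](id: String, indexed: IndexedStore[E], log: AppendLog[E]): Vector[E] = {
    val m = indexed.marker(id)
    indexed.readUpTo(id, m.replicatedUpToSeqNr) ++ log.readFrom(id, m.replicatedUpToSeqNr + 1)
  }
}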
• Look ma, no ORM!
• Good fit for functional programming (see the sketch below)
• (State, Message) => Seq[Event]
• (State, Event) => State
• (StateOld, StateNew, Event) => Seq[SideEffect]
• Projections/lenses
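Spelled out for the grocery-list types from the earlier sketch; SideEffect and the function names are illustrative:

object GroceryListLogic {
  import GroceryListEntity.{AddItem, Command, Event, ItemAdded, State}

  sealed trait SideEffect
  final case class NotifyDownstream(event: Event) extends SideEffect

  // (State, Message) => Seq[Event]: validate the command, decide which events it yields
  def decide(state: State, cmd: Command): Seq[Event] = cmd match {
    case AddItem(id, _) if state.items.contains(id) => Seq.empty // duplicate -> rejected
    case AddItem(id, name)                          => Seq(ItemAdded(id, name))
  }

  // (State, Event) => State: pure transition, reused verbatim during recovery
  def evolve(state: State, event: Event): State = event match {
    case ItemAdded(id, name) => state.copy(items = state.items + (id -> name))
  }

  // (StateOld, StateNew, Event) => Seq[SideEffect]: effects derived from the change
  def react(oldState: State, newState: State, event: Event): Seq[SideEffect] =
    if (newState != oldState) Seq(NotifyDownstream(event)) else Seq.empty
}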
• Already optimised
• Processing in memory
• Read side separated from primary states
• Scale by adding nodes to running cluster
• the size of one state must fit in a single process...
• ...or proceed to splitting the aggregate roots
• Reducing codec overhead (JSON -> binary)
• Scalable stateful high-load with guarantees
• ...fintech, logistics, gaming
• multi-billion cap companies
• Stack: Akka/Cassandra/Kafka (Java or Scala)
• Open-source Kafka Journal (Evolution)