Backends of the Future
• Information systems for business (of any kind)
• Reflects entities of the real world
• Keep and update state
• Data aggregation and reporting
• Bookkeepers managed business data
• Bookkeepers used tables
• Tables are good for keeping state!
• Meat brains love tables
• State is stored in a database, in tables
• Read-change-write on each request (sketched below)
• Caches for performance
• Works OK for low loads or 99.9% reads
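A minimal sketch of that read-change-write cycle, assuming a hypothetical grocery_lists table and plain JDBC against Postgres (error handling and the cache layer omitted):

object ReadChangeWrite {
  import java.sql.DriverManager

  def renameList(listId: String, newName: String): Unit = {
    val conn = DriverManager.getConnection("jdbc:postgresql://localhost/app", "app", "secret")
    try {
      conn.setAutoCommit(false)
      // 1. read the current row; FOR UPDATE takes a row lock so concurrent writers queue up
      val select = conn.prepareStatement("SELECT name FROM grocery_lists WHERE id = ? FOR UPDATE")
      select.setString(1, listId)
      val rows = select.executeQuery()
      if (rows.next()) {
        // 2. change the state in application memory
        val updatedName = newName.trim
        // 3. write it back and bump updated_at
        val update = conn.prepareStatement("UPDATE grocery_lists SET name = ?, updated_at = now() WHERE id = ?")
        update.setString(1, updatedName)
        update.setString(2, listId)
        update.executeUpdate()
      }
      conn.commit()
    } finally conn.close()
  }
}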
• States get more complex
• States are updated more often
• ...concurrently!
• There are more and more states
• Stream changes of states
• DDD (since 2003)
• Bounded contexts
• Aggregate roots
• Ubiquitous language
{
  "id": "1234",
  "name": "Grocery List",
  "items": [
    {
      "id": "5678",
      "name": "Apples",
      "completed": false
    },
    {
      "id": "9012",
      "name": "Milk",
      "completed": true
    },
    {
      "id": "3456",
      "name": "Bread",
      "completed": false
    }
  ],
  "created_at": "2023-04-25T10:00:00Z",
  "updated_at": "2023-04-25T13:30:00Z"
}
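Modelled in code, the JSON above becomes a small aggregate root; a minimal Scala sketch (field names mirror the JSON, the types are assumptions):

import java.time.Instant

// Aggregate root: the grocery list owns its items; nothing outside
// the aggregate references an item directly.
final case class Item(id: String, name: String, completed: Boolean)

final case class GroceryList(
  id: String,
  name: String,
  items: Vector[Item],
  createdAt: Instant,
  updatedAt: Instant
)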
• Modelled with DDD
• But... how to store them?
• Normalise to many tables?
• Put all in single blob key-value?
• Split in chunks?
• Strong consistency - at storage level
• Single master is SPoF and contention pain
• ...master-master is complex and fragile
• Locks to linearise updates
• Unrelated entities still share (indices)
• Master single write + replicas for read
• ... poor man's CQRS
• Caches
• ... and how to invalidate them?
• Overhead of (de)serialisation
• For logging, audits, read-only views...
• CDC - on top of existing models
• How to guarantee reliable delivery?
• ...out of sequence?
• No locks - no integrity
• ... but locks kill performance
• (De)serialisation overhead
• ORM hell, toxic magic of lazy loading
• Cache consistency hell
• Downstream changes - ugly CDC
• DDD aggregate roots
• in-memory when active
• recovered on demand
• Guaranteed durability and integrity
• Immediate downstream propagation of updates
• 1973 - Actor model
• 2007 - Pat Helland (Amazon), “Life beyond Distributed Transactions: an Apostate’s Opinion”
• 2009 - Akka by Typesafe/Lightbend
• 2023+ used in gaming, fintech, logistics...
• Actor guards the aggregate root
• strongly consistent
• transactional boundary
• Mailbox
• updates one at a time
• implemented without locking
• Stateful actors reside on multiple cluster nodes
• Sharding/routing map (see the sketch below)
• Event-sourced persistence
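A sketch of the sharding/routing side, assuming Akka Cluster Sharding (Typed); GroceryListEntity.Command and the GroceryListEntity(entityId) behaviour factory refer to the hypothetical entity sketched a little further below:

object GroceryListSharding {
  import akka.actor.typed.ActorSystem
  import akka.cluster.sharding.typed.scaladsl.{ClusterSharding, Entity, EntityTypeKey}

  // One entity type key per aggregate root type
  val TypeKey: EntityTypeKey[GroceryListEntity.Command] =
    EntityTypeKey[GroceryListEntity.Command]("GroceryList")

  // Register the entity: the cluster keeps at most one live actor per list id
  def init(system: ActorSystem[_]): Unit =
    ClusterSharding(system).init(
      Entity(TypeKey)(ctx => GroceryListEntity(ctx.entityId))
    )

  // Routing: messages for list "1234" reach that single instance,
  // whichever node its shard currently lives on
  def listRef(system: ActorSystem[_], listId: String) =
    ClusterSharding(system).entityRefFor(TypeKey, listId)
}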
• 1. Actor receives one message (one at a time, from its mailbox)
• 2. Message is validated against the current state
• 3. A valid message produces an event
• 4. The event is written to durable storage
• 5. The event is applied to the state
• 6. Side effects are triggered (see the sketch below)
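Those six steps map almost one-to-one onto an event-sourced actor; a sketch using Akka Persistence Typed, with hypothetical commands, events and a deliberately simplified state for the grocery-list example:

object GroceryListEntity {
  import akka.actor.typed.Behavior
  import akka.persistence.typed.PersistenceId
  import akka.persistence.typed.scaladsl.{Effect, EventSourcedBehavior}

  sealed trait Command
  final case class AddItem(id: String, name: String) extends Command

  sealed trait Event
  final case class ItemAdded(id: String, name: String) extends Event

  // simplified versus the full GroceryList model above
  final case class State(items: Map[String, String])

  def apply(listId: String): Behavior[Command] =
    EventSourcedBehavior[Command, Event, State](
      persistenceId = PersistenceId("GroceryList", listId),
      emptyState = State(Map.empty),
      // Steps 1-4: take one command from the mailbox, validate it against
      // the current state, emit an event, persist it to the journal.
      commandHandler = (state, cmd) =>
        cmd match {
          case AddItem(id, _) if state.items.contains(id) =>
            Effect.none // invalid: duplicate item, no event
          case AddItem(id, name) =>
            Effect.persist(ItemAdded(id, name))
              // Step 6: side effects run only after the event is durable
              .thenRun(_ => println(s"item $name added to $listId"))
        },
      // Step 5: the persisted event is applied to the in-memory state
      eventHandler = (state, evt) =>
        evt match {
          case ItemAdded(id, name) => state.copy(items = state.items + (id -> name))
        }
    )
}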
• Sequence of immutable records
• Primary source of truth
• Append-only non-blocking writes
• SSTable-based storages: Cassandra/ScyllaDB, AWS DynamoDB, Google Bigtable, Azure Cosmos DB
• Actors get deactivated...
• ...and then need to resurrect
• State recovered from the event journal
• Optionally, snapshots every N events (see the snippet below)
• ... for high-traffic and/or large states
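With Akka Persistence Typed, periodic snapshotting is a retention setting on the behaviour; a small snippet assuming the entity sketched earlier:

object Snapshotting {
  import akka.persistence.typed.scaladsl.{EventSourcedBehavior, RetentionCriteria}
  import GroceryListEntity.{Command, Event, State}

  // Snapshot every 100 events, keep the last two snapshots; recovery then
  // replays at most ~100 events on top of the newest snapshot.
  def withSnapshots(b: EventSourcedBehavior[Command, Event, State]): EventSourcedBehavior[Command, Event, State] =
    b.withRetention(RetentionCriteria.snapshotEvery(numberOfEvents = 100, keepNSnapshots = 2))
}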
• Exact copy of the state - wherever you want it
• ...CQRS - as it is meant to be
• Full or partial
• Which state do you need?
• Latest - need it fast, past not relevant
• Exact - can wait, need all of the events
• Relational or quasi-relational database
• ...Postgres, MySQL, ...ClickHouse (see the projection sketch below)
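A sketch of such a read-side projection, assuming the entity types from earlier; the grocery_items table and the delivery mechanism for events are assumptions, and the upsert keeps only the latest state:

import java.sql.Connection
import GroceryListEntity.{Event, ItemAdded}

// Read-side projection: each event updates a query-friendly table.
// How events arrive (Akka Projection, a Kafka consumer, ...) is out of scope here.
final class GroceryListProjection(conn: Connection) {
  def handle(listId: String, event: Event): Unit = event match {
    case ItemAdded(itemId, name) =>
      val stmt = conn.prepareStatement(
        """INSERT INTO grocery_items (list_id, item_id, name, completed)
          |VALUES (?, ?, ?, false)
          |ON CONFLICT (list_id, item_id) DO UPDATE SET name = EXCLUDED.name""".stripMargin)
      stmt.setString(1, listId)
      stmt.setString(2, itemId)
      stmt.setString(3, name)
      stmt.executeUpdate()
  }
}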
• States are vastly different
• Journal performance read vs. write
• Risk of split-brain
• Complexity of custom implementation
• Data lifecycle
• Lack of experienced engineers
• No single platform covering everything
• State size - hundreds of bytes to several MB
• State event flows - 1/hour to 1,000/sec
• Downstream latency SLA - minutes to milliseconds
• Can happen in any multi-node cluster
• Two instances of a single stateful actor - two versions of reality
• Mitigation by automatic detection/recovery
• Downstream latency is critical for CQRS
• ... p99 <100ms (better <10ms)
• Append-only storages do have their limits
• ...excellent at writes, poor at reads
• CDC with guarantees, single source of truth
• Solution:
• stream to an indexed storage
• hybrid recovery with a marker (sketched below)
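A conceptual sketch of that hybrid recovery (not the actual Kafka Journal API): events go to a fast append-only log, a background replicator copies them into an indexed store and records how far it got (the marker); recovery reads the bulk of history from the indexed store and only the short tail past the marker from the log.

object HybridRecovery {
  final case class Marker(replicatedUpToSeqNr: Long)

  trait IndexedStore[E] {
    def marker(id: String): Marker
    def readUpTo(id: String, seqNr: Long): Vector[E] // cheap, indexed by entity id
  }

  trait AppendLog[E] {
    def readFrom(id: String, seqNr: Long): Vector[E] // only the unreplicated tail
  }

  def recover[E](id: String, indexed: IndexedStore[E], log: AppendLog[E]): Vector[E] = {
    val m = indexed.marker(id)
    indexed.readUpTo(id, m.replicatedUpToSeqNr) ++ log.readFrom(id, m.replicatedUpToSeqNr + 1)
  }
}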
• Look ma, no ORM!
• Good fit for functional programming (see the sketch below)
• (State, Message) => Seq[Event]
• (State, Event) => State
• (StateOld, StateNew, Event) => Seq[SideEffect]
• Projections/lenses
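Spelled out for the grocery-list types from the earlier sketch; SideEffect and the function names are illustrative:

object GroceryListLogic {
  import GroceryListEntity.{AddItem, Command, Event, ItemAdded, State}

  sealed trait SideEffect
  final case class NotifyDownstream(event: Event) extends SideEffect

  // (State, Message) => Seq[Event]: validate the command, decide which events it yields
  def decide(state: State, cmd: Command): Seq[Event] = cmd match {
    case AddItem(id, _) if state.items.contains(id) => Seq.empty // duplicate -> rejected
    case AddItem(id, name)                          => Seq(ItemAdded(id, name))
  }

  // (State, Event) => State: pure transition, reused verbatim during recovery
  def evolve(state: State, event: Event): State = event match {
    case ItemAdded(id, name) => state.copy(items = state.items + (id -> name))
  }

  // (StateOld, StateNew, Event) => Seq[SideEffect]: effects derived from the change
  def react(oldState: State, newState: State, event: Event): Seq[SideEffect] =
    if (newState != oldState) Seq(NotifyDownstream(event)) else Seq.empty
}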
• Already optimised
• Processing in memory
• Read side separated from primary states
• Scale by adding nodes to running cluster
• the size of one state must fit in a single process...
• ...or proceed to splitting the aggregate roots
• Reducing codec overhead (JSON -> binary)
• Scalable stateful high-load with guarantees
• ...fintech, logistics, gaming
• multi-billion cap companies
• Stack: Akka/Cassandra/Kafka (Java or Scala)
• Open-source Kafka Journal (Evolution)