Day 2 Problems in
CQRS & Event Sourcing
A from-the-trenches talk from an Axon user
-- Joris Kuipers
About Me
 Joris Kuipers
 @jkuipers
 CTO,
hands-on architect and
fly-by-night Spring trainer
@ Trifork Amsterdam
Disclaimer
 More than 5 years of Axon experience
 But last hands-on over a year ago already
 Currently working on a stateless set of
microservices
 I highly recommend making state someone else’s
problem
 May not have all the answers, but have lots of
questions you hadn’t thought about yet…
Why CQRS & Event Sourcing?
 Allows DDD approach, focus on business
 Scale reads & writes separately
 Audit trails
 Create new projections after the fact
 Decouples event publishers and consumers,
allows splitting monoliths into services
After the Honeymoon Phase is
Over
 Business changed its mind: remodel the
aggregate split
 Downtime with every deployment
 And we have to deploy all services at once!
 What used to be a Flyway migration is now a
custom tool firing custom commands
Deploying a Typical Axon App Update
What goes into a typical update in terms of CQRS
and event sourcing?
 New events introduced
 Existing events expanded
 Aggregate state expanded
 Query projections added / updated
Introducing New Events
 Problem: old version of the app doesn’t know
about them
 Might choke on them in case of blue/green deploy or
rollback
 Problem: external event processors don’t
know about them
 Might choke on them if not updated ahead of time
Postel’s Law
Kuipers’ Events Corollary
Be conservative in what
you publish publicly, be
liberal in what you accept
from your past & future self
Eventing Rule Of Thumb
 “If you don’t know about it,
you probably don’t care about it”
 Ignore what you don’t know
 Don’t throw exceptions for unknown event types or
unexpected fields
 Generally works well for external event processors
 Needs forwards compatibility from old app
versions
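The rule of thumb above can be sketched for an external event processor as follows. This is plain Java rather than Axon's API, and the `TolerantEventProcessor` class, the event type names, and the payload-as-map representation are all hypothetical; it only illustrates "ignore what you don't know".

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical processor: dispatches on an event type name and skips what it does not know.
final class TolerantEventProcessor {
    private final Map<String, Consumer<Map<String, Object>>> handlers = new HashMap<>();
    private int ignored = 0;

    // Register a handler for one event type this consumer cares about.
    void on(String eventType, Consumer<Map<String, Object>> handler) {
        handlers.put(eventType, handler);
    }

    // Returns true if the event was handled, false if its type was unknown and skipped.
    boolean process(String eventType, Map<String, Object> payload) {
        Consumer<Map<String, Object>> handler = handlers.get(eventType);
        if (handler == null) {
            ignored++;      // unknown type: count it (or log it), but do not throw
            return false;
        }
        handler.accept(payload); // handlers read only the fields they know about
        return true;
    }

    int ignoredCount() {
        return ignored;
    }
}
```

Counting ignored events instead of dropping them silently is a design choice of this sketch: it keeps the consumer tolerant while still making a missed update visible in monitoring.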
App Version Compatibility
 By default event sourcing fails for unknown event
types -> old versions can’t deal
 Don’t just ignore unknown events:
you have to design for forwards compatibility
 This is super hard and requires tons of testing
 My experience: often not done
 Just accept downtime
 Aim for roll-forward only, emergency backups for rollbacks
External Event Processor Tips
 Tag event types and have stable set of base
fields
 Consumers can make sense of events without
knowing or caring about their details
 Works very well when using events as triggers
 “Need to inform 3rd parties that something changed
in medicine prescriptions-related info of medical
record for patient with SSN xxx”
Unmarshalling Tagged Events
 Requires customized setup
 When exact type unavailable, create some
type of base event
 Can’t assume consumers have complete knowledge
of Java type hierarchy
 Also useful for non-Java consumers
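A minimal sketch of the fallback described above, assuming events arrive as a generic key/value structure (for example, parsed JSON). The field names `type`, `aggregateId` and `tags`, and the classes `BaseEvent`, `PrescriptionChanged` and `TaggedEventUnmarshaller`, are hypothetical and not part of Axon.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Stable base fields every published event is assumed to carry (hypothetical names).
class BaseEvent {
    final String type;
    final String aggregateId;
    final List<String> tags;

    BaseEvent(String type, String aggregateId, List<String> tags) {
        this.type = type;
        this.aggregateId = aggregateId;
        this.tags = List.copyOf(tags);
    }
}

// A specific event type this consumer happens to know about.
final class PrescriptionChanged extends BaseEvent {
    final String medicine;

    PrescriptionChanged(String aggregateId, List<String> tags, String medicine) {
        super("PrescriptionChanged", aggregateId, tags);
        this.medicine = medicine;
    }
}

final class TaggedEventUnmarshaller {
    private final Map<String, Function<Map<String, Object>, BaseEvent>> known = new HashMap<>();

    void register(String type, Function<Map<String, Object>, BaseEvent> factory) {
        known.put(type, factory);
    }

    // Exact type if registered; otherwise a BaseEvent built from the stable fields only.
    BaseEvent unmarshal(Map<String, Object> raw) {
        String type = (String) raw.get("type");
        Function<Map<String, Object>, BaseEvent> factory = known.get(type);
        if (factory != null) {
            return factory.apply(raw);
        }
        return new BaseEvent(type, (String) raw.get("aggregateId"), tagsOf(raw));
    }

    // Reads the "tags" field defensively; a missing or malformed field yields an empty list.
    static List<String> tagsOf(Map<String, Object> raw) {
        List<String> result = new ArrayList<>();
        Object tags = raw.get("tags");
        if (tags instanceof List<?>) {
            for (Object tag : (List<?>) tags) {
                result.add(String.valueOf(tag));
            }
        }
        return result;
    }
}
```

A consumer that only needs a trigger can then act on `tags` and `aggregateId` of the fallback `BaseEvent` without knowing the concrete event class.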
External Event Processor Tips
 Distinguish between internal and external
public events
 Latter need to be more stable: treat like API
 Contract-first and schema evolution rules apply
 Consider explicit milestone events
 Lower granularity, reduces coupling by not requiring
consumers to share business logic
Changed Events
 New fields may be unknown to consumers
 Easier than unknown event types typically:
simply ignore
 Should be trivial to configure in unmarshallers
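As an illustration of tolerating changed events in both directions, here is a small plain-Java sketch with hypothetical `AddressChangedV1` and `AddressChangedV2` readers over a key/value payload. In a real unmarshaller this is usually a configuration switch; in Jackson, for instance, it corresponds to disabling `DeserializationFeature.FAIL_ON_UNKNOWN_PROPERTIES`.

```java
import java.util.Map;

// Hypothetical old reader: knows only the two fields that existed when it was written.
final class AddressChangedV1 {
    final String customerId;
    final String city;

    private AddressChangedV1(String customerId, String city) {
        this.customerId = customerId;
        this.city = city;
    }

    // Reads only the known fields; anything else in the payload is ignored.
    static AddressChangedV1 from(Map<String, Object> fields) {
        return new AddressChangedV1((String) fields.get("customerId"), (String) fields.get("city"));
    }
}

// Hypothetical newer reader: also knows countryCode, but tolerates old events without it.
final class AddressChangedV2 {
    final String customerId;
    final String city;
    final String countryCode;

    private AddressChangedV2(String customerId, String city, String countryCode) {
        this.customerId = customerId;
        this.city = city;
        this.countryCode = countryCode;
    }

    static AddressChangedV2 from(Map<String, Object> fields) {
        // "UNKNOWN" is an assumed default for events written before the field existed.
        String country = (String) fields.getOrDefault("countryCode", "UNKNOWN");
        return new AddressChangedV2((String) fields.get("customerId"), (String) fields.get("city"), country);
    }
}
```

The old reader accepts events from its future self by ignoring `countryCode`; the new reader accepts events from its past self by defaulting it.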
Expanded Aggregate State
 Aggregates might have additional state in new
app version
 Updated business logic needing data not kept before
 Means old snapshots no longer apply
 Leads to two problems:
 New version needs to recreate all snapshots
 Blue-green or rollback requires different snapshots
per version
Event Sourcing Overhead
 Pure event sourcing is ridiculously inefficient
 Basically: answer “how are you” by reliving every day
since you were born
 Snapshots are often not just an optimization,
but hard necessity
 Also: think about aggregate caching!
Dealing With Invalid Snapshots
 Easiest: just truncate old snapshots, let new
ones be created on the fly
 Doesn’t work for blue-green, but you probably weren’t
doing that anyway
 Doesn’t scale with many events per aggregate
 Create new snapshots up-front using (part of) new app
 Good news: Axon 4+ supports versioned
snapshots
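The idea behind versioned snapshots can be sketched in plain Java as below. This is not Axon's mechanism or API: `CounterAggregate`, `CounterSnapshot` and the `SNAPSHOT_REVISION` constant are hypothetical, and the event store is reduced to a list of numbers. A snapshot is used only when its revision matches what the running code expects; otherwise the loader falls back to a full replay.

```java
import java.util.List;

// Hypothetical snapshot: aggregate state as of `sequence`, stamped with the
// revision of the code that wrote it.
final class CounterSnapshot {
    final String revision;
    final long sequence;
    final long total;

    CounterSnapshot(String revision, long sequence, long total) {
        this.revision = revision;
        this.sequence = sequence;
        this.total = total;
    }
}

final class CounterAggregate {
    // Bump whenever the shape of the aggregate state changes (assumed convention).
    static final String SNAPSHOT_REVISION = "2";

    long total;
    long eventsApplied;

    void apply(long delta) {
        total += delta;
        eventsApplied++;
    }

    // events.get(i) is the event with sequence number i; snapshot may be null.
    static CounterAggregate load(List<Long> events, CounterSnapshot snapshot) {
        CounterAggregate aggregate = new CounterAggregate();
        int start = 0;
        if (snapshot != null && SNAPSHOT_REVISION.equals(snapshot.revision)) {
            // Compatible snapshot: restore state and replay only the newer events.
            aggregate.total = snapshot.total;
            start = (int) snapshot.sequence + 1;
        }
        // Stale or missing snapshot: start stays 0, so this is a full replay.
        for (int i = start; i < events.size(); i++) {
            aggregate.apply(events.get(i));
        }
        return aggregate;
    }
}
```

With a matching snapshot only the tail of the stream is applied; with a stale one the result is the same but every event is replayed, which is exactly the cost that hurts for aggregates with many events.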
Query Projection Changes
 Very common during updates
 Possible solution: delete and recreate using
replay
 Became easier with tracking event processors
 Like snapshots, often not realistic to do on the fly /
during deploy
Query Projection Changes
 In reality: custom commands and events might
be necessary
 Way more efficient
 Can even cheat and update query projections
directly
 Remember how easy CRUD apps were?
 Ensure that replays remain possible by also having
events when needed
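To illustrate why direct projection updates need a matching event, here is a small plain-Java sketch. `CustomerRenamed`, `CustomerNameProjection` and the list standing in for the event store are hypothetical, not Axon types.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical event: a customer was renamed.
final class CustomerRenamed {
    final String customerId;
    final String newName;

    CustomerRenamed(String customerId, String newName) {
        this.customerId = customerId;
        this.newName = newName;
    }
}

final class CustomerNameProjection {
    private final Map<String, String> names = new HashMap<>();

    // Normal path: the read model is updated from an event.
    void on(CustomerRenamed event) {
        names.put(event.customerId, event.newName);
    }

    // "Cheating": patch the read model directly, bypassing the event store.
    void patchDirectly(String customerId, String name) {
        names.put(customerId, name);
    }

    Map<String, String> view() {
        return Map.copyOf(names);
    }

    // Delete-and-recreate: build a fresh projection by replaying all stored events.
    static CustomerNameProjection rebuild(List<CustomerRenamed> eventStore) {
        CustomerNameProjection fresh = new CustomerNameProjection();
        for (CustomerRenamed event : eventStore) {
            fresh.on(event);
        }
        return fresh;
    }
}
```

If `patchDirectly` is used without also storing an equivalent event, a later `rebuild` silently loses the patch; storing the event alongside the shortcut keeps replays truthful.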
Deployment Planning
 Update external event handlers first
 Then update core app / service
 Then update clients sending new commands
 Think about if / when tool might need to run
 Pre deploy, e.g. to create new snapshots
 As part of deploy, e.g. to update projections
 Post deploy, e.g. to fire custom commands
Deployment Automation
 Inter-dependencies and need for
custom tooling complicate deploys
 Automate what you can
 Document and practice what you can’t
 One-offs are not always worth fully automating
 Build a framework over time
Conclusion
 Architecture is about making trade-offs
 Think about when CQRS / Event Sourcing pays off
 Consider the need for CD / no-downtime deploys
 Plan for the additional complexity
 Take Kuipers’ corollary into account
 Test, don’t just hope for the best
 Share your solutions!
