Event-Sourcing Microservices on the JVM

Event-Sourcing
Microservices
on the JVM at the
Norwegian Tax Authority

Concept
Implementation
Operation

0
Account
credit 100
credit 50
debit 200
credit 150
2018/05/02 14:30
credited 100 to account
2018/05/02 18:15
2018/05/05 10:00
debited 200 from account
2018/05/06 12:00
100 150 -50 100
Logging events

0
Account
credit 100
credit 50
debit 200
credit 150
100 150 -50 100 0
Audit
100
150
-50
State auditing

0
Snapshot
100 150 -50 100
credit 100
credit 50
debit 200
credit 150
credited 100
Events
credited 50
debited 200
credited 150
Event sourcing

0
Snapshot
100 150 -50 100
credit 100
credit 50
debit 200
credit 150
credited 100
Events
credited 50
debited 200
credited 150
Event sourcing: resetting snapshots

0
Snapshot
100 150 300
debit 200
credit 150
credited 100
Events
credited 50
overrun
credited 150
credit 100
credit 50
Event sourcing: events and commands
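
The account example above can be sketched in code. This is a minimal illustration, assuming hypothetical names (AccountExample, Command, Event, handle, apply, replay) that do not come from the presented system: a command states an intent, the handler decides which event to record, and the snapshot is recomputed by replaying the events.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the account example; not the presented implementation.
class AccountExample {

    static final class Command {
        final String type;   // "credit" or "debit"
        final long amount;
        Command(String type, long amount) { this.type = type; this.amount = amount; }
    }

    static final class Event {
        final String type;   // "credited", "debited" or "overrun"
        final long amount;
        Event(String type, long amount) { this.type = type; this.amount = amount; }
    }

    // Maps a command to the event that records what actually happened.
    static Event handle(Command command, long balance) {
        if (command.type.equals("credit")) {
            return new Event("credited", command.amount);
        }
        // A debit that exceeds the balance is recorded as an overrun.
        return balance >= command.amount
                ? new Event("debited", command.amount)
                : new Event("overrun", command.amount);
    }

    // Folds a single event into the snapshot (here: the balance).
    static long apply(long balance, Event event) {
        switch (event.type) {
            case "credited": return balance + event.amount;
            case "debited": return balance - event.amount;
            default: return balance; // an overrun leaves the balance unchanged
        }
    }

    // Rebuilds the snapshot from scratch by replaying all events.
    static long replay(List<Event> events) {
        long balance = 0;
        for (Event event : events) {
            balance = apply(balance, event);
        }
        return balance;
    }

    // Processes commands in order and returns the resulting events.
    static List<Event> process(List<Command> commands) {
        List<Event> events = new ArrayList<>();
        long balance = 0;
        for (Command command : commands) {
            Event event = handle(command, balance);
            events.add(event);
            balance = apply(balance, event);
        }
        return events;
    }
}
```

With the commands from the slide (credit 100, credit 50, debit 200, credit 150), the third command yields an overrun event and the replayed balances are 100, 150, 150 and 300.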

Command query [responsibility] segregation (CQ[R]S)
commands
do not query state
queries
do not change state

Who is paying taxes in Norway?
passport
foreigner id
citizen id
international
locally registered
own
employ
own
relate

“Partsregister”: tracking taxable entities in Norway
folkeregister
enhetsregister
Toll
Skatteetaten
event store
id (part)
management
search
details
relationships
exports
legacy register
This is a schematic view only.

The promise of event-sourcing and our experience
• Event-sourcing allows you to easily change snapshot representation
• unless you did not sufficiently future-proof event capture
• Event-sourcing makes snapshots redundant by replaying events
• unless the event-processing code changes
• Event-sourcing implies full auditability of your application
• unless an error happens during command-to-event processing
• Event-sourcing offers an easy way of debugging applications
• unless events are trivial compared to command input
• Event-sourcing is an easy gateway to share-nothing architecture
• but only if you could shard your data in the first place
Disclaimer: our approach could be described as a combination of event sourcing
and “command sourcing” with limited capability for scaling writes. But for us, this
solution works great!

1105995521418Some Man 1047100000Oslo 1503XXXXX185719...
1702842193749Some Woman 9755384654Drammen 9456 A529184...
1105995521494 00000Drammen 0000XXXXX000000...
1105995521494 00000Drammen 0000XXXXX000000...
1105995521494 00000Drammen 0000XXXXX000000...
1105995521494 00000Drammen 0000XXXXX000000...
1105995521494 00000Drammen 0000XXXXX000000...
1105995521494 00000Drammen 0000XXXXX000000...
folkeregister
Why “command sourcing”?
The presented file formats are simplified for didactical reasons.
event store
{
"fnr": "11059955214",
"name": "Some Man"
}
{
"fnr": "11059955214",
"name": "gome Man"
}
{
"fnr": "11059955214",
"name": "Some Man",
"city": "Oslo"
}

1105995521494 00000Drammen 0000XXXXX000000...
folkeregister
Persist events for mistakes that need explicit correction
The presented file formats are simplified for didactical reasons.
event store
{
"fnr": "11059955214",
"pnr": "9950174"
}

{
"fnr": "11059955214",
"name": "Some Man",
"city": "Oslo"
}
Event-dependent state and sequence numbers
{
"fnr": "11059955214",
"name": "Some Man",
"city": "Drammen"
}
/part/9950174/1 /part/9950174
event store
11059955214
Some Man
Oslo
17028421937
Some Woman
Drammen
11059955214
Some Man
Drammen
sequence: 1 sequence: 2 sequence: 3
11059955214
Some Man
Oslo
11059955214
Some Man
Drammen
/part/9950174/3 /part/9950174/2

/rel/7573509
Using sequence numbers for dealing with eventual consistency
X-Sequence: 2
{
"owner": "9950174"
}
/part/9950174/2
{
"fnr": "11059955214",
"name": "Some Man",
"city": "Oslo"
}
event store
last-event: 3 last-event: 2

/rel/7573509
Using sequence numbers for dealing with eventual consistency
X-Sequence: 3
{
"owner": "9950174"
}
/part/9950174/3
BAD REQUEST:
{
"sequence": "2"
}
event store
last-event: 2 last-event: 3
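
One reading of the two diagrams above: a service answers a read at a requested sequence only if it has already processed that sequence; otherwise it rejects the request and reports how far it has come. A minimal sketch under that assumption, with hypothetical names (SequenceAwareService, Response) and HTTP reduced to plain status codes:

```java
// Hypothetical sketch; the real services expose this behavior over HTTP.
class SequenceAwareService {

    static final class Response {
        final int status;     // 200 = OK, 400 = BAD REQUEST
        final long sequence;  // sequence the response refers to
        Response(int status, long sequence) { this.status = status; this.sequence = sequence; }
    }

    // Sequence number of the last event this service has processed.
    private long lastEvent;

    // Called whenever the service has applied another event from the event store.
    void onEventProcessed(long sequence) {
        lastEvent = Math.max(lastEvent, sequence);
    }

    // Serves state as of requestedSequence, or rejects if the service lags behind.
    Response read(long requestedSequence) {
        if (requestedSequence > lastEvent) {
            // Not caught up yet: tell the caller which sequence is available.
            return new Response(400, lastEvent);
        }
        return new Response(200, requestedSequence);
    }
}
```

A caller that receives the rejection can retry later or fall back to the reported sequence.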

9950174
sequence 3
9950174
sequence 1
{
"fnr": "11059955214",
"name": "Some Man",
"city": "Drammen"
}
{
"fnr": "11059955214",
"name": "Some Man",
"city": "grammen"
}
event store
11059955214
Some Man
Oslo
17028421937
Some Woman
Drammen
11059955214
Some Man
Drammen
11059955214
Some Man
Oslo
11059955214
Some Man
Drammen
Publishing thin change feeds to expose application state
294851
sequence 2
{
"fnr": "11059955214",
"name": "Some Man",
"city": "Oslo"
}
{
"fnr": "17028421937",
"name": "Some Woman",
"city": "Drammen"
}
/part/9950174/1 /part/294851/2 /part/9950174/3
9950174
sequence 3
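
A thin feed, as pictured above, carries only an aggregate id and a sequence number; consumers follow the link to fetch the state. The following sketch assumes hypothetical names (ThinChangeFeed, Entry, publish, after) and takes the /part/{id}/{sequence} link shape from the slide:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a thin change feed; entries hold no aggregate data.
class ThinChangeFeed {

    static final class Entry {
        final String id;
        final long sequence;
        Entry(String id, long sequence) { this.id = id; this.sequence = sequence; }

        // Link under which the state at this sequence can be fetched.
        String link() { return "/part/" + id + "/" + sequence; }
    }

    private final List<Entry> entries = new ArrayList<>();

    // Announces that the aggregate with the given id changed at the given sequence.
    void publish(String id, long sequence) {
        entries.add(new Entry(id, sequence));
    }

    // Returns all entries with a sequence greater than the given one.
    List<Entry> after(long sequence) {
        List<Entry> result = new ArrayList<>();
        for (Entry entry : entries) {
            if (entry.sequence > sequence) {
                result.add(entry);
            }
        }
        return result;
    }
}
```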

Revisioning aggregates for idempotency
{
"fnr": "11059955214",
"name": "Some Man",
"city": "Drammen"
}
{
"fnr": "11059955214",
"name": "Some Man",
"city": "grammen"
}
{
"fnr": "11059955214",
"name": "Some Man",
"city": "Oslo"
}
{
"fnr": "17028421937",
"city": "Drammen"
}
/part/9950174/1 /part/294851/2 /part/9950174/3
{
"fnr": "11059955214",
"name": "Some Man",
"city": "Drammen"
}
{
"fnr": "11059955214",
"name": "Some Man",
"city": "Oslo"
}
{
"fnr": "17028421937",
"city": "Drammen"
}
diff diff diff
17028421937
Some Woman
Drammen
11059955214
Some Man
Oslo
11059955214
Some Man
Drammen
reprocess
/part/9950174/3/1
/part/9950174/3[/2]
/part/294851/2/1 /part/9950174/1/1
event store
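
The diff step in the diagram above can be sketched as follows: when events are reprocessed, a new revision is stored only if the recomputed aggregate differs from the latest stored one, so that reprocessing unchanged data is a no-op. All names (RevisionedAggregateStore, write, read) are assumptions for illustration, and aggregates are plain strings.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch of revisioned aggregates addressed as /part/{id}/{sequence}/{revision}.
class RevisionedAggregateStore {

    // Key "id/sequence" maps to its revisions; list index 0 holds revision 1.
    private final Map<String, List<String>> revisions = new HashMap<>();

    // Stores the aggregate and returns the revision number under which it is available.
    int write(String id, long sequence, String aggregate) {
        List<String> existing = revisions.computeIfAbsent(id + "/" + sequence, key -> new ArrayList<>());
        if (!existing.isEmpty() && existing.get(existing.size() - 1).equals(aggregate)) {
            return existing.size(); // no difference: keep the current revision
        }
        existing.add(aggregate);
        return existing.size();
    }

    // Reads a specific revision of an aggregate at a given sequence.
    Optional<String> read(String id, long sequence, int revision) {
        List<String> existing = revisions.get(id + "/" + sequence);
        if (existing == null || revision < 1 || revision > existing.size()) {
            return Optional.empty();
        }
        return Optional.of(existing.get(revision - 1));
    }
}
```

In the slide's example, reprocessing after a fix changes only the aggregate at sequence 3, which therefore receives revision 2 while the others stay at revision 1.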

9950174
sequence 3
9950174
sequence 1
revision 1
{
"fnr": "11059955214",
"name": "Some Man",
"city": "Drammen"
}
{
"fnr": "11059955214",
"name": "Some Man",
"city": "grammen"
}
Publishing revisions in a feed
294851
sequence 2
revision 1
{
"fnr": "11059955214",
"name": "Some Man",
"city": "Oslo"
}
{
"fnr": "17028421937",
"city": "Drammen"
}
/part/9950174/1/1 /part/294851/2/1 /part/9950174/3/1
9950174
sequence 3
revision 1
9950174
sequence 3
revision 2
{
"fnr": "11059955214",
"name": "Some Man",
"city": "Drammen"
}
/part/9950174/3/2

Using the event store as a single source of truth
event store
read commands read events
recover/replicate
events
write events
Advantages of "command sourcing":
1. Self-healing state after any bug fix without any user management.
2. Only command-to-event mapping is domain-specific code.
3. Minimal probability to misinterpret events after updates.
Downside: command-to-event processing must be stateless to allow reprocessing.
Revision-sensitive event observers can often remedy this limitation.

Event UIDs for idempotency of write operations
1105995521494 00000Drammen 0000XXXXX000000...
event store
Fnr:
11059955214
Event id:
18
Name:
Some Man
City:
Oslo
Fnr:
17028421937
Event id:
49
Name:
Some Woman
City:
Drammen
Fnr:
1105995521
Event id:
94
City:
Drammen
folkeregister
fr:fileABC:1 fr:fileABC:2 fr:fileABC:3
Unique keys can also be chosen as UUIDs for live commands.
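
A minimal sketch of this idea, assuming a hypothetical in-memory log (IdempotentEventLog): an event is appended only if its uid has not been seen before, so importing the same file line twice has no further effect. The uid format is taken from the slide.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; a database would enforce the same rule with a uniqueness check on uid.
class IdempotentEventLog {

    private final Set<String> seenUids = new HashSet<>();
    private final List<String> events = new ArrayList<>();

    // Appends the event unless its uid was written before; returns true if appended.
    boolean write(String uid, String value) {
        if (!seenUids.add(uid)) {
            return false; // duplicate write: ignore
        }
        events.add(value);
        return true;
    }

    int size() {
        return events.size();
    }
}
```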

/part/749572
{
"name": "Some Company"
}
{
"name": "Some Company",
"last_id": "gf01Ha"
}
{
"name": "Some Company",
"last_id": "df57Ha"
}
Part:
749572
Name:
Other Name
Part:
749572
Name:
Other Name
Last id:
gf01Ha
Part:
749572
Name:
Yet Another Name
Part:
749572
Name:
Yet Another name
Last id:
gf01Ha
Using event UIDs as optimistic locks
event store
46sjGF
df57fF
/part/749572
Part:
749572
Name:
Other Name
Last id:
gf01Ha
Part:
749572
Name:
Yet Another name
Last id:
gf01Ha
df57fF
/part/749572/df57fF
/part/749572/46sjGF
Event UIDs are non-numeric to avoid confusion with sequence numbers.
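
The locking scheme above might look as follows in code. This is a sketch under assumptions: the class and method names are hypothetical, and a write is accepted only if the uid the writer last saw is still the latest one for that aggregate.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical sketch of optimistic locking via event uids.
class OptimisticLockingLog {

    // Aggregate id mapped to the uid of the last event written for it.
    private final Map<String, String> lastUid = new HashMap<>();

    // expectedLastUid is null if the writer assumes that no event exists yet.
    // Returns true if the event was accepted, false if the writer's view was stale.
    boolean write(String aggregateId, String expectedLastUid, String newUid) {
        String current = lastUid.get(aggregateId);
        if (!Objects.equals(current, expectedLastUid)) {
            return false; // someone else wrote in between
        }
        lastUid.put(aggregateId, newUid);
        return true;
    }
}
```

Two writers that both read last id gf01Ha cannot both succeed: the first write moves the latest uid forward, and the second is rejected until it reloads the state.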

Deleting events and compaction events
Why would you want to delete events?
1. Because you want to.
Storage space is not free after all.
2. Because you should.
Storing obsolete personal data makes you a target for attackers and is immoral.
3. Because you have to.
Laws like the GDPR demand physical erasure.

Deleting events with tombstones
event store
17028421937
Some Woman
Drammen
11059955214
Some Man
Oslo
11059955214
Some Man
Drammen
11059955214
[tombstone]
{
"fnr": "11059955214",
"name": "Some Man",
"city": "Drammen"
}
{
"fnr": "11059955214",
"name": "Some Man",
"city": "Oslo"
}
{
"fnr": "17028421937",
"city": "Drammen"
}
/part/9950174/1 /part/294851/2 /part/9950174/3
Tombstones must not be deleted themselves to allow for propagation to all services.
For this reason, it is crucial to choose primary identifiers that do not contain personal data (unlike a fødselsnummer).
Ideally, an internal, synthetic identifier is used as a proxy for each personal identifier.
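
A sketch of deletion with tombstones, assuming a hypothetical in-memory log (TombstoneLog) keyed by synthetic identifiers: deleting removes all earlier events for the id and appends a tombstone that carries only the id, so downstream services can learn about the erasure.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; a null value marks a tombstone.
class TombstoneLog {

    static final class Event {
        final long sequence;
        final String id;     // synthetic identifier without personal data
        final String value;  // null for a tombstone
        Event(long sequence, String id, String value) {
            this.sequence = sequence;
            this.id = id;
            this.value = value;
        }
        boolean isTombstone() { return value == null; }
    }

    private final List<Event> events = new ArrayList<>();
    private long nextSequence = 1;

    void append(String id, String value) {
        events.add(new Event(nextSequence++, id, value));
    }

    // Physically erases all events of the id and records a tombstone in their place.
    void delete(String id) {
        events.removeIf(event -> event.id.equals(id));
        events.add(new Event(nextSequence++, id, null));
    }

    // Returns all remaining events with a sequence greater than the given one.
    List<Event> read(long afterSequence) {
        List<Event> result = new ArrayList<>();
        for (Event event : events) {
            if (event.sequence > afterSequence) {
                result.add(event);
            }
        }
        return result;
    }
}
```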

Compacting events with compaction events
17028421937
Some Woman
Drammen
11059955214
Some Man
Oslo
11059955214
Some Man
Drammen
11059955214
Some Man
Drammen
[compaction: 3]
{
"fnr": "11059955214",
"name": "Some Man",
"city": "Drammen"
}
{
"fnr": "11059955214",
"name": "Some Man",
"city": "Oslo"
}
{
"fnr": "17028421937",
"city": "Drammen"
}
/part/9950174/1 /part/294851/2 /part/9950174/3
{
"fnr": "11059955214",
"name": "Some Man",
"city": "Oslo",
"compacted": "3"
}
/part/9950174/1
Can be represented by same database entity.
event store

What is out there?
API wrapper for MongoDB.
Originates from the .NET space.
Java client but Scala-oriented.
Java framework for CQRS.
Strict command and event separation.
Support for JDBC-integration.
Append-only database.
Only recently published.
DIY at Skatteetaten. Reasons for choice:
1. Performance.
Streaming has a high overhead for mass processing.
Need for microbatching to allow for microservice orchestration.
2. Complexity
Event sourcing is not yet mainstream. APIs often feel immature.
Event stores often aim for distributability at the cost of simplicity.
3. Loose command-to-aggregate mapping
Many frameworks assume that there exists an obvious mapping
of any command to an aggregate.

class Event {
long sequence; // 0 if not set
String uid;
String id;
String type; // XML namespace id
String value; // XML
}
Events and event stores
interface EventStore {
Stream<Event> read(long afterSequence);
ClosableConsumer<Event> write();
}
EventStore source, target;
try (Stream<Event> stream = source.read(0);
ClosableConsumer<Event> consumer = target.write()) {
stream.forEach(consumer);
}
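
For illustration, an in-memory implementation of the interface above could look like this. It is a sketch, not the Skatteetaten code: ClosableConsumer is not a JDK type, so it is declared here as an assumption, and the duplicate check on uid is a simple linear scan.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.stream.Stream;

// Hypothetical sketch; types are nested to keep the example self-contained.
class InMemoryEventStoreSketch {

    static class Event {
        long sequence; // 0 if not set
        String uid;
        String id;
        String type;
        String value;
        Event(String uid, String id, String type, String value) {
            this.uid = uid;
            this.id = id;
            this.type = type;
            this.value = value;
        }
    }

    // Assumed shape: a consumer that can be used in try-with-resources.
    interface ClosableConsumer<T> extends Consumer<T>, AutoCloseable {
        @Override
        void close(); // narrowed so that no checked exception is thrown
    }

    interface EventStore {
        Stream<Event> read(long afterSequence);
        ClosableConsumer<Event> write();
    }

    static class InMemoryEventStore implements EventStore {
        private final List<Event> events = new ArrayList<>();

        @Override
        public synchronized Stream<Event> read(long afterSequence) {
            // Copy first so that the stream is unaffected by later writes.
            return new ArrayList<>(events).stream()
                    .filter(event -> event.sequence > afterSequence);
        }

        @Override
        public ClosableConsumer<Event> write() {
            return new ClosableConsumer<Event>() {
                @Override
                public void accept(Event event) { append(event); }
                @Override
                public void close() { } // nothing to release in memory
            };
        }

        private synchronized void append(Event event) {
            for (Event existing : events) {
                if (existing.uid.equals(event.uid)) {
                    return; // uid already stored: ignore the duplicate
                }
            }
            Event stored = new Event(event.uid, event.id, event.type, event.value);
            stored.sequence = events.size() + 1; // the store assigns the sequence
            events.add(stored);
        }
    }
}
```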

class SQLEventStore implements EventStore
class InMemoryEventStore implements EventStore
class HttpEventStore implements EventStore
Events and event stores
LOCK TABLE events;
INSERT INTO events (sequence, uid, id, type, value)
SELECT seq.NEXTVAL, ?, ?, ?, ?
FROM dual
WHERE ? NOT IN (SELECT uid FROM events)
SELECT *
FROM events
WHERE sequence > 0
FETCH FIRST 1000 ROWS ONLY
SELECT /*+ index(events seq) */ *
FROM events
WHERE sequence > 0
FETCH FIRST 1000 ROWS ONLY

interface AggregateStore {
Optional<String> read(String id, long sequence);
}
interface WriteableAggregateStore extends AggregateStore {
void write(String id, long sequence, String aggregate);
}
Aggregates and aggregate stores
EventStore source;
WriteableAggregateStore target; // must be writeable, as aggregates are stored below
try (Stream<Event> stream = source.read(0)) {
    stream.forEach(event -> {
        // Update the latest known aggregate, or create one if none exists yet.
        String aggregate = target.read(event.id, event.sequence)
            .map(current -> Domain.updateAggregate(current, event.value))
            .orElseGet(() -> Domain.newAggregate(event.value));
        target.write(event.id, event.sequence, aggregate);
    });
}
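
As an illustration of the aggregate store interfaces, here is a sketch of an in-memory variant. It assumes that read returns the aggregate with the highest sequence at or below the requested one; the class InMemoryAggregateStoreSketch and its internals are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.NavigableMap;
import java.util.Optional;
import java.util.TreeMap;

// Hypothetical sketch; interfaces are nested to keep the example self-contained.
class InMemoryAggregateStoreSketch {

    interface AggregateStore {
        Optional<String> read(String id, long sequence);
    }

    interface WriteableAggregateStore extends AggregateStore {
        void write(String id, long sequence, String aggregate);
    }

    static class InMemoryAggregateStore implements WriteableAggregateStore {

        // Per aggregate id: all stored versions, ordered by sequence.
        private final Map<String, NavigableMap<Long, String>> aggregates = new HashMap<>();

        @Override
        public Optional<String> read(String id, long sequence) {
            NavigableMap<Long, String> versions = aggregates.get(id);
            if (versions == null) {
                return Optional.empty();
            }
            // floorEntry yields the version with the greatest sequence <= the requested one.
            Map.Entry<Long, String> entry = versions.floorEntry(sequence);
            if (entry == null) {
                return Optional.empty();
            }
            return Optional.of(entry.getValue());
        }

        @Override
        public void write(String id, long sequence, String aggregate) {
            aggregates.computeIfAbsent(id, key -> new TreeMap<>()).put(sequence, aggregate);
        }
    }
}
```

Keeping every version rather than only the latest one is what makes sequence-addressed reads such as /part/9950174/2 possible.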

class SQLAggregateStore implements WriteableAggregateStore
class InMemoryAggregateStore implements WriteableAggregateStore
class HttpAggregateStore implements AggregateStore
Aggregates and aggregate stores
SELECT s.id, s.value
FROM aggregates s
INNER JOIN (
SELECT MAX(sequence) ms, id
FROM aggregates
WHERE sequence <= ?
GROUP BY id
) t
ON s.id = t.id
AND s.sequence = t.ms
WHERE s.id = ?
INSERT INTO aggregates (sequence, id, value)
VALUES (?, ?, ?)

Testing
event store
<events>
<event>
<id>fnsdjFD94d</id>
<type>sample-event</type>
<value>some-event</value>
</event>
</events>
{
"state": "some-event"
}
supply
assert

Test-automation
test: example
timeout: 10000
applications:
- eventstore
- identity-management
- folkeregister-export
given:
- application: eventstore
POST: ./some-events.xml
when:
- application: folkeregister-export
GET: /info/sequence
text: 10
then:
- application: folkeregister-export
GET: /part/947652
json: ./some-result.json
example.yml
~$ part-test-cli example.yml
PartTestRunner extends JUnitRunner

/events/1380 /events/0 /events/1000
Polling or pushing events
event store
event store
/socket/0
require 1000
event store
/events/1320
/udp/broadcast
Interval-polling:
- Adds network overhead
- Interval adds latency
- Serves as a heartbeat
- Simple and works well with few consumers
Websockets:
- Allows for reactive programming
- Fast processing can break micro-batching
- Optimizes for low-latency
- Slow in unstable networks
Broadcast-triggered polling:
- Avoids long-lasting connections
- Scales better with consumer count
- Adds latency in unstable networks
- Broadcast can be delayed under high load
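
Interval polling with micro-batches can be sketched like this. The names are hypothetical, events are reduced to their sequence numbers, and the fetch function stands in for an HTTP call such as GET /events/{sequence}.

```java
import java.util.List;
import java.util.function.Consumer;
import java.util.function.LongFunction;

// Hypothetical sketch of a polling consumer that processes events in batches.
class IntervalPoller {

    // Fetches a bounded batch of event sequences after the given sequence.
    private final LongFunction<List<Long>> fetchAfter;
    // Processes one batch as a unit.
    private final Consumer<List<Long>> processBatch;
    private long lastSequence;

    IntervalPoller(LongFunction<List<Long>> fetchAfter, Consumer<List<Long>> processBatch) {
        this.fetchAfter = fetchAfter;
        this.processBatch = processBatch;
    }

    // One polling round; returns the number of events processed.
    int pollOnce() {
        List<Long> batch = fetchAfter.apply(lastSequence);
        if (batch.isEmpty()) {
            return 0; // nothing new; an empty round still acts as a heartbeat
        }
        processBatch.accept(batch);
        // Advance only after the whole batch was processed.
        lastSequence = batch.get(batch.size() - 1);
        return batch.size();
    }

    long lastSequence() {
        return lastSequence;
    }
}
```

In a running service, pollOnce could be scheduled at a fixed delay, for example with a ScheduledExecutorService; the delay is the latency cost named in the list above.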

Things to mention
1. We do not process single events.
Instead of real streaming, we apply "micro-batching".
Without it, HTTP calls between microservices would hang up our system.
2. We do not use transactions.
In case of an error, we simply reset an aggregate store to the last known sequence id.
This also allows us to use multiple databases such as Oracle/Elasticsearch without XA.
3. We have cut some corners.
To save time and money, not everything presented is implemented at Skatteetaten.
4. Asynchronicity and eventual consistency are optional concepts.
By processing messages as they arrive, it is possible to implement an event-sourced
application without eventually consistent state.

The event store as a bottleneck
writer 1
writer 2 reader 2
reader 1
event store

Scaling reads by event store replication
writer 1
writer 2
reader 2
reader 1
event store
event store
mirror 2
event store
mirror 1
reader 4
reader 3

Scaling reads by splitting reader responsibility
writer 1
writer 2
reader 1
(aggregator)
event store
reader 2
(aggregator)
reader 1 (API)
reader 1 (API)
reader 2 (API)
reader 2 (API)

Scaling writes via buffers (with priority)
writer 1
writer 2 reader 2
reader 1
event store
buffer 2
buffer 1

Share-nothing event store
writer 1
writer 2 reader 2
reader 1
event store
(key space 1)
event store
(key space 2)
broadcast
expiration
requests
redirect
requests
sequence mod: 1
sequence mod: 0
ks1
ks1
ks2
ks2

Observing event-processing of distributed services

Things to mention
1. Full partitioning (sharding) conflicts with a total store order of all events.
Message log systems such as Kafka use partitions to achieve performance.
This might hinder future services that want to aggregate events of different partitions.
2. Beware of time-based sequencing.
Databases like MongoDB generate ordering ids based on the system clock.
Timers are not fully reliable, even when using NTP.
3. We split reader responsibility into aggregator and API for blue/green deployment.
As we do not require full versioning of all components, a parallel deployment of an
application allows us to recreate a "fixed" version that can replace an older version.
4. Operating microservices requires a significant amount of resources.
HTTP and (un-)marshalling are expensive operations. While enabling scalability,
distributed architecture requires a baseline of additional resources to match the
level of centralized applications.

http://rafael.codes
@rafaelcodes
http://documents4j.com
https://github.com/documents4j/documents4j
http://bytebuddy.net
https://github.com/raphw/byte-buddy
