Devoxx London 2017 - Rethinking Services With Stateful Streams
Microservices are one of those polarising concepts that technologists either love or hate. Splitting applications into autonomous units clearly has advantages, but larger service-based systems tend to struggle as the interactions between the services grow.
At the core of this sits a dichotomy: Data systems are designed to make data as accessible as possible. Services, on the other hand, actively encapsulate. These two forces inevitably compete in the architectures we build. By understanding this dichotomy we can better reason about how services should be sewn together. We strike a balance between the ability to adapt quickly and the loose coupling we need to retain autonomy, long term.
In this talk we'll examine how Stateful Stream Processing can be used to build Event Driven Services, using a distributed log like Apache Kafka. In doing so, this Data Dichotomy is balanced with an architecture that exhibits demonstrably better scaling properties, whether the growth is in complexity, team size, data volume or velocity.
Less Surface Area = Less Coupling
Consider Two Objects in one address space
Encapsulation => Loose Coupling
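As a minimal sketch of the point above (the `Orders`/`Statement` names echo the next slide; the code itself is illustrative, not from the talk): two objects in one address space couple only through a narrow public surface, so the internal representation stays free to change.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch: Statement depends only on the narrow public
// surface of Orders (getOpenOrders), never on its internal state.
class Orders {
    // Private representation: free to change without affecting callers.
    private final List<String> open = new ArrayList<>();

    void place(String orderId) { open.add(orderId); }

    // The only surface area exposed to other objects.
    List<String> getOpenOrders() { return List.copyOf(open); }
}

class Statement {
    String render(Orders orders) {
        return "Open orders: " + orders.getOpenOrders();
    }
}
```

Less surface area means fewer ways for a change inside `Orders` to ripple into `Statement`.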
Orders Service ← getOpenOrders() ← Statement Service
Both are Independently Deployable.
Building apps is easy; cross-service changes are painful.
Services work best when changes are isolated in a single service.
Example: a Business Service calls authorise() on a Single Sign On service. It's unlikely that a Business Service would need the internal SSO state / function to change.
Data diverges over time
The more mutable copies there are, the more data will diverge over time.
Cycle of inadequacy:
• "Can we change the schema?" / "Is it a breaking change?"
• "Just give me ALL the data!"
• Versions of the same dataset multiply.
• Pressure builds to return to one copy.
These forces compete in the systems we build.
Two core patterns of system design: Request Driven and Event Driven
=> a set of very different architectures
Request Driven ("Get my orders") vs. Event Driven
Aside: Commands, Events & Queries
Events are both a fact and a trigger.
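One hedged way to make the distinction concrete (the type names here are illustrative, not an API from the talk): a command asks a service to do something and may be rejected; an event records an immutable fact that already happened; a query asks for information and has no side effects.

```java
// Illustrative message types; the names are assumptions, not the talk's API.

// Command: a request to perform an action (imperative; may be rejected).
record PlaceOrder(String orderId, double amount) {}

// Event: an immutable fact, named in the past tense. It is both a fact
// (it happened) and a trigger (downstream services can react to it).
record OrderPlaced(String orderId, double amount, long timestamp) {}

// Query: a request for information; it changes nothing.
record GetOpenOrders(String customerId) {}
```

In an event driven system the events become the shared, durable narrative; commands and queries stay local between a caller and a single service.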
Cached Views Suffice
LEAVE DATA IN KAFKA LONG TERM
-> Services only need caches
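A minimal in-memory sketch of why caches suffice (plain Java standing in for a retained Kafka topic and its consumer; names are illustrative): because events are kept in the log long term, a service can rebuild its local view at any time by replaying from the beginning, so that view is disposable.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Stand-in for a retained Kafka topic: an append-only list of events.
record OrderEvent(String orderId, String status) {}

class OrderCache {
    // The service's local view: just a cache, rebuilt from the log.
    final Map<String, String> statusById = new LinkedHashMap<>();

    // Replay the whole log from offset 0. Later events overwrite earlier
    // ones per key, so the cache converges to the latest state.
    void replay(List<OrderEvent> log) {
        statusById.clear();
        for (OrderEvent e : log) {
            statusById.put(e.orderId(), e.status());
        }
    }
}
```

Lose the cache, redeploy the service, or change the view's shape: in each case you replay, rather than migrate state.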
Now add Stream Processing
A machine for combining and processing streams of events
A Query Engine inside your service
State Store → Outbound Stream
Built-in disk overflow: a maintained cache embedded inside your service.
Windows / Tables cached on disk in your service.
Event Driven Service
Avg(o.time - p.time)
from orders, payment, user
group by user.region
over 1 day window
emitting every second
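The query above can be sketched in plain Java (an in-memory stand-in for the stream processing engine; the field names are assumptions): join orders to payments on order id, look up each user's region, and average o.time - p.time per region. A real engine would additionally window by time and emit results continuously.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

record Order(String orderId, String userId, long time) {}
record Payment(String orderId, long time) {}

class RegionLatency {
    // Avg(o.time - p.time) from orders, payment, user, grouped by region.
    static Map<String, Double> average(List<Order> orders,
                                       List<Payment> payments,
                                       Map<String, String> userRegion) {
        // Index payments by order id for the join.
        Map<String, Long> paymentTime = new HashMap<>();
        for (Payment p : payments) paymentTime.put(p.orderId(), p.time());

        Map<String, long[]> sumCount = new HashMap<>(); // region -> {sum, n}
        for (Order o : orders) {
            Long pt = paymentTime.get(o.orderId()); // join on order id
            if (pt == null) continue;               // unmatched order: skip
            String region = userRegion.get(o.userId());
            long[] sc = sumCount.computeIfAbsent(region, r -> new long[2]);
            sc[0] += o.time() - pt;
            sc[1] += 1;
        }
        Map<String, Double> avg = new HashMap<>();
        sumCount.forEach((r, sc) -> avg.put(r, (double) sc[0] / sc[1]));
        return avg;
    }
}
```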
Streams & Tables in → Stream Processing Engine → Streams & Tables out
Join & process the outputs of many services: Orders, Payments, Stock
Start with an event-driven data layer, then overlay Stateful Stream Processing to (a) process events and (b) create views.
Set your data free!
Data should be available.
Data should be under your control.
• Broadcast events
• Retain them in the log
• Compose functions with a stream processing engine
• Build views when you need to
Event Driven Services
• Wimpy: Start simple, lightweight and fault tolerant.
• Immutable: Build a retentive, shared narrative.
• Reactive: Leverage Asynchronicity. Focus on the now.
• Evolutionary: Use only the data you need today.
• Decentralized: Receiver driven. Coordination avoiding.
• Stopford: The Data Dichotomy:
• Kleppmann: Turning the Database Inside Out:
• Helland: Immutability Changes Everything: http://cidrdb.org/cidr2015/Papers/CIDR15_Paper16.pdf
• Kreps: The Log:
• Narkhede: Event Sourcing, CQRS & Stream Processing:
Blog of this talk at