Microservices are one of those polarising concepts that technologists either love or hate. Splitting applications into autonomous units clearly has advantages, but larger service-based systems tend to struggle as the interactions between the services grow.
At the core of this sits a dichotomy: data systems are designed to make data as accessible as possible. Services, on the other hand, actively encapsulate it. These two forces inevitably compete in the architectures we build. By understanding this dichotomy we can better reason about how services should be sewn together, striking a balance between the ability to adapt quickly and the loose coupling we need to retain autonomy over the long term.
In this talk we'll examine how Stateful Stream Processing can be used to build Event Driven Services, using a distributed log like Apache Kafka. In doing so, the data dichotomy is balanced by an architecture that exhibits demonstrably better scaling properties, whether the pressure comes from increased complexity, team size, data volume or velocity.
19. Business Service → authorise() → Single Sign On
It’s unlikely that a Business Service would need the internal SSO state / function to change.
26. Databases amplify the data they hold
Service: data on inside, data on outside; interface hides data
Database: data on inside, data on outside; interface amplifies data
32. Service, Interface, Database
Either (1) we constantly add to the interface, as datasets grow:
getOpenOrders(fulfilled=false, deliveryLocation=CA, orderValue=100, operator=GreaterThan)
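The growing interface on this slide can be sketched in a few lines. This is a minimal illustration, not the speaker's code: the `Order` fields, the in-memory `ORDERS` list and the `LessThan` operator are assumptions added for the example; only the parameter names mirror the slide's `getOpenOrders` call.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Order:
    order_id: str
    fulfilled: bool
    delivery_location: str
    order_value: float


# Hypothetical data standing in for the service's private database.
ORDERS = [
    Order("o1", False, "CA", 250.0),
    Order("o2", False, "CA", 40.0),
    Order("o3", True, "CA", 900.0),
    Order("o4", False, "NY", 300.0),
]

# Comparison operators the contract has had to learn about so far.
COMPARISONS = {
    "GreaterThan": lambda a, b: a > b,
    "LessThan": lambda a, b: a < b,
}


def get_open_orders(
    fulfilled: bool = False,
    delivery_location: Optional[str] = None,
    order_value: Optional[float] = None,
    operator: str = "GreaterThan",
) -> List[Order]:
    """Each new question a client asks becomes another parameter here."""
    result = []
    for order in ORDERS:
        if order.fulfilled != fulfilled:
            continue
        if delivery_location is not None and order.delivery_location != delivery_location:
            continue
        if order_value is not None and not COMPARISONS[operator](order.order_value, order_value):
            continue
        result.append(order)
    return result
```

Every additional filter a consumer needs (date ranges, customer segments, sort orders) would mean another parameter and another release of the service, which is the pressure the slide describes.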
39. Cycle of inadequacy (flowchart, entered at "Start here"; labels as they appear on the slide):
• Nice neat services
• Service contract too limiting
• Can we change both services together, easily? (YES / NO)
• Broaden contract (Eek $$$)
• Is it a shared database? (YES / NO)
• Frack it! Just give me ALL the data
• Data diverges (many different versions of the same facts)
• Let's encapsulate
• Let's centralise to one copy
43. Aside: Commands, Events & Queries
• Command (order widget): state change
• Event (widget ordered)
• Query (get my orders): no state change
Labels: Request driven, Event driven, Logic
=> a set of very different architectures
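The three message types on this slide can be told apart in a small sketch. The class and method names below (`OrderService`, `WidgetOrdered`, `published`) are hypothetical and chosen only to match the slide's examples; a real service would publish events to a log rather than to an in-memory list.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class WidgetOrdered:
    """Event: an immutable fact describing something that already happened."""
    user: str
    widget: str


@dataclass
class OrderService:
    orders: Dict[str, List[str]] = field(default_factory=dict)
    published: List[WidgetOrdered] = field(default_factory=list)

    def order_widget(self, user: str, widget: str) -> WidgetOrdered:
        """Command: a request to do something; it changes state."""
        self.orders.setdefault(user, []).append(widget)
        event = WidgetOrdered(user, widget)
        self.published.append(event)  # stand-in for broadcasting the event
        return event

    def get_my_orders(self, user: str) -> List[str]:
        """Query: reads state and returns a copy; it changes nothing."""
        return list(self.orders.get(user, []))
```

Calling `order_widget("alice", "sprocket")` alters the service and emits a `WidgetOrdered` fact, while `get_my_orders("alice")` can be repeated any number of times without side effects.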
55. Window / Tables Cached on disk in your Service
Diagram labels: Kafka; stream; compacted stream; Join (Stream Data, Stream-Tabular Data); Tables / Views; Domain Logic; Event Driven Service
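A rough sketch of the idea on this slide, in plain Python rather than the Kafka Streams API: a compacted stream is folded into a local table (latest value per key), and a second stream is joined against it by local lookup. The function names and record shapes are assumptions for illustration. One simplification to note: the table is fully materialised before the join runs, whereas a streaming engine would update it continuously as records arrive.

```python
from typing import Dict, Iterable, List, Optional, Tuple

Record = Tuple[str, Optional[dict]]


def materialise_table(compacted_stream: Iterable[Record]) -> Dict[str, dict]:
    """Fold a keyed changelog into a table: the latest value per key wins.

    A value of None is treated as a delete marker for that key.
    """
    table: Dict[str, dict] = {}
    for key, value in compacted_stream:
        if value is None:
            table.pop(key, None)
        else:
            table[key] = value
    return table


def join_stream_to_table(stream: Iterable[Record], table: Dict[str, dict]) -> List[dict]:
    """Inner join: enrich each stream record with the table row for its key."""
    joined = []
    for key, value in stream:
        if value is not None and key in table:
            joined.append({**value, **table[key]})
    return joined


# Hypothetical data: a customer changelog and a stream of orders keyed by customer.
customers = [
    ("c1", {"region": "EU"}),
    ("c2", {"region": "US"}),
    ("c1", {"region": "UK"}),  # later update for c1 replaces the earlier row
    ("c2", None),              # c2 is deleted
]
orders = [("c1", {"order": "o1"}), ("c2", {"order": "o2"})]

enriched = join_stream_to_table(orders, materialise_table(customers))
```

Because the table lives inside the service, the join is a local lookup rather than a remote call to whichever service owns the customer data.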
56. Avg(o.time – p.time)
From orders, payment, user
Group by user.region
over 1 day window
emitting every second
Streams & Tables In → Stream Processing Engine → Streams & Tables Out
Diagram labels: Streams, Table, Derived “Table”
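To make the query on this slide concrete, here is a batch approximation in plain Python. It is a sketch under stated assumptions: the record layouts, the function name and the use of a single fixed window starting at `window_start` are all hypothetical, and it computes one result rather than emitting an updated result every second as a stream processing engine would.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

DAY = 24 * 60 * 60  # window length in seconds


def avg_time_gap_by_region(
    orders: List[Tuple[str, str, float]],  # (order_id, user_id, time)
    payments: List[Tuple[str, float]],     # (order_id, time)
    users: Dict[str, str],                 # user_id -> region
    window_start: float,
) -> Dict[str, float]:
    """Avg(o.time - p.time) grouped by user.region over one 1-day window."""
    window_end = window_start + DAY
    payment_time = dict(payments)
    gaps = defaultdict(list)
    for order_id, user_id, o_time in orders:
        if not (window_start <= o_time < window_end):
            continue  # order falls outside the window
        if order_id not in payment_time or user_id not in users:
            continue  # inner join: skip orders with no payment or unknown user
        gaps[users[user_id]].append(o_time - payment_time[order_id])
    return {region: sum(values) / len(values) for region, values in gaps.items()}
```

The result is itself a small table keyed by region, which corresponds to the derived “Table” in the diagram.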
57. Join & Process the outputs of many different services
Diagram labels: Email Service; Legacy App; Orders, Payments, Stock; KSTREAMS; Kafka
72. Start with an event-driven data layer, overlay Stateful Stream Processing (SSP) to (a) process events and (b) create views
Diagram labels: World Wide Web; Payment Gateway; Finance; Fulfillment; Web Query; Web Command
80. Set your data free!
Data should be available.
Data should be under your control.
81. Event Driven Services
• Broadcast events
• Retain them in the log
• Compose functions with a streaming engine
• Build views when you need to
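The four bullets above can be sketched with an in-memory stand-in for the log. Everything here is illustrative: `Log`, `only`, `count_by` and `compose` are hypothetical helpers, not part of Kafka or any streaming library, and a real log would be durable and partitioned.

```python
from typing import Callable, Dict, List


class Log:
    """Append-only, retained sequence of events; readers choose their own offset."""

    def __init__(self) -> None:
        self._events: List[dict] = []

    def append(self, event: dict) -> int:
        """Broadcast an event by appending it; returns the event's offset."""
        self._events.append(event)
        return len(self._events) - 1

    def read(self, from_offset: int = 0) -> List[dict]:
        """Events are retained, so any reader can replay from any offset."""
        return list(self._events[from_offset:])


def only(event_type: str) -> Callable[[List[dict]], List[dict]]:
    """Processing step: keep only events of one type."""
    return lambda events: [e for e in events if e["type"] == event_type]


def count_by(key: str) -> Callable[[List[dict]], Dict[str, int]]:
    """Processing step: build a view that counts events per value of `key`."""
    def step(events: List[dict]) -> Dict[str, int]:
        counts: Dict[str, int] = {}
        for event in events:
            counts[event[key]] = counts.get(event[key], 0) + 1
        return counts
    return step


def compose(*steps: Callable) -> Callable:
    """Chain processing steps, feeding each step's output into the next."""
    def run(events):
        for step in steps:
            events = step(events)
        return events
    return run
```

A service that starts later can still build its view by replaying from offset 0, for example `compose(only("widget_ordered"), count_by("user"))(log.read())`, because the events were retained rather than consumed.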
82. WIRED Principles
• Wimpy: Start simple, lightweight and fault tolerant.
• Immutable: Build a retentive, shared narrative.
• Reactive: Leverage asynchronicity. Focus on the now.
• Evolutionary: Use only the data you need today.
• Decentralized: Receiver driven. Coordination avoiding. No God services.
83. References
• Stopford: The Data Dichotomy: https://www.confluent.io/blog/data-dichotomy-rethinking-the-way-we-treat-data-and-services/
• Kleppmann: Turning the Database Inside Out: https://www.confluent.io/blog/turning-the-database-inside-out-with-apache-samza/
• Helland: Immutability Changes Everything: http://cidrdb.org/cidr2015/Papers/CIDR15_Paper16.pdf
• Kreps: The Log: https://engineering.linkedin.com/distributed-systems/log-what-every-software-engineer-should-know-about-real-time-datas-unifying
• Narkhede: Event Sourcing, CQRS & Stream Processing: https://www.confluent.io/blog/event-sourcing-cqrs-stream-processing-apache-kafka-whats-connection/