Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.
Rethinking Services with
Stateful Streams
Ben Stopford
@benstopford
What are
microservices
really about?
GUI
UI
Service
Orders
Service
Returns
Service
Fulfilment
Service
Payment
Service
Stock
Service
Splitting the Monolith?
Sin...
Orders
Service
Autonomy?
Orders
Service
Stock
Service
Email
Service
Fulfillment
Service
Independently
Deployable
Independently
Deployable
Independe...
Independence is
where services get
their value
Allows Scaling
Scaling in terms of people
What happens when we grow?
Companies are inevitably a
collection of applications
They must work together to some degree
Interconnection is an
afterthought
FTP / Enterprise Messaging etc
Internal World
External world
External world
External World
External world
Service
Boundary
Services Force Us To Consider ...
Internal World
External world
External world
External World
External world
Service
Boundary
External World is something we...
Independence
comes at a cost
$$$
Orders
Object
Statement
ObjectgetOpenOrders(),
Less Surface Area = Less Coupling
Encapsulate
State
Encapsulate
Functionali...
Change 1
Change 2
Redeploy
Singly-deployable
apps are easy
Orders Service Statement ServicegetOpenOrders(),
Independently Deployable Independently Deployable
Synchronized
changes ar...
Services work best
where requirements
are isolated in a single
bounded context
Single Sign On Business Serviceauthorise(),
It’s unlikely that a Business
Service would need the
internal SSO state /
func...
SSO has a tightly
bounded context
But business
services are
different
Catalog Authorisation
Most business services share the
same core datasets.
Most services
live in here
The futures of
business services
are far more
tightly intertwined.
We need encapsulation to hide
internal state. Be loosely coupled.
But we need the freedom to slice
& dice shared data like...
But data systems
have little to do
with encapsulation
Service Database
Data on
inside
Data on
outside
Data on
inside
Data on
outside
Interface
hides data
Interface
amplifies
da...
The data dichotomy
Data systems are about exposing data.
Services are about hiding it.
Microservices shouldn’t
share a database
Good Advice!
So what do we do
instead?
Service
Interface
Database
Data on
inside
Data on
outside
We wrap a database in a service
interface
One of two things
happens next
Service
Interface
Database
Either (1) we constantly add to the interface,
as datasets grow
getOpenOrders(
fulfilled=false,...
getOrder(Id)
getOrder(UserId)
getAllOpenOrders()
getAllOrdersUnfulfilled(Id)
getAllOrders()
(1) Services can end up lookin...
...and DATA amplifies
this “Data-Service”
problem
(2) Give up and move whole datasets
en masse
getAllOrders()
Data Copied
Internally
Data Service Business Service
This leads to a
different problem
Data diverges over time
Orders
Service
Stock
Service
Email
Service
Fulfill-
ment
Service
The more
mutable copies,
the more data will
diverge over time
Nice neat
services
Service
contract too
limiting
Can we change
both services
together, easily?
NO
Broaden Contract
Eek $$$...
These forces compete in
the systems we build
Divergence
Accessibility
Coupling
Is there a better
way?
Two core patterns
of system design
Request Driven Event Driven
Logic
=> a set of very different architectures
Event
(widget ordered)
Query
(Get my orders)
State
change
No state
change
Request...
Request Driven -> highest Coupling
Commands &
Queries
Event Driven:
Interact through Events,
Don't talk to Services
Orders
Service
Bus
Event Broadcast => lowest coupling
Stream of
events
Do whatever
I like J
Very
Independent
Kafka changes things a bit
Kafka: a Streaming Platform
The Log ConnectorsConnectors
Producer Consumer
Streaming Engine
Service Backbone
Scalable, Fault Tolerant, Concurrent, Strongly Ordered, Retentive
A place to keep the
data-on-the-outside
Orders Customers
Payments
Stock
Events are the
shared dataset
Cached Views Suffice
LEAVE DATA IN KAFKA LONG TERM
-> Servic...
Now add Stream Processing
A machine for combining and
processing streams of events
A Query Engine inside your service
Join Filter
Aggr-
egate
View
Window
Kafka
KTable
Inbound
Stream
(Windowed)
State Store Outbound Stream
Changelog
Stream
DSL!
Built in disk overflow – A mainta...
Window / Tables Cached on disk in your Service
stream
Compacted
stream
Join
Stream Data
Stream-Tabular
Data
Domain Logic
T...
Avg(o.time – p.time)
From orders, payment, user
Group by user.region
over 1 day window
emitting every second
Streams & Tab...
Join & Process the outputs of many
different services
Email Service
Legacy App
Orders Payments Stock
KSTREAMS
Kafka
Two core
types of event
driven service
Type 1: Event Service
PaymentsOrders
Stock
Event Driven Service
Join outputs from
many other services
Orders
Service
Kafka
Interactive Queries API
Context
Boundary
Type 2: Query Service
So we have
shared storage in
the Log, and a
query engine
layered on top
Data Storage + Query Engine == Database?
Data lives here
KAFKA
MY SERVICE
Query lives
here
Customer
Alteration
Service
MI Fraud Fulfillment
Data lives
here
KAFKA
UI View
Queries
processed in
each service
1 x Data ...
A Database, Inside Out
Historical
Streams
Services
embed a
view
Services
embed a
view
Services embed a view
Services embed...
Microservices
shouldn’t share a
database
But this isn’t a
normal database
Orders
Service
Stock
Service
Kafka
Fulfilment
Service
UI
Service
Fraud
Service
Remember: Event Broadcast
has the lowest co...
It decentralizes
responsibility for
query processing
Orders
Service
Stock
Service
Kafka
Fulfilment
Service
UI
Service
Fraud
Service
Centralizing immutable data
doesn’t affect ...
To share a database,
turn it inside out!
Shared
database
Service
Interfaces
Log-Backed
Streaming
J!
L!
Event
Broadcast
Coupling
(Autonomy)
Accessibility Integrity
...
--------------------------- World Wide Web ---------------------------
Payment
Gateway
FinanceFulfillment
Web
Query
Web
Co...
So…
Good architectures have little to do
with this:
It’s about how systems
evolves over time
Hang simple
services in a
decoupled
ecosystem
Ecosystem that embraces
shared data
Historical
Streams
Break the ties of request
driven approaches
Designing for DATA on the outside
Set your data free!
Data should be available.
Data should be under your control.
•  Broadcast events
•  Retain them in the log
•  Compose functions with a
streaming engine
•  Build views when you need to...
WIRED Principals
•  Wimpy: Start simple, lightweight and fault tolerant.
•  Immutable: Build a retentive, shared narrative...
References
•  Stopford: The Data Dichotomy:
https://www.confluent.io/blog/data-dichotomy-rethinking-the-way-we-treat-data-...
Twitter:
@benstopford
Blog of this talk at
http://confluent.io/blog
Devoxx London 2017 - Rethinking Services With Stateful Streams
Upcoming SlideShare
Loading in …5
×

Devoxx London 2017 - Rethinking Services With Stateful Streams

4,977 views

Published on

Microservices are one of those polarising concepts that technologists either love or hate. Splitting applications into autonomous units clearly has advantages, but larger service-based systems tend to struggle as the interactions between the services grow.

At the core of this sits a dichotomy: Data systems are designed to make data as accessible as possible. Services, on the other hand, actively encapsulate. These two forces inevitably compete in the architectures we build. By understanding this dichotomy we can better reason about how services should be sewn together. We strike a balance between the ability to adapt quickly and the loose coupling we need to retain autonomy, long term.

In this talk we'll examine how Stateful Stream Processing can be used to build Event Driven Services, using a distributed log like Apache Kafka. In doing so this Data-Dichotomy is balanced with an architecture that exhibits demonstrably better scaling properties, be it increased complexity, team size, data volume or velocity.

Published in: Technology

Devoxx London 2017 - Rethinking Services With Stateful Streams

  1. 1. Rethinking Services with Stateful Streams Ben Stopford @benstopford
  2. 2. What are microservices really about?
  3. 3. GUI UI Service Orders Service Returns Service Fulfilment Service Payment Service Stock Service Splitting the Monolith? Single Process /Code base /Deployment Many Processes /Code bases /Deployments
  4. 4. Orders Service Autonomy?
  5. 5. Orders Service Stock Service Email Service Fulfillment Service Independently Deployable Independently Deployable Independently Deployable Independently Deployable
  6. 6. Independence is where services get their value
  7. 7. Allows Scaling
  8. 8. Scaling in terms of people What happens when we grow?
  9. 9. Companies are inevitably a collection of applications They must work together to some degree
  10. 10. Interconnection is an afterthought FTP / Enterprise Messaging etc
  11. 11. Internal World External world External world External World External world Service Boundary Services Force Us To Consider The External World
  12. 12. Internal World External world External world External World External world Service Boundary External World is something we should Design For
  13. 13. Independence comes at a cost $$$
  14. 14. Orders Object Statement ObjectgetOpenOrders(), Less Surface Area = Less Coupling Encapsulate State Encapsulate Functionality Consider Two Objects in one address space Encapsulation => Loose Coupling
  15. 15. Change 1 Change 2 Redeploy Singly-deployable apps are easy
  16. 16. Orders Service Statement ServicegetOpenOrders(), Independently Deployable Independently Deployable Synchronized changes are painful
  17. 17. Services work best where requirements are isolated in a single bounded context
  18. 18. Single Sign On Business Serviceauthorise(), It’s unlikely that a Business Service would need the internal SSO state / function to change Single Sign On
  19. 19. SSO has a tightly bounded context
  20. 20. But business services are different
  21. 21. Catalog Authorisation Most business services share the same core datasets. Most services live in here
  22. 22. The futures of business services are far more tightly intertwined.
  23. 23. We need encapsulation to hide internal state. Be loosely coupled. But we need the freedom to slice & dice shared data like any other dataset
  24. 24. But data systems have little to do with encapsulation
  25. 25. Service Database Data on inside Data on outside Data on inside Data on outside Interface hides data Interface amplifies data Databases amplify the data they hold
  26. 26. The data dichotomy Data systems are about exposing data. Services are about hiding it.
  27. 27. Microservices shouldn’t share a database Good Advice!
  28. 28. So what do we do instead?
  29. 29. Service Interface Database Data on inside Data on outside We wrap a database in a service interface
  30. 30. One of two things happens next
  31. 31. Service Interface Database Either (1) we constantly add to the interface, as datasets grow getOpenOrders( fulfilled=false, deliveryLocation=CA, orderValue=100, operator=GreatherThan)
  32. 32. getOrder(Id) getOrder(UserId) getAllOpenOrders() getAllOrdersUnfulfilled(Id) getAllOrders() (1) Services can end up looking like kookie home-grown databases
  33. 33. ...and DATA amplifies this “Data-Service” problem
  34. 34. (2) Give up and move whole datasets en masse getAllOrders() Data Copied Internally Data Service Business Service
  35. 35. This leads to a different problem
  36. 36. Data diverges over time Orders Service Stock Service Email Service Fulfill- ment Service
  37. 37. The more mutable copies, the more data will diverge over time
  38. 38. Nice neat services Service contract too limiting Can we change both services together, easily? NO Broaden Contract Eek $$$ NO YES Is it a shared Database? Frack it! Just give me ALL the data Data diverges. (many different versions of the same facts) Lets encapsulate Lets centralise to one copy YES Start here Cycle of inadequacy:
  39. 39. These forces compete in the systems we build Divergence Accessibility Coupling
  40. 40. Is there a better way?
  41. 41. Two core patterns of system design Request Driven Event Driven Logic
  42. 42. => a set of very different architectures Event (widget ordered) Query (Get my orders) State change No state change Request driven Event Driven Logic Aside: Commands, Events & Queries Command (order widget)
  43. 43. Request Driven -> highest Coupling Commands & Queries
  44. 44. Event Driven: Interact through Events, Don't talk to Services
  45. 45. Orders Service Bus Event Broadcast => lowest coupling Stream of events Do whatever I like J Very Independent
  46. 46. Kafka changes things a bit
  47. 47. Kafka: a Streaming Platform The Log ConnectorsConnectors Producer Consumer Streaming Engine
  48. 48. Service Backbone Scalable, Fault Tolerant, Concurrent, Strongly Ordered, Retentive
  49. 49. A place to keep the data-on-the-outside
  50. 50. Orders Customers Payments Stock Events are the shared dataset Cached Views Suffice LEAVE DATA IN KAFKA LONG TERM -> Services only need caches
  51. 51. Now add Stream Processing A machine for combining and processing streams of events
  52. 52. A Query Engine inside your service Join Filter Aggr- egate View Window
  53. 53. Kafka KTable Inbound Stream (Windowed) State Store Outbound Stream Changelog Stream DSL! Built in disk overflow – A maintained cache embedded inside your service Your Service Kafka
  54. 54. Window / Tables Cached on disk in your Service stream Compacted stream Join Stream Data Stream-Tabular Data Domain Logic Tables / Views Kafka Event Driven Service
  55. 55. Avg(o.time – p.time) From orders, payment, user Group by user.region over 1 day window emitting every second Streams & Tables In Streams & Tables Out Streams Stream Processing Engine Table Derived “Table”
  56. 56. Join & Process the outputs of many different services Email Service Legacy App Orders Payments Stock KSTREAMS Kafka
  57. 57. Two core types of event driven service
  58. 58. Type 1: Event Service PaymentsOrders Stock Event Driven Service Join outputs from many other services
  59. 59. Orders Service Kafka Interactive Queries API Context Boundary Type 2: Query Service
  60. 60. So we have shared storage in the Log, and a query engine layered on top
  61. 61. Data Storage + Query Engine == Database? Data lives here KAFKA MY SERVICE Query lives here
  62. 62. Customer Alteration Service MI Fraud Fulfillment Data lives here KAFKA UI View Queries processed in each service 1 x Data Storage + n x Query == Shared Database?
  63. 63. A Database, Inside Out Historical Streams Services embed a view Services embed a view Services embed a view Services embed a view
  64. 64. Microservices shouldn’t share a database
  65. 65. But this isn’t a normal database
  66. 66. Orders Service Stock Service Kafka Fulfilment Service UI Service Fraud Service Remember: Event Broadcast has the lowest coupling Stream of events Do whatever I like J
  67. 67. It decentralizes responsibility for query processing
  68. 68. Orders Service Stock Service Kafka Fulfilment Service UI Service Fraud Service Centralizing immutable data doesn’t affect coupling! Golden Copy Do whatever I like J
  69. 69. To share a database, turn it inside out!
  70. 70. Shared database Service Interfaces Log-Backed Streaming J! L! Event Broadcast Coupling (Autonomy) Accessibility Integrity (Divergence) L! L! J! J! K! J! J! J! J! L!L!
  71. 71. --------------------------- World Wide Web --------------------------- Payment Gateway FinanceFulfillment Web Query Web Command Start with an event-driven data layer, overlay SSP to (a) process events and (b) create views
  72. 72. So…
  73. 73. Good architectures have little to do with this:
  74. 74. It’s about how systems evolves over time
  75. 75. Hang simple services in a decoupled ecosystem
  76. 76. Ecosystem that embraces shared data Historical Streams
  77. 77. Break the ties of request driven approaches
  78. 78. Designing for DATA on the outside
  79. 79. Set your data free! Data should be available. Data should be under your control.
  80. 80. •  Broadcast events •  Retain them in the log •  Compose functions with a streaming engine •  Build views when you need to Event Driven Services
  81. 81. WIRED Principals •  Wimpy: Start simple, lightweight and fault tolerant. •  Immutable: Build a retentive, shared narrative. •  Reactive: Leverage Asynchronicity. Focus on the now. •  Evolutionary: Use only the data you need today. •  Decentralized: Receiver driven. Coordination avoiding. No God services
  82. 82. References •  Stopford: The Data Dichotomy: https://www.confluent.io/blog/data-dichotomy-rethinking-the-way-we-treat-data-and-services/ •  Kleppmann: Turning the Database Inside Out: https://www.confluent.io/blog/turning-the-database-inside-out-with-apache-samza/ •  Helland: Immutability Changes Everything: http://cidrdb.org/cidr2015/Papers/CIDR15_Paper16.pdf •  Kreps: The Log: https://engineering.linkedin.com/distributed-systems/log-what-every-software-engineer-should- know-about-real-time-datas-unifying •  Narkhede: Event Sourcing, CQRS & Stream Processing: https://www.confluent.io/blog/event-sourcing-cqrs-stream-processing-apache-kafka-whats- connection/
  83. 83. Twitter: @benstopford Blog of this talk at http://confluent.io/blog

×