Successfully reported this slideshow.

Devoxx London 2017 - Rethinking Services With Stateful Streams

12

Share

YouTube videos are no longer supported on SlideShare

View original on YouTube

Rethinking Services with
Stateful Streams
Ben Stopford
@benstopford
What are
microservices
really about?
1 of 85
1 of 85

Devoxx London 2017 - Rethinking Services With Stateful Streams

12

Share

Download to read offline

Microservices are one of those polarising concepts that technologists either love or hate. Splitting applications into autonomous units clearly has advantages, but larger service-based systems tend to struggle as the interactions between the services grow.

At the core of this sits a dichotomy: Data systems are designed to make data as accessible as possible. Services, on the other hand, actively encapsulate. These two forces inevitably compete in the architectures we build. By understanding this dichotomy we can better reason about how services should be sewn together. We strike a balance between the ability to adapt quickly and the loose coupling we need to retain autonomy, long term.

In this talk we'll examine how Stateful Stream Processing can be used to build Event Driven Services, using a distributed log like Apache Kafka. In doing so this Data-Dichotomy is balanced with an architecture that exhibits demonstrably better scaling properties, be it increased complexity, team size, data volume or velocity.

Microservices are one of those polarising concepts that technologists either love or hate. Splitting applications into autonomous units clearly has advantages, but larger service-based systems tend to struggle as the interactions between the services grow.

At the core of this sits a dichotomy: Data systems are designed to make data as accessible as possible. Services, on the other hand, actively encapsulate. These two forces inevitably compete in the architectures we build. By understanding this dichotomy we can better reason about how services should be sewn together. We strike a balance between the ability to adapt quickly and the loose coupling we need to retain autonomy, long term.

In this talk we'll examine how Stateful Stream Processing can be used to build Event Driven Services, using a distributed log like Apache Kafka. In doing so this Data-Dichotomy is balanced with an architecture that exhibits demonstrably better scaling properties, be it increased complexity, team size, data volume or velocity.

More Related Content

Related Books

Free with a 14 day trial from Scribd

See all

Devoxx London 2017 - Rethinking Services With Stateful Streams

  1. 1. Rethinking Services with Stateful Streams Ben Stopford @benstopford
  2. 2. What are microservices really about?
  3. 3. GUI UI Service Orders Service Returns Service Fulfilment Service Payment Service Stock Service Splitting the Monolith? Single Process /Code base /Deployment Many Processes /Code bases /Deployments
  4. 4. Orders Service Autonomy?
  5. 5. Orders Service Stock Service Email Service Fulfillment Service Independently Deployable Independently Deployable Independently Deployable Independently Deployable
  6. 6. Independence is where services get their value
  7. 7. Allows Scaling
  8. 8. Scaling in terms of people What happens when we grow?
  9. 9. Companies are inevitably a collection of applications They must work together to some degree
  10. 10. Interconnection is an afterthought FTP / Enterprise Messaging etc
  11. 11. Internal World External world External world External World External world Service Boundary Services Force Us To Consider The External World
  12. 12. Internal World External world External world External World External world Service Boundary External World is something we should Design For
  13. 13. Independence comes at a cost $$$
  14. 14. Orders Object Statement ObjectgetOpenOrders(), Less Surface Area = Less Coupling Encapsulate State Encapsulate Functionality Consider Two Objects in one address space Encapsulation => Loose Coupling
  15. 15. Change 1 Change 2 Redeploy Singly-deployable apps are easy
  16. 16. Orders Service Statement ServicegetOpenOrders(), Independently Deployable Independently Deployable Synchronized changes are painful
  17. 17. Services work best where requirements are isolated in a single bounded context
  18. 18. Single Sign On Business Serviceauthorise(), It’s unlikely that a Business Service would need the internal SSO state / function to change Single Sign On
  19. 19. SSO has a tightly bounded context
  20. 20. But business services are different
  21. 21. Catalog Authorisation Most business services share the same core datasets. Most services live in here
  22. 22. The futures of business services are far more tightly intertwined.
  23. 23. We need encapsulation to hide internal state. Be loosely coupled. But we need the freedom to slice & dice shared data like any other dataset
  24. 24. But data systems have little to do with encapsulation
  25. 25. Service Database Data on inside Data on outside Data on inside Data on outside Interface hides data Interface amplifies data Databases amplify the data they hold
  26. 26. The data dichotomy Data systems are about exposing data. Services are about hiding it.
  27. 27. Microservices shouldn’t share a database Good Advice!
  28. 28. So what do we do instead?
  29. 29. Service Interface Database Data on inside Data on outside We wrap a database in a service interface
  30. 30. One of two things happens next
  31. 31. Service Interface Database Either (1) we constantly add to the interface, as datasets grow getOpenOrders( fulfilled=false, deliveryLocation=CA, orderValue=100, operator=GreatherThan)
  32. 32. getOrder(Id) getOrder(UserId) getAllOpenOrders() getAllOrdersUnfulfilled(Id) getAllOrders() (1) Services can end up looking like kookie home-grown databases
  33. 33. ...and DATA amplifies this “Data-Service” problem
  34. 34. (2) Give up and move whole datasets en masse getAllOrders() Data Copied Internally Data Service Business Service
  35. 35. This leads to a different problem
  36. 36. Data diverges over time Orders Service Stock Service Email Service Fulfill- ment Service
  37. 37. The more mutable copies, the more data will diverge over time
  38. 38. Nice neat services Service contract too limiting Can we change both services together, easily? NO Broaden Contract Eek $$$ NO YES Is it a shared Database? Frack it! Just give me ALL the data Data diverges. (many different versions of the same facts) Lets encapsulate Lets centralise to one copy YES Start here Cycle of inadequacy:
  39. 39. These forces compete in the systems we build Divergence Accessibility Coupling
  40. 40. Is there a better way?
  41. 41. Two core patterns of system design Request Driven Event Driven Logic
  42. 42. => a set of very different architectures Event (widget ordered) Query (Get my orders) State change No state change Request driven Event Driven Logic Aside: Commands, Events & Queries Command (order widget)
  43. 43. Request Driven -> highest Coupling Commands & Queries
  44. 44. Event Driven: Interact through Events, Don't talk to Services
  45. 45. Orders Service Bus Event Broadcast => lowest coupling Stream of events Do whatever I like J Very Independent
  46. 46. Kafka changes things a bit
  47. 47. Kafka: a Streaming Platform The Log ConnectorsConnectors Producer Consumer Streaming Engine
  48. 48. Service Backbone Scalable, Fault Tolerant, Concurrent, Strongly Ordered, Retentive
  49. 49. A place to keep the data-on-the-outside
  50. 50. Orders Customers Payments Stock Events are the shared dataset Cached Views Suffice LEAVE DATA IN KAFKA LONG TERM -> Services only need caches
  51. 51. Now add Stream Processing A machine for combining and processing streams of events
  52. 52. A Query Engine inside your service Join Filter Aggr- egate View Window
  53. 53. Kafka KTable Inbound Stream (Windowed) State Store Outbound Stream Changelog Stream DSL! Built in disk overflow – A maintained cache embedded inside your service Your Service Kafka
  54. 54. Window / Tables Cached on disk in your Service stream Compacted stream Join Stream Data Stream-Tabular Data Domain Logic Tables / Views Kafka Event Driven Service
  55. 55. Avg(o.time – p.time) From orders, payment, user Group by user.region over 1 day window emitting every second Streams & Tables In Streams & Tables Out Streams Stream Processing Engine Table Derived “Table”
  56. 56. Join & Process the outputs of many different services Email Service Legacy App Orders Payments Stock KSTREAMS Kafka
  57. 57. Two core types of event driven service
  58. 58. Type 1: Event Service PaymentsOrders Stock Event Driven Service Join outputs from many other services
  59. 59. Orders Service Kafka Interactive Queries API Context Boundary Type 2: Query Service
  60. 60. So we have shared storage in the Log, and a query engine layered on top
  61. 61. Data Storage + Query Engine == Database? Data lives here KAFKA MY SERVICE Query lives here
  62. 62. Customer Alteration Service MI Fraud Fulfillment Data lives here KAFKA UI View Queries processed in each service 1 x Data Storage + n x Query == Shared Database?
  63. 63. A Database, Inside Out Historical Streams Services embed a view Services embed a view Services embed a view Services embed a view
  64. 64. Microservices shouldn’t share a database
  65. 65. But this isn’t a normal database
  66. 66. Orders Service Stock Service Kafka Fulfilment Service UI Service Fraud Service Remember: Event Broadcast has the lowest coupling Stream of events Do whatever I like J
  67. 67. It decentralizes responsibility for query processing
  68. 68. Orders Service Stock Service Kafka Fulfilment Service UI Service Fraud Service Centralizing immutable data doesn’t affect coupling! Golden Copy Do whatever I like J
  69. 69. To share a database, turn it inside out!
  70. 70. Shared database Service Interfaces Log-Backed Streaming J! L! Event Broadcast Coupling (Autonomy) Accessibility Integrity (Divergence) L! L! J! J! K! J! J! J! J! L!L!
  71. 71. --------------------------- World Wide Web --------------------------- Payment Gateway FinanceFulfillment Web Query Web Command Start with an event-driven data layer, overlay SSP to (a) process events and (b) create views
  72. 72. So…
  73. 73. Good architectures have little to do with this:
  74. 74. It’s about how systems evolves over time
  75. 75. Hang simple services in a decoupled ecosystem
  76. 76. Ecosystem that embraces shared data Historical Streams
  77. 77. Break the ties of request driven approaches
  78. 78. Designing for DATA on the outside
  79. 79. Set your data free! Data should be available. Data should be under your control.
  80. 80. •  Broadcast events •  Retain them in the log •  Compose functions with a streaming engine •  Build views when you need to Event Driven Services
  81. 81. WIRED Principals •  Wimpy: Start simple, lightweight and fault tolerant. •  Immutable: Build a retentive, shared narrative. •  Reactive: Leverage Asynchronicity. Focus on the now. •  Evolutionary: Use only the data you need today. •  Decentralized: Receiver driven. Coordination avoiding. No God services
  82. 82. References •  Stopford: The Data Dichotomy: https://www.confluent.io/blog/data-dichotomy-rethinking-the-way-we-treat-data-and-services/ •  Kleppmann: Turning the Database Inside Out: https://www.confluent.io/blog/turning-the-database-inside-out-with-apache-samza/ •  Helland: Immutability Changes Everything: http://cidrdb.org/cidr2015/Papers/CIDR15_Paper16.pdf •  Kreps: The Log: https://engineering.linkedin.com/distributed-systems/log-what-every-software-engineer-should- know-about-real-time-datas-unifying •  Narkhede: Event Sourcing, CQRS & Stream Processing: https://www.confluent.io/blog/event-sourcing-cqrs-stream-processing-apache-kafka-whats- connection/
  83. 83. Twitter: @benstopford Blog of this talk at http://confluent.io/blog

×