Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

Putting the Micro into Microservices with Stateful Stream Processing

2,253 views

Published on

How small can a microservice be? This talk will look at how Stateful Stream Processing is used to build truly autonomous, often minuscule services. With the distributed guarantees of Exactly Once Processing, Event Driven Services supported by Apache Kafka become reliable, fast and nimble, blurring the line between business system and big data pipeline.

Published in: Software
  • Hello! Get Your Professional Job-Winning Resume Here - Check our website! https://vk.cc/818RFv
       Reply 
    Are you sure you want to  Yes  No
    Your message goes here

Putting the Micro into Microservices with Stateful Stream Processing

  1. 1. 1 Putting the ‘Micro’ into Microservices with Stateful Stream Processing Ben Stopford @benstopford
  2. 2. 2
  3. 3. 3 Last time: Event Driven Services •  Interact through events, don’t talk to services •  Evolve a Cascading Narrative •  Query views, built inside your bounded context. Related: -Event Collaboration -Service Choreography -Event storming
  4. 4. 4 The narrative should contain business facts It should read like: “a day in the life of your business”
  5. 5. 5 Base services on Event Sourcing Domain Service Kafka Local View of Customers Service journals state changes (Facts)
  6. 6. 6 Where stateful, reconstitute from log Domain Service Kafka Local View of Customers Reconstitute state
  7. 7. 7 Use the narrative to move away from legacy apps Domain Service Kafka Local View of Customers
  8. 8. 8 Retain the Narrative Kafka scales
  9. 9. 9 What is this talk about? Using SSP to build lightweight services that evolve the shared narrative 1.  What’s in the SSP toolbox? 2.  Patterns for Streaming Services 3.  Sewing together the Big & the Small 4.  The tricky parts
  10. 10. 10 Applying tools from stateful stream processing
  11. 11. 11 Aside: What is Stateful Stream Processing? A machine for converting a set of streams into something more useful. Output a stream or output a table
  12. 12. 12 Domain Service Kafka Local View of Customers View (1) Drive processing from Events (2) Build views that answer questions
  13. 13. 13 The tools
  14. 14. 14 Kafka Producer (1) Producer/Consumer APIs Consumer •  Simple & easy to understand •  Native in many languages •  Easy to integrate to existing services •  No joins, predicates, groupings or tables •  Includes transactions in Kafka-11.
  15. 15. 15 (2) KStreams DSL •  Joins (including windowing) •  Predicates •  Groupings •  Processor API Kafka KTable Inbound Stream (Windowed) State Store Outbound Stream Changelog Stream DSL
  16. 16. 16 (3) The KTable •  Recast a stream as a table via a defined key •  Tends to be backed by compacted topic. •  Materialized locally in RocksDB •  Provides guaranteed join semantics (i.e. no windowing concerns) Kafka KTable (in RocksDB) DSL JVM boundary
  17. 17. 17 The KTable •  Behaves like a regular stream but retains the latest value for each key, from offset 0 •  Partitions spread over all available instances •  Partition counts must match for joins •  Triggers processing •  Initializes lazily.
  18. 18. 18 (4) Global KTable •  All partitions replicated to all nodes •  Supports N-way join •  Blocks until initialized •  Doesn’t trigger processing (reference resource) Kafka Global KTable (in RocksDB) DSL
  19. 19. 19 Global-KTables vs. KTables P1,3 P6,8 P2,7 P4,5 P1,2, 3,4,5, 6,7,8 P1,2, 3,4,5, 6,7,8 P1,2, 3,4,5, 6,7,8 P1,2, 3,4,5, 6,7,8 KTable (Sharded) Global KTable (Replicated) Given a service with four instances:
  20. 20. 20 (5) State Store •  RocksDB instance •  Mutable •  Backed by Kafka •  Integrated into DSL •  Can be instantiated independently •  Queryable via Queryable state •  Useful for custom views or stateful services.Kafka State Store (RocksDB) Changelog Stream DSL
  21. 21. 21 (6) Hot Replicas -  By default KStreams keeps state backed up on a secondary node. -  Single machine failure doesn’t result in a re- hydration pause. => Highly available, stateful services. Service instance 1 Service instance 2
  22. 22. 22 (7) Transactions Two problems: 1.  Failures create duplicates •  Each message gets a unique ID •  Deduplicate on write 2.  You need to write, atomically, to 2 or more topics •  Chandy-Lamport Algorithm NB: Transactions will be in the next Kafka release
  23. 23. 23 Snapshot Marker Model Commit/Abort Begin Commit/Abort Begin Buffer* * The buffering is actually done in Kafka no the consumer Producer Consumer Topic A Topic B Elected Transaction Coordinator Consumer only sees committed messages
  24. 24. 24 Simple Example //Orders -> Stock Requests and Payments orders = ordersTopic.poll() payments = paymentsFor(orders) stockUpdates = stockUpdatesFor(orders) //Send & commit atomically producer.beginTransaction() producer.send(payments) producer.send(stockUpdates) producer.sendOffsetsToTransaction(offsets(orders)) producer.endTransaction()
  25. 25. 25 State-Stores & Transactions •  In KStreams Transactions span stream and changelog •  Processing of stream & state is atomic •  Example: counting orders. Kafka State Store Stream DSL Atomic Transaction Changelog Stream
  26. 26. 26 All these tools scale horizontally and avoid single points of failure Sources Kafka KafkaService Layer Service Layer Kafka
  27. 27. 27 Service patterns (building blocks)
  28. 28. 28 (1) Stateless Processor e.g. - Filter orders by region -  Convert to local domain model -  Simple rule
  29. 29. 29 (2) Stateless, Data Enabled Processor Similar to star schema •  Facts are stream •  Dimensions are GKTables e.g. Enrich Orders with Customer Info & Account details Stream GKTable GKTable Stream-Table Join
  30. 30. 30 (3) Gates e.g. rules engines or event correlation When Order & Payment then … Stream 1 Window Window Stream-Stream Join Stream 2
  31. 31. 31 (4) Stateful Processor e.g. - Average order amount by region - Posting Engine (reverse replace of previous position) State store backed up to Kafka
  32. 32. 32 (5) Stream-Aside Pattern State store backed up to Kafka public static void main(…){ … } JVM
  33. 33. 33 (7) Sidecar Pattern State store backed up to Kafka
  34. 34. 34 Combine them: 1.  Stateless 2.  Data enriched 3.  Gates 4.  Stateful processors 5.  Stream-aside 6.  Sidecar
  35. 35. 35 Query patterns covered in last talk
  36. 36. 36 Building ON the narrative The big and the small
  37. 37. 37 Evolutionary approach Services should: •  Fit in your head •  Should be quick to stand up & get going But sometimes bigger is the right answer
  38. 38. 38 Event Driven First •  Build on an event driven backbone •  Add Request-Driven interfaces where needed. •  Prefer intra-context REST Kafka
  39. 39. 39 Putting the Micro into Microservices - Combine simple services that develop the shared narrative - Start stateless, make stateful if necessary
  40. 40. 40 Contexts within Contexts Where services are tiny, they will typically be grouped together. •  Shared domain? •  Shared deployment? •  Shared streams? Several small services Single bounded context
  41. 41. 41 Layering within multi-service contexts Kafka Data layer: -Stream enrichment -View creation -Streams maintenance Business layer encapsulates logic, integrates with legacy code etc. Request-Driven layer provides REST / RPC / websotcket etc for UIs etc
  42. 42. 42 Kafka Different levels of data visibility within Kafka Kafka Contexts within Contexts Accounting Fulfillment Online Services
  43. 43. 43 Bringing it together
  44. 44. 44 Good architectures have little to do with this:
  45. 45. 45 It’s more about how systems evolves over time
  46. 46. 46 Sunk Cost vs Future Cost Architecture: Sunk Cost Change: Future Cost Balance
  47. 47. 47 Part 1: Data dichotomy Request-Driven Services don’t handle shared data well -  God service problem -  The data erosion problem Request-Driven Services struggle with complex flows -  the service entanglement problem -  the batch processing problem.
  48. 48. 48 Part 2: Building Event Driven Services Solution is to part 1: -  Event source facts into a Cascading Narrative -  Communication & State transfer converge -  Decentralize the role of turning facts into queryable views -  Keep these views as lightweight and disposable as possible.
  49. 49. 49 Putting the Micro in Microservices -  Start lightweight, stateless functions which compose and evolve -  Leverage the stateful stream processing toolkit to blend streams, state, views, transactions. -  Use layered contexts that segregate groups of streams and services.
  50. 50. 50 Highly scalable architecture
  51. 51. 51 All order logic, stateless service Orders-by-Customer view (KStreams + Queryable State API) UI Service Stream Maintenance & Monitoring Highly available, stateful service UI uses Orders-by- Customer View directly via Queryable State History Service pushes data to a local Elastic Search Instance Orders Service Derived View UI joins data Tables & Streams Fulfillment Service Analytics OrdersProduct Custo- mers KTables KTables KTables Schemas
  52. 52. 52 Tips & Tricky parts •  Reasoning about service choreography •  Debugging multi-stage topologies •  Monitoring/debug via the narrative •  Segregate facts from intermediary streams •  Retaining data comes at a cost •  Schema migration etc •  Accuracy over non trivial topologies •  Often requires transactions •  Transactions, causality & the flow of events •  Avoid side effects
  53. 53. 53 Event Driven Services Create a cascading narrative of events, then sew them together with stream processing.
  54. 54. 54 Discount code: kafcom17 ‪Use the Apache Kafka community discount code to get $50 off ‪www.kafka-summit.org Kafka Summit New York: May 8 Kafka Summit San Francisco: August 28 Presented by
  55. 55. 55 Thank You! @benstopford

×