Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.
Microservices in a
Streaming World
There are many good reasons for
building service-based systems
• Loose Coupling
• Bounded Contexts
• Autonomy
• Ease of sc...
But when we do,
we’re building a distributed system
This can be a bit tricky
Monolithic & Centralised
Approaches
Shared, mutable state
Decentralisation
Stream Processing is
a bit different
batch analytics => real time => at scale => accurately
and comes with an
interesting toolset
Stream Processing
Toolset
Business Applications
Some fundamental
patterns of distributed
systems
Request / Response
Mediator / Workflow
Request/Response
Event Driven
Async / Fire and Forget
Event Based
• Simple
• Synchronous
• Event Driven
• Good decoupling
• Requires Broker
• Fire & Forget
• Polling
• Full dec...
SOA / Microservices
Message Broker
Event Based Request/Response
Combinations
Event-
Based
Request/
Response
Combinations
Withdraw 

£100
Account

Service
General

Ledger
Customer

Statements
Fraud

Detection
Check 

Funds
Async Me...
Services generally eschew
shared, mutable state
How do we put these
things together?
Request/Response
Request/Response
Request
Response
ReST
Request/Response
+ Registry
Registry
Request
Response
ReST
Asynchronous and
Event-Based
Communication
Queues
Point to Point
Service A Service B
Load Balancing
Instance 2
Instance 1
Single message allocation has scalability issues
Batched Allocation
Instance 1
Instance 2
Throughput!
Lose Ordering Guarantees
Fail!
Instance 1
Instance 2
Topics
Topics are Broadcast
Consumer
Consumer
Broker broadcast
Topics Retain Ordering
Trades
Buys
Sells
Broker Instance 1
Instance 2
Even when services fail
Trades
Buys
Sells
Fail!
Broker
We retain ordering, but we have to detect & reprovision
Instance 1
...
A Few Implications
Queues Lose Ordering
Guarantees at Scale
Fail!
Worker 1
Worker 2
Trades
Buys
Sells
Topics don’t provide
availability
Broker
Trades
Buys
Sells
Messages are Transient
Broker
Is there another way?
A Distributed Log
Kafka is one example
Think back to the queue
example
Batch
Batch
Shard on the way in
Each shard is a queue
Strong Ordering (in shard). Good concurrency.
Each consuming service is assigned
a “personal set” of queues
each little queue is sent to only one service in a group
Services instances naturally
rebalance on failure
Service instance dies, data is redirected,
ordering guarantees remain
Very Scalable, Very High
Throughput
Sharded In, Sharded Out
Reduces to a globally
ordered queue
Fault Tolerance
The Log
Single seek & scan
Append
only
messages don’t need to be transient!
Cleaning the Log
Delete old segments
Cleaning the Log
Delete old versions that share the same key
K1
K1
K1
K2
K2
K2
K1
V1
V1
V2
V3
V2
V4
V3
• Scalable multiprocessing
• Strong partition-based ordering
• Efficient data retention
• Always on
So how is this useful
for microservices?
Build ‘Always On’ Services
Rely on Fault
Tolerant Broker
Load Balance Services
Load Balance Services
(with strong ordering)
Fault Tolerant Services
Services automatically
fail over
(retaining ordering)
Services can return back to
old messages in the log
Rewind & Replay
Compacted Topics are
Interesting
K1
K1
K1
K2
K2
K2
K1
V1
V1
V2
V3
V2
V4
V3
Lets take a little
example
Getting Exchange Rates
Exchange

Rate

Service
USD/GBP = 0.71
EUR/GBP = 0.77
USD/INR = 67.7
USD/AUD = 1.38
EUR/JPY = 114.4...
Option1: Request Response
rate for USD/GBP?
0.71
Exchange

Rate

Service
I 

need 

exchange 

rates!
Option 2: Publish Subscribe
Exchange

Rate

Service
Accumulate current state
ETL
I

need 

exchange 

rates!
Option 3: Accumulate in
Compacted Stream
Exchange

Rate

Service
Get all exchange
rates
Publish to clients
USD/GBP = 0.71
...
Is it a stream or is it a table?
transitory stateful
Datasets can live in the broker!
trades books
risk results
ex-

rates
Service Backbone
Scalable, Fault Tolerant, Concurrent, Strongly Ordered, Stateful
… lets add in stream
processing
Max(price)

From orders

where ccy=‘GBP’

over 1 day window

emitting every second
What is stream processing?
Continuous Q...
What is stream processing
engine?
Data
Index
Query

Engine
Query

Engine
vs
Database
Finite, well defined source
Stream Pro...
Windowing
For unordered or unpredictable streams
Sliding

Fixed

(tumbling)
Features: similar to
database query engine
JoinFilter
Aggr-

egate
View
Window
KStreams & KTables
stream
Compacted

stream
Join
Streaming Data
Stored Data
KStream
KTable
A little example…
Buying Lunch Abroad
Payments

Service
Exchange

Rates

Service
Buy
Notification 

Service

Amount in ££
$$
$$
Text Message:...
Request-Response Option
Payments

Service
Exchange

Rates

Service
Buy
Amount in ££
Join etc
Text Message: ££
Iterative jo...
ETL Option
Payments

Service
Exchange

Rates

Service
Buy
Amount in ££
ETL
ETL
Join etc
Text Message: ££
Stream Processor Option
Payments

Service
Exchange

Rates

Service
Buy
Stream

Processor
join

etc
Text Message: ££
Buying Lunch Abroad
Payments
Exchange

Rates
Looks like

a table

(compacted

stream)
Looks like 

an infinite 

stream
KSt...
Buying Lunch Abroad
Payments
Exchange

Rates
• Filter(ccy<>’GBP’)
• Join on ccy
• Calculate GBP
• Send text message
buffer...
Local DB (fast joins)
Topic
Compacted

Topic
KStream
pre-populate
KTables can also be written to
- they’re backed by the broker
Manage
intermediary
state
KStream
KTable
Topic
Compacted

To...
Scales Out (MPP)
These tools are pretty
handy
for managing decentralised services
Talk our own data model
Data

Stream
View
Query
Handle Unpredictability
9am 5pm
Late trades
Joining Services
Payments
Exchange

Rates
Join
Duality between Stream and
Table
Join
KStream
KTable
More Complex Use Cases
Trades Valuations
Books Customers
General
Ledger
trades books
risk results
ex-

rates
Practical mechanism for managing data
intensive, loosely coupled services
• Stateful ...
There is much more to
stream processing
it is grounded in the world of big-data analytics
Simple Approaches
Just a library (over Kafka)
Keeping Services
Consistent
Big Global Bag of 

State in the Sky
Problem: No BGBSS
How to you provide the
accuracy of this
In this?
Centralised vs Federated
Centralised
consistency model
Distributed
consistency model
One problem is failure
Duplicate messages are
inevitable
have I seen
this before?
Make Services Idempotent
try 1
try 2
try 3
try 4
Stream processors have to
solve this problem
Exactly Once
not available in Kafka… yet
So what do we have?
Use Both Approaches
Event-
Based
Request/
Response
Queued Delivery System
Ordered queue
Scales Horizontally
Scales Horizontally
Scales Horizontally
Scales Horizontally
Built In Fault Tolerance
Runs Always On
For Services Too
Scales Horizontally
Load Balance
continue through failure
Scales Horizontally
with history stored in the Log
Scales Horizontally
Extending to any number of
services
Scales Horizontally
With any data throughput
Scales Horizontally
With any data throughput
Scales Horizontally
With any data throughput
Scales Horizontally
powerful tools
for slicing and
dicing streams
Scales Horizontally
the declarative
processing of
data
join
filter
aggregate
at any
throughput
Scales Horizontally
leveraging
fast local
persistence
Scales Horizontally
backed up to
the log
Scales Horizontally
easily join
streaming
services
Blend
KStreams
and KTables
trades books
risk results
ex-

rates
with data living
in the stream
but retaining
loose coupling
trades books
risk results
ex-

rates
Scales Horizontally
with strong ordering and
repeatability guarantees
(eventually)
so…
Microservices push us away
from shared, mutable state
Big Global Bag of 

State in the Sky
Away from BGBSS’s
This means data is
increasingly remote
Sure, you can collect it all
copy copy
copy
copy
copy
copy
copy
ETL
ETL
ETL ETL
ETL
ETL
can be a lot of work
Or you can look it all up
get
get
get
get
get
get
get
get, get, 

get, get
but that doesn’t scale well
(with system complexity or with data throughput)
Better to embrace
decentralistion
We need a decentralised
toolset to do this
trades books
risk results
ex-

rates
Keep it simple,
Keep it moving
Microservices for a Streaming World
Upcoming SlideShare
Loading in …5
×

Microservices for a Streaming World

9,575 views

Published on

The world of Microservices is a little different to standard service oriented architectures. They play to an opinionated score of decentralisation, isolation and automation. Stream processing comes from a different angle though. One where analytic function is melded to a firehose of in-flight events. Yet business applications increasingly need to be data intensive, nimble and fast to market. This isn’t as easy as it sounds.

This talk will look at the implications of mixing toolsets from the stream processing world into realtime business applications. How to effectively handle infinite streams, how to leverage a high throughput, persistent Log and deploy dynamic, fault tolerant, streaming services. By mixing these disparate domains we’ll see how we can build application architectures that can withstand the full force and immediacy of the data generation.

Published in: Software
  • Be the first to comment

Microservices for a Streaming World

  1. 1. Microservices in a Streaming World
  2. 2. There are many good reasons for building service-based systems • Loose Coupling • Bounded Contexts • Autonomy • Ease of scaling • Composability
  3. 3. But when we do, we’re building a distributed system
  4. 4. This can be a bit tricky
  5. 5. Monolithic & Centralised Approaches Shared, mutable state
  6. 6. Decentralisation
  7. 7. Stream Processing is a bit different batch analytics => real time => at scale => accurately
  8. 8. and comes with an interesting toolset
  9. 9. Stream Processing Toolset Business Applications
  10. 10. Some fundamental patterns of distributed systems
  11. 11. Request / Response
  12. 12. Mediator / Workflow Request/Response
  13. 13. Event Driven Async / Fire and Forget
  14. 14. Event Based • Simple • Synchronous • Event Driven • Good decoupling • Requires Broker • Fire & Forget • Polling • Full decoupling Request/Response vs.
  15. 15. SOA / Microservices Message Broker Event Based Request/Response
  16. 16. Combinations Event- Based Request/ Response
  17. 17. Combinations Withdraw £100 Account Service General Ledger Customer Statements Fraud Detection Check Funds Async Message Broker I need moneyReST
  18. 18. Services generally eschew shared, mutable state
  19. 19. How do we put these things together?
  20. 20. Request/Response
  21. 21. Request/Response Request Response ReST
  22. 22. Request/Response + Registry Registry Request Response ReST
  23. 23. Asynchronous and Event-Based Communication
  24. 24. Queues
  25. 25. Point to Point Service A Service B
  26. 26. Load Balancing Instance 2 Instance 1 Single message allocation has scalability issues
  27. 27. Batched Allocation Instance 1 Instance 2 Throughput!
  28. 28. Lose Ordering Guarantees Fail! Instance 1 Instance 2
  29. 29. Topics
  30. 30. Topics are Broadcast Consumer Consumer Broker broadcast
  31. 31. Topics Retain Ordering Trades Buys Sells Broker Instance 1 Instance 2
  32. 32. Even when services fail Trades Buys Sells Fail! Broker We retain ordering, but we have to detect & reprovision Instance 1 Instance 2
  33. 33. A Few Implications
  34. 34. Queues Lose Ordering Guarantees at Scale Fail! Worker 1 Worker 2
  35. 35. Trades Buys Sells Topics don’t provide availability Broker
  36. 36. Trades Buys Sells Messages are Transient Broker
  37. 37. Is there another way?
  38. 38. A Distributed Log Kafka is one example
  39. 39. Think back to the queue example Batch Batch
  40. 40. Shard on the way in
  41. 41. Each shard is a queue Strong Ordering (in shard). Good concurrency.
  42. 42. Each consuming service is assigned a “personal set” of queues each little queue is sent to only one service in a group
  43. 43. Services instances naturally rebalance on failure Service instance dies, data is redirected, ordering guarantees remain
  44. 44. Very Scalable, Very High Throughput Sharded In, Sharded Out
  45. 45. Reduces to a globally ordered queue
  46. 46. Fault Tolerance
  47. 47. The Log Single seek & scan Append only messages don’t need to be transient!
  48. 48. Cleaning the Log Delete old segments
  49. 49. Cleaning the Log Delete old versions that share the same key K1 K1 K1 K2 K2 K2 K1 V1 V1 V2 V3 V2 V4 V3
  50. 50. • Scalable multiprocessing • Strong partition-based ordering • Efficient data retention • Always on
  51. 51. So how is this useful for microservices?
  52. 52. Build ‘Always On’ Services Rely on Fault Tolerant Broker
  53. 53. Load Balance Services Load Balance Services (with strong ordering)
  54. 54. Fault Tolerant Services Services automatically fail over (retaining ordering)
  55. 55. Services can return back to old messages in the log Rewind & Replay
  56. 56. Compacted Topics are Interesting K1 K1 K1 K2 K2 K2 K1 V1 V1 V2 V3 V2 V4 V3
  57. 57. Lets take a little example
  58. 58. Getting Exchange Rates Exchange Rate Service USD/GBP = 0.71 EUR/GBP = 0.77 USD/INR = 67.7 USD/AUD = 1.38 EUR/JPY = 114.41 … I need exchange rates!
  59. 59. Option1: Request Response rate for USD/GBP? 0.71 Exchange Rate Service I need exchange rates!
  60. 60. Option 2: Publish Subscribe Exchange Rate Service Accumulate current state ETL I need exchange rates!
  61. 61. Option 3: Accumulate in Compacted Stream Exchange Rate Service Get all exchange rates Publish to clients USD/GBP = 0.71 EUR/GBP = 0.77 USD/INR = 67.7 USD/AUD = 1.38 EUR/JPY = 114.41 … Broker retains latest versions Publish all rate events
  62. 62. Is it a stream or is it a table? transitory stateful
  63. 63. Datasets can live in the broker! trades books risk results ex- rates
  64. 64. Service Backbone Scalable, Fault Tolerant, Concurrent, Strongly Ordered, Stateful
  65. 65. … lets add in stream processing
  66. 66. Max(price) From orders where ccy=‘GBP’ over 1 day window emitting every second What is stream processing? Continuous Queries.
  67. 67. What is stream processing engine? Data Index Query Engine Query Engine vs Database Finite, well defined source Stream Processor Infinite, poorly defined source
  68. 68. Windowing For unordered or unpredictable streams Sliding Fixed (tumbling)
  69. 69. Features: similar to database query engine JoinFilter Aggr- egate View Window
  70. 70. KStreams & KTables stream Compacted stream Join Streaming Data Stored Data KStream KTable
  71. 71. A little example…
  72. 72. Buying Lunch Abroad Payments Service Exchange Rates Service Buy Notification Service Amount in ££ $$ $$ Text Message: ££ $$
  73. 73. Request-Response Option Payments Service Exchange Rates Service Buy Amount in ££ Join etc Text Message: ££ Iterative join over the network
  74. 74. ETL Option Payments Service Exchange Rates Service Buy Amount in ££ ETL ETL Join etc Text Message: ££
  75. 75. Stream Processor Option Payments Service Exchange Rates Service Buy Stream Processor join etc Text Message: ££
  76. 76. Buying Lunch Abroad Payments Exchange Rates Looks like a table (compacted stream) Looks like an infinite stream KStream KTable
  77. 77. Buying Lunch Abroad Payments Exchange Rates • Filter(ccy<>’GBP’) • Join on ccy • Calculate GBP • Send text message buffering
  78. 78. Local DB (fast joins) Topic Compacted Topic KStream pre-populate
  79. 79. KTables can also be written to - they’re backed by the broker Manage intermediary state KStream KTable Topic Compacted Topic
  80. 80. Scales Out (MPP)
  81. 81. These tools are pretty handy for managing decentralised services
  82. 82. Talk our own data model Data Stream View Query
  83. 83. Handle Unpredictability 9am 5pm Late trades
  84. 84. Joining Services Payments Exchange Rates Join
  85. 85. Duality between Stream and Table Join KStream KTable
  86. 86. More Complex Use Cases Trades Valuations Books Customers General Ledger
  87. 87. trades books risk results ex- rates Practical mechanism for managing data intensive, loosely coupled services • Stateful streams live inside the Log • Data extracted quickly! • Fast, local joins, over large datasets • HA pre-caching • Manage intermediary state • Just a simple library (over Kafka)
  88. 88. There is much more to stream processing it is grounded in the world of big-data analytics
  89. 89. Simple Approaches Just a library (over Kafka)
  90. 90. Keeping Services Consistent
  91. 91. Big Global Bag of State in the Sky Problem: No BGBSS
  92. 92. How to you provide the accuracy of this
  93. 93. In this?
  94. 94. Centralised vs Federated Centralised consistency model Distributed consistency model
  95. 95. One problem is failure
  96. 96. Duplicate messages are inevitable have I seen this before?
  97. 97. Make Services Idempotent try 1 try 2 try 3 try 4
  98. 98. Stream processors have to solve this problem
  99. 99. Exactly Once not available in Kafka… yet
  100. 100. So what do we have?
  101. 101. Use Both Approaches Event- Based Request/ Response
  102. 102. Queued Delivery System Ordered queue
  103. 103. Scales Horizontally
  104. 104. Scales Horizontally
  105. 105. Scales Horizontally
  106. 106. Scales Horizontally
  107. 107. Built In Fault Tolerance
  108. 108. Runs Always On
  109. 109. For Services Too
  110. 110. Scales Horizontally Load Balance
  111. 111. continue through failure
  112. 112. Scales Horizontally with history stored in the Log
  113. 113. Scales Horizontally Extending to any number of services
  114. 114. Scales Horizontally With any data throughput
  115. 115. Scales Horizontally With any data throughput
  116. 116. Scales Horizontally With any data throughput
  117. 117. Scales Horizontally powerful tools for slicing and dicing streams
  118. 118. Scales Horizontally the declarative processing of data join filter aggregate
  119. 119. at any throughput
  120. 120. Scales Horizontally leveraging fast local persistence
  121. 121. Scales Horizontally backed up to the log
  122. 122. Scales Horizontally easily join streaming services
  123. 123. Blend KStreams and KTables
  124. 124. trades books risk results ex- rates with data living in the stream
  125. 125. but retaining loose coupling trades books risk results ex- rates
  126. 126. Scales Horizontally with strong ordering and repeatability guarantees (eventually)
  127. 127. so…
  128. 128. Microservices push us away from shared, mutable state
  129. 129. Big Global Bag of State in the Sky Away from BGBSS’s
  130. 130. This means data is increasingly remote
  131. 131. Sure, you can collect it all copy copy copy copy copy copy copy ETL ETL ETL ETL ETL ETL
  132. 132. can be a lot of work
  133. 133. Or you can look it all up get get get get get get get get, get, get, get
  134. 134. but that doesn’t scale well (with system complexity or with data throughput)
  135. 135. Better to embrace decentralistion
  136. 136. We need a decentralised toolset to do this trades books risk results ex- rates
  137. 137. Keep it simple, Keep it moving

×