Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.
1
The Data Dichotomy:
Rethinking data and services
with streams
Ben Stopford
@benstopford
2
Build
Features
Build for
the Future
3
Evolution!
4
KAFKA
Serving
Layer
(Cassandra etc.)
Kafka Streams /
KSQL
Streaming Platforms
Data is embedded in
each engine
High Throu...
5
authorization_attempts possible_fraud
Streaming Example
6
CREATE STREAM possible_fraud AS
SELECT card_number, count(*)
FROM authorization_attempts
WINDOW TUMBLING (SIZE 5 MINUTE)...
7
CREATE STREAM possible_fraud AS
SELECT card_number, count(*)
FROM authorization_attempts
WINDOW TUMBLING (SIZE 5 MINUTE)...
8
CREATE STREAM possible_fraud AS
SELECT card_number, count(*)
FROM authorization_attempts
WINDOW TUMBLING (SIZE 5 MINUTE)...
9
CREATE STREAM possible_fraud AS
SELECT card_number, count(*)
FROM authorization_attempts
WINDOW TUMBLING (SIZE 5 MINUTE)...
10
CREATE STREAM possible_fraud AS
SELECT card_number, count(*)
FROM authorization_attempts
WINDOW TUMBLING (SIZE 5 MINUTE...
11
CREATE STREAM possible_fraud AS
SELECT card_number, count(*)
FROM authorization_attempts
WINDOW TUMBLING (SIZE 5 MINUTE...
12
Streaming == Manipulating Data in Flight
13
Business
Applications
14
EcosystemsApp
Increasingly we build ecosystems
15
SOA / Microservices / EDA
Customer
Service
Shipping
Service
16
The Problem is DATA
17
Most services share the same core facts.
Catalog
Most services live
in here
18
Orders
Service
Payments
Service
Customers
Service
Data becomes spread out and we need to
bring it together
Useful Grid
19
Service A Service B
Service C
One option is to share a database
20
Service A Service B
Service C
Databases provide a very rich form of coupling
21
Two different forces
compete in our designs
22
Single Sign On Business Serviceauthorise(),
We are taught to encapsulate
LOOSE COUPLING!
23
But data systems
have little to do
with encapsulation
24
Service Database
Data on
inside
Data on
outside
Data on
inside
Data on
outside
Interface
hides data
Interface
amplifies...
25
The data dichotomy
Data systems are about exposing data.
Services are about hiding it.
26
Microservices shouldn’t
share a database!
27
Tension
We want all the good stuff which comes
with a database.
We don’t want to share that database
with anyone else.
...
28
So how do we share data between
services?
Orders
Service
Shipping
Service
Customer
Service
Webserver
29
Buying an iPad (with REST)
Submit
Order
shipOrder() getCustomer()
Orders
Service
Shipping
Service
Customer
Service
Webs...
30
Buying an iPad with Events
Message Broker (Kafka)
Notification Data is
replicated
(incrementally)
Submit
Order
Order
Cr...
31
Events for Notification Only
Message Broker (Kafka)
Submit
Order
Order
Created
getCustomer()
REST
Notification
Orders
S...
32
Events for Data Locality
Customer
Updated
Submit
Order
Order
Created
Data is
replicated
Orders
Service
Shipping
Service...
33
Events have two hats
Notification Data
replication
34
Events are the key to scalable service
ecosystems
35
Streaming is the toolset for dealing with
events as they move!
36
Streaming Platform
The Log ConnectorsConnectors
Producer Consumer
Streaming Engine
37
Streaming Platform
The Log ConnectorsConnectors
Producer Consumer
Streaming Engine
38
What is a Distributed Log?
39
Shard on the way in
Producing
Services
Kafka
Consuming
Services
40
Each shard is a queue
Producing
Services
Kafka
Consuming
Services
41
Consumers share load
Producing
Services
Kafka
Consuming
Services
42
A log can Rewound and Replayed
Rewind & Replay
43
Compacted Log
(retains only latest version)
Version 3
Version 2
Version 1
Version 2
Version 1
Version 5
Version 4
Versi...
44
Streaming Platform
The Log ConnectorsConnectors
Producer Consumer
Streaming Engine
45
Kafka Connect
Kafka
Connect
Kafka
Connect
Kafka
46
Streaming Platform
The Log ConnectorsConnectors
Producer Consumer
Streaming Engine
47
A database engine for
data-in-flight
48
SELECT card_number, count(*)
FROM authorization_attempts
WINDOW (SIZE 5 MINUTE)
GROUP BY card_number
HAVING count(*) > ...
49
Features: similar to database
query engine
JoinFilter
Aggr-
egate
View
Window
50
Compacted
Topic
Join
Stream
Table
Kafka
Kafka Streams / KSQL
Topic
Join Streams and Tables
51
Handle Asynchronicity
In an asynchronous world, will the payment
come first, or the order?
KAFKA
Buffer 5 mins
Join by ...
52
Handle Asynchronicity
KAFKA
Buffer 5 mins
Join by Key
KStream orders = builder.stream(“Orders”);
KStream payments = bui...
53
KAFKA
Join
A KTable is just a stream with infinite
retention
54
A KTable is just a stream with infinite
retention
KStream orders = builder.stream(“Orders”);
KStream payments = builder...
55
KAFKA
Emailer
With KSQL and Node.js
Create stream ToEmail
From Orders, Payment,
Customer where …
56
Scales Out
57
Streaming is about
1. Processing data
incrementally
2. Moving data to where
it needs to be
processed (quickly
and effic...
58
Steps to Streaming Services
59
1. Take Responsibility for the past and
evolve
60
Stay Simple. Take Responsibility for the past
Browser
Webserver
61
Evolve Forwards
Browser
Webserver
Orders
Service
62
2. Raise events. Don’t talk to services.
63
Raise events. Don’t talk to services
Browser
Webserver
Orders
Service
64
KAFKA
Order
Requested
Order
Received
Browser
Webserver
Orders
Service
Raise events. Don’t talk to services
65
KAFKA
Order
Requested
Order
Validated
Order
Received
Browser
Webserver
Orders
Service
Raise events. Don’t talk to servi...
66
KAFKA
Order
Requested
Order
Validated
Order
Received
Browser
Webserver
Orders
Service
Use Kafka as a Backbone for Events
67
3. Use Connect (& CDC) to evolve away
from legacy
68
KAFKA
Order
Requested
Order
Validated
Order
Received
Browser
Webserver
Orders
Service
Evolve away from Legacy
69KAFKA
Order
Requested
Order
Validated
Order
Received
Browser
Webserver
Orders
Service
Use the Database as a ‘Seam’
Conne...
70
4. Make use of Schemas
71KAFKA
Order
Requested
Order
Validated
Order
Received
Browser
Webserver
Orders
Service
Schemas are your API
Connect
Produ...
72
5. Use the Single Writer Principal
73KAFKA
Order
Requested
Order
Validated
Order
Received
Browser
Webserver
Orders
Service
Apply the single writer principal
...
74
Orders
Service
Email
Service
T1 T2
T3
T4
REST
Service
T5
Single Writer Principal
75
Single Writer Principal
- Creates local consistency points in
the absence of Global Consistency
- Makes schema upgrades...
76
6. Store Datasets in the Log
77
Messaging that Remembers
Orders Customers
Payments
Stock
78
KAFKA
Order
Requested
Order
Validated
Order
Received
Browser
Webserver
Orders
Service
New Service, No Problem!
Connect
...
79
Orders Customers
Payments
Stock
Single, Shared Source of Truth
80
But how do you query a log?
81
7. Move Data to Code
82
83
Connect
Order
Requested
Order
Validated
Order
Completed
Order
Received
Products
Browser
Webserver
Schema Registry
Order...
84
Connect
Order
Requested
Order
Validated
Order
Completed
Order
Received
Products
Browser
Webserver
Schema Registry
Order...
85
Data Movement
Be realistic:
• Network is no longer the bottleneck
• Indexing is:
• In memory indexes help
• Keep datase...
86
8. Use the log as a ‘database’
87
Connect
Order
Requested
Order
Validated
Order
Completed
Order
Received
Products
Browser
Webserver
Schema Registry
Order...
88
Connect
Order
Requested
Order
Validated
Order
Completed
Order
Received
Products
Browser
Webserver
Schema Registry
Order...
89
Kafka has several features for reducing
the need to move data on startup
- Standby Replicas
- Disk Checkpoints
- Compac...
90
9. Use Transactions to tie All
Interactions Together
91
OrderRequested
(IPad)
2a. Order Validated
2c. Offset Commit
2b. IPad Reserved
Internal State:
Stock = 17
Reservations =...
92
Connect
TRANSACTION
Order
Requested
Order
Validated
Order
Completed
Order
Received
Products
Browser
Webserver
Schema Re...
93
10. Bridge the Sync/Async Divide with a
Streaming Ecosystem
94
POST
GET
Load
Balancer
ORDERSORDERS
OVTOPIC
Order
Validations
KAFKA
INVENTORY
Orders
Inventory
Fraud
Service
Order
Deta...
95
Orders Customers
Payments
Stock
Each service is optimized for autonomy
A Database Inside Out
HISTORICAL
EVENT STREAMS
96
Kafka
KAFKA
New York
Tokyo
London
Global / Disconnected Ecosystems
97
So…
98
Good architectures have little to do
with this:
99
It’s about how systems evolves over time
100
Request driven isn’t enough
• High coupling
• Hard to handle
async flows
• Hard to move and
join datasets.
101
Leverage the Duality of Events
Notification Data
replication
102
With a toolset built for data in flight
103
The data dichotomy
Data systems are about exposing data.
Services are about hiding it.
Remember the data dichotomy
104
The Data Dichotomy
We want all the good stuff which comes
with a database.
We don’t want to share that database
with a...
105
• Broadcast events
• Retain them in the log
• Compose streaming functions
• Recasting the event stream into
views when...
106
Services built on
a Streaming
Platform
107
Thank You
@benstopford
Blog Series: https://www.confluent.io/blog/tag/microservices/
Code: https://github.com/confluen...
You’ve finished this document.
Download and read it offline.
Upcoming SlideShare
What to Upload to SlideShare
Next
Upcoming SlideShare
What to Upload to SlideShare
Next
Download to read offline and view in fullscreen.

Share

NDC London 2017 - The Data Dichotomy- Rethinking Data and Services with Streams

Download to read offline

When building service-based systems, we don’t generally think too much about data. If we need data from another service, we ask for it. This pattern works well for whole swathes of use cases, particularly ones where datasets are small and requirements are simple. But real business services have to join and operate on datasets from many different sources. This can be slow and cumbersome in practice.

These problems stem from an underlying dichotomy. Data systems are built to make data as accessible as possible—a mindset that focuses on getting the job done. Services, instead, focus on encapsulation—a mindset that allows independence and autonomy as we evolve and grow. But these two forces inevitably compete in most serious service-based architectures.

Ben Stopford explains why understanding and accepting this dichotomy is an important part of designing service-based systems at any significant scale. Ben looks at how companies make use of a shared, immutable sequence of records to balance data that sits inside their services with data that is shared, an approach that allows the likes of Uber, Netflix, and LinkedIn to scale to millions of events per second.

Ben concludes by examining the potential of stream processors as a mechanism for joining significant, event-driven datasets across a whole host of services and explains why stream processing provides much of the benefits of data warehousing but without the same degree of centralization.

Related Books

Free with a 30 day trial from Scribd

See all

NDC London 2017 - The Data Dichotomy- Rethinking Data and Services with Streams

  1. 1. 1 The Data Dichotomy: Rethinking data and services with streams Ben Stopford @benstopford
  2. 2. 2 Build Features Build for the Future
  3. 3. 3 Evolution!
  4. 4. 4 KAFKA Serving Layer (Cassandra etc.) Kafka Streams / KSQL Streaming Platforms Data is embedded in each engine High Throughput Messaging Clustered Java App
  5. 5. 5 authorization_attempts possible_fraud Streaming Example
  6. 6. 6 CREATE STREAM possible_fraud AS SELECT card_number, count(*) FROM authorization_attempts WINDOW TUMBLING (SIZE 5 MINUTE) GROUP BY card_number HAVING count(*) > 3; authorization_attempts possible_fraud
  7. 7. 7 CREATE STREAM possible_fraud AS SELECT card_number, count(*) FROM authorization_attempts WINDOW TUMBLING (SIZE 5 MINUTE) GROUP BY card_number HAVING count(*) > 3; authorization_attempts possible_fraud
  8. 8. 8 CREATE STREAM possible_fraud AS SELECT card_number, count(*) FROM authorization_attempts WINDOW TUMBLING (SIZE 5 MINUTE) GROUP BY card_number HAVING count(*) > 3; authorization_attempts possible_fraud
  9. 9. 9 CREATE STREAM possible_fraud AS SELECT card_number, count(*) FROM authorization_attempts WINDOW TUMBLING (SIZE 5 MINUTE) GROUP BY card_number HAVING count(*) > 3; authorization_attempts possible_fraud
  10. 10. 10 CREATE STREAM possible_fraud AS SELECT card_number, count(*) FROM authorization_attempts WINDOW TUMBLING (SIZE 5 MINUTE) GROUP BY card_number HAVING count(*) > 3; authorization_attempts possible_fraud
  11. 11. 11 CREATE STREAM possible_fraud AS SELECT card_number, count(*) FROM authorization_attempts WINDOW TUMBLING (SIZE 5 MINUTE) GROUP BY card_number HAVING count(*) > 3; authorization_attempts possible_fraud
  12. 12. 12 Streaming == Manipulating Data in Flight
  13. 13. 13 Business Applications
  14. 14. 14 EcosystemsApp Increasingly we build ecosystems
  15. 15. 15 SOA / Microservices / EDA Customer Service Shipping Service
  16. 16. 16 The Problem is DATA
  17. 17. 17 Most services share the same core facts. Catalog Most services live in here
  18. 18. 18 Orders Service Payments Service Customers Service Data becomes spread out and we need to bring it together Useful Grid
  19. 19. 19 Service A Service B Service C One option is to share a database
  20. 20. 20 Service A Service B Service C Databases provide a very rich form of coupling
  21. 21. 21 Two different forces compete in our designs
  22. 22. 22 Single Sign On Business Serviceauthorise(), We are taught to encapsulate LOOSE COUPLING!
  23. 23. 23 But data systems have little to do with encapsulation
  24. 24. 24 Service Database Data on inside Data on outside Data on inside Data on outside Interface hides data Interface amplifies data Databases amplify the data they hold
  25. 25. 25 The data dichotomy Data systems are about exposing data. Services are about hiding it.
  26. 26. 26 Microservices shouldn’t share a database!
  27. 27. 27 Tension We want all the good stuff which comes with a database. We don’t want to share that database with anyone else. But we do want to share datasets in a sensible way.
  28. 28. 28 So how do we share data between services? Orders Service Shipping Service Customer Service Webserver
  29. 29. 29 Buying an iPad (with REST) Submit Order shipOrder() getCustomer() Orders Service Shipping Service Customer Service Webserver
  30. 30. 30 Buying an iPad with Events Message Broker (Kafka) Notification Data is replicated (incrementally) Submit Order Order Created Customer Updated Orders Service Shipping Service Customer Service Webserver KAFKA
  31. 31. 31 Events for Notification Only Message Broker (Kafka) Submit Order Order Created getCustomer() REST Notification Orders Service Shipping Service Customer Service Webserver KAFKA
  32. 32. 32 Events for Data Locality Customer Updated Submit Order Order Created Data is replicated Orders Service Shipping Service Customer Service Webserver KAFKA
  33. 33. 33 Events have two hats Notification Data replication
  34. 34. 34 Events are the key to scalable service ecosystems
  35. 35. 35 Streaming is the toolset for dealing with events as they move!
  36. 36. 36 Streaming Platform The Log ConnectorsConnectors Producer Consumer Streaming Engine
  37. 37. 37 Streaming Platform The Log ConnectorsConnectors Producer Consumer Streaming Engine
  38. 38. 38 What is a Distributed Log?
  39. 39. 39 Shard on the way in Producing Services Kafka Consuming Services
  40. 40. 40 Each shard is a queue Producing Services Kafka Consuming Services
  41. 41. 41 Consumers share load Producing Services Kafka Consuming Services
  42. 42. 42 A log can Rewound and Replayed Rewind & Replay
  43. 43. 43 Compacted Log (retains only latest version) Version 3 Version 2 Version 1 Version 2 Version 1 Version 5 Version 4 Version 3 Version 2 Version 1
  44. 44. 44 Streaming Platform The Log ConnectorsConnectors Producer Consumer Streaming Engine
  45. 45. 45 Kafka Connect Kafka Connect Kafka Connect Kafka
  46. 46. 46 Streaming Platform The Log ConnectorsConnectors Producer Consumer Streaming Engine
  47. 47. 47 A database engine for data-in-flight
  48. 48. 48 SELECT card_number, count(*) FROM authorization_attempts WINDOW (SIZE 5 MINUTE) GROUP BY card_number HAVING count(*) > 3; Continuously Running Queries
  49. 49. 49 Features: similar to database query engine JoinFilter Aggr- egate View Window
  50. 50. 50 Compacted Topic Join Stream Table Kafka Kafka Streams / KSQL Topic Join Streams and Tables
  51. 51. 51 Handle Asynchronicity In an asynchronous world, will the payment come first, or the order? KAFKA Buffer 5 mins Join by Key
  52. 52. 52 Handle Asynchronicity KAFKA Buffer 5 mins Join by Key KStream orders = builder.stream(“Orders”); KStream payments = builder.stream(“Payments”); orders.join(payments, KeyValue::new, JoinWindows.of(1 * MIN)) .peek((key, pair) -> emailer.sendMail(pair));
  53. 53. 53 KAFKA Join A KTable is just a stream with infinite retention
  54. 54. 54 A KTable is just a stream with infinite retention KStream orders = builder.stream(“Orders”); KStream payments = builder.stream(“Payments”); KTable customers = builder.table(“Customers”); orders.join(payments, EmailTuple::new, JoinWindows.of(1*MIN)) .join(customers, (tuple, cust) -> tuple.setCust(cust)) .peek((key, tuple) -> emailer.sendMail(tuple)); Materialize a table in two lines of code!
  55. 55. 55 KAFKA Emailer With KSQL and Node.js Create stream ToEmail From Orders, Payment, Customer where …
  56. 56. 56 Scales Out
  57. 57. 57 Streaming is about 1. Processing data incrementally 2. Moving data to where it needs to be processed (quickly and efficiently) On Notification Data Replication
  58. 58. 58 Steps to Streaming Services
  59. 59. 59 1. Take Responsibility for the past and evolve
  60. 60. 60 Stay Simple. Take Responsibility for the past Browser Webserver
  61. 61. 61 Evolve Forwards Browser Webserver Orders Service
  62. 62. 62 2. Raise events. Don’t talk to services.
  63. 63. 63 Raise events. Don’t talk to services Browser Webserver Orders Service
  64. 64. 64 KAFKA Order Requested Order Received Browser Webserver Orders Service Raise events. Don’t talk to services
  65. 65. 65 KAFKA Order Requested Order Validated Order Received Browser Webserver Orders Service Raise events. Don’t talk to services
  66. 66. 66 KAFKA Order Requested Order Validated Order Received Browser Webserver Orders Service Use Kafka as a Backbone for Events
  67. 67. 67 3. Use Connect (& CDC) to evolve away from legacy
  68. 68. 68 KAFKA Order Requested Order Validated Order Received Browser Webserver Orders Service Evolve away from Legacy
  69. 69. 69KAFKA Order Requested Order Validated Order Received Browser Webserver Orders Service Use the Database as a ‘Seam’ Connect Products
  70. 70. 70 4. Make use of Schemas
  71. 71. 71KAFKA Order Requested Order Validated Order Received Browser Webserver Orders Service Schemas are your API Connect Products Schema Registry
  72. 72. 72 5. Use the Single Writer Principal
  73. 73. 73KAFKA Order Requested Order Validated Order Received Browser Webserver Orders Service Apply the single writer principal Connect Products Schema Registry Order Completed
  74. 74. 74 Orders Service Email Service T1 T2 T3 T4 REST Service T5 Single Writer Principal
  75. 75. 75 Single Writer Principal - Creates local consistency points in the absence of Global Consistency - Makes schema upgrades easier to manage.
  76. 76. 76 6. Store Datasets in the Log
  77. 77. 77 Messaging that Remembers Orders Customers Payments Stock
  78. 78. 78 KAFKA Order Requested Order Validated Order Received Browser Webserver Orders Service New Service, No Problem! Connect Products Schema Registry Order Completed Repricing
  79. 79. 79 Orders Customers Payments Stock Single, Shared Source of Truth
  80. 80. 80 But how do you query a log?
  81. 81. 81 7. Move Data to Code
  82. 82. 82
  83. 83. 83 Connect Order Requested Order Validated Order Completed Order Received Products Browser Webserver Schema Registry Orders Service Stock Stock Materialize Stock ‘View’ Inside Service KAFKA
  84. 84. 84 Connect Order Requested Order Validated Order Completed Order Received Products Browser Webserver Schema Registry Orders Service Stock Stock Take only the data we need KAFKA
  85. 85. 85 Data Movement Be realistic: • Network is no longer the bottleneck • Indexing is: • In memory indexes help • Keep datasets focused
  86. 86. 86 8. Use the log as a ‘database’
  87. 87. 87 Connect Order Requested Order Validated Order Completed Order Received Products Browser Webserver Schema Registry Orders Service Reserved Stocks Stock Stock Reserved Stocks Apply Event Sourcing KAFKA Table
  88. 88. 88 Connect Order Requested Order Validated Order Completed Order Received Products Browser Webserver Schema Registry Orders Service Reserved Stocks Stock Stock Reserved Stocks Order Service Loads Reserved Stocks on Startup KAFKA
  89. 89. 89 Kafka has several features for reducing the need to move data on startup - Standby Replicas - Disk Checkpoints - Compacted topics
  90. 90. 90 9. Use Transactions to tie All Interactions Together
  91. 91. 91 OrderRequested (IPad) 2a. Order Validated 2c. Offset Commit 2b. IPad Reserved Internal State: Stock = 17 Reservations = 2 Tie Events & State with Transactions
  92. 92. 92 Connect TRANSACTION Order Requested Order Validated Order Completed Order Received Products Browser Webserver Schema Registry Orders Service Reserved Stocks Stock Stock Reserved Stocks Transactions KAFKA
  93. 93. 93 10. Bridge the Sync/Async Divide with a Streaming Ecosystem
  94. 94. 94 POST GET Load Balancer ORDERSORDERS OVTOPIC Order Validations KAFKA INVENTORY Orders Inventory Fraud Service Order Details Service Inventory Service (see previous figure) Order Created Order Validated Orders View Q in CQRS Orders Service C is CQRS Services in the Micro: Orders Service Find the code online!
  95. 95. 95 Orders Customers Payments Stock Each service is optimized for autonomy A Database Inside Out HISTORICAL EVENT STREAMS
  96. 96. 96 Kafka KAFKA New York Tokyo London Global / Disconnected Ecosystems
  97. 97. 97 So…
  98. 98. 98 Good architectures have little to do with this:
  99. 99. 99 It’s about how systems evolves over time
  100. 100. 100 Request driven isn’t enough • High coupling • Hard to handle async flows • Hard to move and join datasets.
  101. 101. 101 Leverage the Duality of Events Notification Data replication
  102. 102. 102 With a toolset built for data in flight
  103. 103. 103 The data dichotomy Data systems are about exposing data. Services are about hiding it. Remember the data dichotomy
  104. 104. 104 The Data Dichotomy We want all the good stuff which comes with a database. We don’t want to share that database with anyone else. But we do want to share datasets in a sensible way.
  105. 105. 105 • Broadcast events • Retain them in the log • Compose streaming functions • Recasting the event stream into views when you need to query. Event Driven Services
  106. 106. 106 Services built on a Streaming Platform
  107. 107. 107 Thank You @benstopford Blog Series: https://www.confluent.io/blog/tag/microservices/ Code: https://github.com/confluentinc/kafka-streams-examples
  • ZanaPekmez

    Oct. 3, 2021
  • bokas

    Jun. 27, 2018

When building service-based systems, we don’t generally think too much about data. If we need data from another service, we ask for it. This pattern works well for whole swathes of use cases, particularly ones where datasets are small and requirements are simple. But real business services have to join and operate on datasets from many different sources. This can be slow and cumbersome in practice. These problems stem from an underlying dichotomy. Data systems are built to make data as accessible as possible—a mindset that focuses on getting the job done. Services, instead, focus on encapsulation—a mindset that allows independence and autonomy as we evolve and grow. But these two forces inevitably compete in most serious service-based architectures. Ben Stopford explains why understanding and accepting this dichotomy is an important part of designing service-based systems at any significant scale. Ben looks at how companies make use of a shared, immutable sequence of records to balance data that sits inside their services with data that is shared, an approach that allows the likes of Uber, Netflix, and LinkedIn to scale to millions of events per second. Ben concludes by examining the potential of stream processors as a mechanism for joining significant, event-driven datasets across a whole host of services and explains why stream processing provides much of the benefits of data warehousing but without the same degree of centralization.

Views

Total views

4,284

On Slideshare

0

From embeds

0

Number of embeds

3,496

Actions

Downloads

30

Shares

0

Comments

0

Likes

2

×