The art of the event streaming application: streams, stream processors and scale (Neil Avery, Confluent) Kafka Summit SF 2019

The art of the event streaming application.
streams, stream processors and scale
Neil Avery,
Office of the CTO,
@avery_neil

“We believe that the major
contributor to this complexity in
many systems is the handling of
state and the burden that this adds
when trying to analyse and reason
about the system.”
Out of the tar pit, 2006

We like ‘all the things’

What are microservices?
Microservices are a software development
technique - a variant of the service-oriented
architecture (SOA) architectural style that
structures an application as a collection of
loosely coupled services.
https://en.wikipedia.org/wiki/Microservices

...structures an application as a collection of
loosely coupled services.
This is new!

Handling state is hard
Cache?
Embedded?
Route to right instance?

Shifts in responsibility, redundancy

What have we learned about microservices?
● Scaling is hard
● Handling state is hard
● Sharing, coordinating is hard
● Run a database in each microservice - is hard

Microservices
We had it all wrong (again)
FIX: Make them asynchronous and use
streams of events
We didn’t really
know what they
were anyway

Event driven architectures
aren’t new
..but the world has changed

New technology, requirements and expectations

Events
FACT!
SOMETHING
HAPPENED!

An Event
records the fact that something happened
● A good was sold
● An invoice was issued
● A payment was made
● A new customer registered

Events
Why do you care?
Loose coupling, autonomy, evolvability, scalability, resilience, traceability, replayability
...more importantly...
EVENT-FIRST CHANGES HOW YOU THINK ABOUT WHAT YOU ARE BUILDING

Store events in
..a stream..

Different types of event models
● Change Data Capture - CDC (database txn log)
● Time series (IoT, metrics)
● Microservices (domain events)

Time travel
● User experience?
● How many users affected?
● Has it happened before?
Ask many questions of the same data, again and again
[Diagram: the same events laid out along a time axis]
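
Asking many questions of the same retained data can be sketched in plain Java. This is a minimal model under stated assumptions, not Kafka API: the `UserEvent` record, the `ReplaySketch` class, and the sample values are hypothetical, and an in-memory list stands in for a retained stream.

```java
import java.util.List;
import java.util.function.Predicate;

// Hypothetical event shape for illustration; not taken from the talk.
record UserEvent(long timestampMs, String userId, String type) {}

final class ReplaySketch {
    // Replays the same immutable log from the start and counts events that match a question.
    static long count(List<UserEvent> log, Predicate<UserEvent> question) {
        return log.stream().filter(question).count();
    }

    // Counts the distinct users that produced events of the given type.
    static long distinctUsers(List<UserEvent> log, String type) {
        return log.stream()
                .filter(e -> e.type().equals(type))
                .map(UserEvent::userId)
                .distinct()
                .count();
    }

    public static void main(String[] args) {
        List<UserEvent> log = List.of(
                new UserEvent(1_000L, "u1", "error"),
                new UserEvent(2_000L, "u2", "click"),
                new UserEvent(3_000L, "u1", "error"),
                new UserEvent(4_000L, "u3", "error"));
        // How many users were affected?
        System.out.println(distinctUsers(log, "error"));
        // Has it happened before t = 3500 ms?
        System.out.println(count(log, e -> e.type().equals("error") && e.timestampMs() < 3_500L) > 0);
    }
}
```

Because the log is never mutated, a new question added later can be answered by replaying from the beginning without touching the producers.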

Evolvability
● User experience?
● How many users affected?
● Has it happened before?
[Diagram labels: new, old]
Supports data change, logic change, logic extension, schema evolution, loose coupling, add processors, A/B path

OLD: event-driven architectures
NEW: event-streaming architectures
Organise events as
streams

Kafka is a database for events
[Diagram: a Kafka cluster with stream processing on top: Kafka Streams, KSQL, Producer/Consumer]
Partitions give you horizontal scale
{
user: 100
type: bid
item: 389
cat: bikes/mtb
region: dc-east
}
[Diagram: topic /bikes/ keyed by item-id; hashing the key maps the key space onto partitions; partition assignment gives each consumer (stream processor) its share of the partitions]
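
How a key selects a partition can be sketched as follows. This is an illustration under assumptions, not Kafka's implementation: Kafka's default partitioner applies a murmur2 hash to the serialized key, whereas this sketch uses `Arrays.hashCode` as a stand-in, so the partition numbers will differ from a real cluster. `PartitionSketch` and `partitionFor` are hypothetical names.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class PartitionSketch {
    // Maps a key to a partition index in [0, numPartitions).
    // Stand-in hash: Kafka itself uses murmur2 over the serialized key bytes.
    static int partitionFor(String key, int numPartitions) {
        byte[] bytes = key.getBytes(StandardCharsets.UTF_8);
        return Math.floorMod(Arrays.hashCode(bytes), numPartitions);
    }

    public static void main(String[] args) {
        // Events for the same item-id always land on the same partition,
        // so one stream processor instance sees all bids for that item in order.
        for (String itemId : new String[] {"389", "390", "389"}) {
            System.out.println(itemId + " -> partition " + partitionFor(itemId, 3));
        }
    }
}
```

The property that matters for scale is determinism: the same key always maps to the same partition, so adding consumers spreads partitions without splitting one key's events across instances.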

Stream processing
input events -> Kafka Streams processor -> output events
...temporal reasoning...
event-driven microservice

It’s pretty powerful
[Diagram: Topic: click-stream split into three partitions, each read by a stream processor; labels: Interactive query, CDC events from KTable, CDC Stream, CQRS, Elastic]

Streaming stack
● Producer/Consumer: subscribe(), poll(), send(), flush(), beginTransaction(), …
● Kafka Streams: KStream, KTable, filter(), map(), flatMap(), join(), aggregate(), transform(), …
● KSQL: CREATE STREAM, CREATE TABLE, SELECT, JOIN, GROUP BY, SUM, …
● KSQL UDFs
Trade-off axis: Ease of Use vs. Flexibility

KSQL
SELECT u.country, count(u.name)
FROM user-reg u
WINDOW TUMBLING (SIZE 1 MIN)
GROUP BY u.country
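
What a one-minute tumbling window count computes can be modelled in plain Java. This is a sketch of the semantics only, not KSQL or Kafka Streams code: the `Registration` record, the `TumblingCountSketch` class, and the `windowStart/country` key format are assumptions made for illustration.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical registration event; field names are assumptions.
record Registration(long timestampMs, String name, String country) {}

final class TumblingCountSketch {
    // Assigns each event to the fixed, non-overlapping window that contains its timestamp
    // and counts events per (window start, country).
    static Map<String, Long> countByCountry(List<Registration> events, long windowSizeMs) {
        Map<String, Long> counts = new TreeMap<>();
        for (Registration r : events) {
            long windowStart = Math.floorDiv(r.timestampMs(), windowSizeMs) * windowSizeMs;
            counts.merge(windowStart + "/" + r.country(), 1L, Long::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<Registration> events = List.of(
                new Registration(0L, "ann", "UK"),
                new Registration(30_000L, "bob", "UK"),
                new Registration(61_000L, "cy", "UK"));
        System.out.println(countByCountry(events, 60_000L));
    }
}
```

Each event belongs to exactly one window, which is what distinguishes a tumbling window from hopping or sliding windows.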

Kafka Streams (DSL)
KStream<String, String> userBids = builder.stream("stream-user-bids");
final KTable<String, Long> bidCounts = userBids
    .flatMapValues(value ->
        Arrays.asList(pattern.split(value.toLowerCase())))
    .groupBy((user, count) -> user)
    .count();
bidCounts.toStream().to("streams-bid-count-output",
    Produced.with(stringSerde, longSerde));

Streaming patterns
Stream processor over STREAM /user-reg and STREAM /address
● FILTER: SELECT users > 18
● PROJECT: SELECT user.name
● JOIN: SELECT u.name, a.country from user-reg u JOIN address a WHERE u.id = a.id
● GROUP BY (TABLE): SELECT u.country, count(u.name) FROM user-reg u GROUP BY u.country
● WINDOW (TABLE): SELECT u.country, count(u.name) FROM user-reg u WINDOW TUMBLING (SIZE 1 MIN) group by u.country

Stream processors are uniquely
convergent.
Data + Processing
(sorry, DBAs)

All of your data
is
a stream of events

Streams are your persistence model
They are also
your local
database

The atomic unit for tackling complexity
Stream
processor
input events
output events
...or microservice or whatever...

Stream processor == Single atomic unit
It does one thing
Like

We think in terms of function
“Bounded Context”
(dataflow - choreography)

Let’s build something….
A simple dataflow series of processors
“Payment processing”

KPay looks like this:
https://github.com/confluentinc/demo-scene/tree/master/scalable-payment-processing

Bounded context
“Payments”
1. Payments inflight
2. Account processing [debit/credit]
3. Payments confirmed

Payments bounded context
choreography

Payments system: bounded context
[1] How much is being processed?
Expressed as:
- Count of payments inflight
- Total $ value processed
[2&3] Update the account balance
Expressed as:
- Debit
- Credit
[4] Confirm successful payment
Expressed as:
- Total volume today
- Total $ amount today

Payments system: AccountProcessor
accountBalanceKTable = inflight.groupByKey()
    .aggregate(
        AccountBalance::new,
        (key, value, aggregate) -> aggregate.handle(key, value),
        accountStore);
KStream<String, Payment>[] branch = inflight
    .map((KeyValueMapper<String, Payment, KeyValue<String, Payment>>)
        (key, value) -> {
            if (value.getState() == Payment.State.debit) {
                value.setStateAndId(Payment.State.credit);
            } else if (value.getState() == Payment.State.credit) {
                value.setStateAndId(Payment.State.complete);
            }
            return new KeyValue<>(value.getId(), value);
        })
KTable state (Kafka Streams)

Payments system: AccountBalance
public AccountBalance handle(String key, Payment value) {
    this.name = value.getId();
    if (value.getState() == Payment.State.debit) {
        this.amount = this.amount.subtract(value.getAmount());
    } else if (value.getState() == Payment.State.credit) {
        this.amount = this.amount.add(value.getAmount());
    } else {
        // report to dead letter queue via exception handler
        throw new RuntimeException("Invalid payment received:" + value);
    }
    this.lastPayment = value;
    return this;
}
https://github.com/confluentinc/demo-scene/.../scalable-payment-processing/.../model/AccountBalance.java
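
The aggregate step can be read as a fold: start from an initial balance and apply each payment event in order. The sketch below models that idea in plain Java without Kafka Streams. `BalanceFoldSketch`, its nested `Payment` record, and the two-value `State` enum are simplified, hypothetical stand-ins and are not the classes from the KPay repository.

```java
import java.math.BigDecimal;
import java.util.List;

final class BalanceFoldSketch {
    // Simplified stand-ins for illustration only.
    enum State { DEBIT, CREDIT }

    record Payment(String accountId, State state, BigDecimal amount) {}

    // Folds payment events into a balance, mirroring the shape of
    // aggregate(initializer, aggregator): state in, event in, new state out.
    static BigDecimal fold(BigDecimal initial, List<Payment> payments) {
        BigDecimal balance = initial;
        for (Payment p : payments) {
            balance = switch (p.state()) {
                case DEBIT -> balance.subtract(p.amount());
                case CREDIT -> balance.add(p.amount());
            };
        }
        return balance;
    }

    public static void main(String[] args) {
        List<Payment> payments = List.of(
                new Payment("acc-1", State.DEBIT, new BigDecimal("30.00")),
                new Payment("acc-1", State.CREDIT, new BigDecimal("5.00")));
        System.out.println(fold(new BigDecimal("100.00"), payments));
    }
}
```

Since the balance is derived purely from the event sequence, replaying the same events always rebuilds the same state, which is what lets a stream double as the persistence model.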

Payments system: event model
https://github.com/confluentinc/demo-scene/.../scalable-payment-processing/.../io/confluent/kpay/payments
Event as API

Bounded context
“Payments”
Is it enough?
no

“It’s asynchronous, I don’t trust it”
(some developer, 2018)

We only have one part of the picture
○ What about failures?
○ Upgrades?
○ How fast is it going?
○ What is happening - is it working?

Event-streaming pillars:
1. Business function (payment)
2. Instrumentation plane (trust)
3. …
4. ...

Instrumentation Plane (trust)
Goal: Prove the application is meeting business requirements
Metrics:
- Payments Inflight, Count and Dollar value
- Payment Complete, Count and Dollar value

3. Control plane (coordinate)
4. ...

Control Plane
Goal: Provide mechanisms to coordinate system behavior
Why: Recover from outage, DR, overload etc
Applied: Flow control, start, pause, bootstrap, scale, gate and rate limit
Model:
- Status [pause, resume]
- Gate processor [Status]
- etc
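
A gate processor driven by a status stream might look like the following plain-Java model. It is a sketch under assumptions rather than the talk's implementation: `GateSketch`, `onControl`, and `drain` are hypothetical names, and a list with an integer offset stands in for a topic and a consumer position. While paused, events are not buffered by the gate; they simply stay in the log until the gate reopens.

```java
import java.util.ArrayList;
import java.util.List;

final class GateSketch {
    enum Status { PAUSE, RESUME }

    private Status status = Status.RESUME;

    // Applies an event from the control stream.
    void onControl(Status next) {
        status = next;
    }

    boolean isOpen() {
        return status == Status.RESUME;
    }

    // Forwards events from the log starting at fromOffset while the gate is open.
    // Returns the next offset to read; a paused gate returns fromOffset unchanged.
    int drain(List<String> log, int fromOffset, List<String> out) {
        if (!isOpen()) {
            return fromOffset;
        }
        for (int i = fromOffset; i < log.size(); i++) {
            out.add(log.get(i));
        }
        return Math.max(fromOffset, log.size());
    }

    public static void main(String[] args) {
        GateSketch gate = new GateSketch();
        List<String> log = List.of("pay-1", "pay-2");
        List<String> out = new ArrayList<>();
        gate.onControl(Status.PAUSE);
        int offset = gate.drain(log, 0, out);      // nothing forwarded while paused
        gate.onControl(Status.RESUME);
        offset = gate.drain(log, offset, out);     // backlog forwarded after resume
        System.out.println(out + " next offset " + offset);
    }
}
```

Keeping the gate stateless apart from its status means a restart only needs the latest control event and the last committed offset to resume correctly.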

3. Control plane (coordinate)
4. Operational plane (run)

Operational Plane
Dependent on Control and Instrumentation planes
Dataflow patterns
● Application logs
● Error/Warning logs
● Audit logs
● Lineage
● Dead-letter-queues
Example topics written by a stream processor:
/dead-letter/bid/region/processor
/ops/logs/category/
/ops/metrics/elast
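
The dead-letter-queue pattern can be sketched in plain Java: a failing event is diverted, together with its error, instead of halting the dataflow. This is a minimal model under assumptions: `DeadLetterSketch` and its two in-memory lists are hypothetical stand-ins for an output topic and a dead-letter topic, and no Kafka API is used.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

final class DeadLetterSketch {
    // Stand-ins for the output topic and the dead-letter topic.
    final List<String> output = new ArrayList<>();
    final List<String> deadLetters = new ArrayList<>();

    // Applies the handler; on failure the original event and the error message
    // are recorded on the dead-letter list and processing continues.
    void process(String event, Function<String, String> handler) {
        try {
            output.add(handler.apply(event));
        } catch (RuntimeException ex) {
            deadLetters.add(event + " | " + ex.getMessage());
        }
    }

    public static void main(String[] args) {
        DeadLetterSketch processor = new DeadLetterSketch();
        Function<String, String> doubler = s -> String.valueOf(Integer.parseInt(s) * 2);
        processor.process("21", doubler);
        processor.process("not-a-number", doubler);
        System.out.println(processor.output);
        System.out.println(processor.deadLetters);
    }
}
```

Keeping the original event alongside the error makes it possible to fix the handler and replay the dead letters later.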

Architectural pillars
● Core dataflow: /payments/incoming -> PAY -> /payments/confirmed
● Control plane: /control/state (START, STOP), /control/status, stream.filter()
● Instrumentation plane: /payments/confirmed -> BIZ METRIC, exposed via IQ
● Operational plane: /payments/dlq -> ERROR, WARN, exposed via IQ

Single Bounded Context (dataflow)
Choreography:
- Capture business function as a bounded context
- Events as API
Steps: 1. Payment Inflight, 2. Accounts [from], 3. Accounts [to], 4. Payment Conf’d
Topics: payment.incoming, payment.inflight, payment.complete, payment.confirmed

Multiple Bounded contexts
Choreography:
- Chaining
- Layering
Example chain: payment.incoming -> 1. Payment -> payment.complete -> 2. Logistics

Multiple Bounded contexts
Orchestration
○ Captures workflow
○ Controls bounded context interaction
○ Business Process Model and Notation 2.0 (BPMN)
(Zeebe, Apache Airflow)
Source: https://docs.zeebe.io/bpmn-workflows/README.html

Pattern: Central nervous system
[Diagram: a central nervous system of event streams connecting Payments, Department 2, Department 3 and Department 4, each running apps and {faas} functions]

Patterns: Topic naming
What is going on here?
[Diagram: apps and {faas} functions in Payments and Department 2]
bikeshedding (uncountable)
1. Futile investment of time and energy in
discussion of marginal technical issues.
2. Procrastination.
https://en.wiktionary.org/wiki/bikeshedding
Parkinson observed that a committee whose
job is to approve plans for a nuclear power
plant may spend the majority of its time on
relatively unimportant but easy-to-grasp
issues, such as what materials to use for the
staff bikeshed, while neglecting the design of
the power plant itself, which is far more
important but also far more difficult to
criticize constructively.

Patterns: Topic conventions
Chris:
<message type>.<dataset name>.<data name>
Message types:
● logging
● queuing
● tracking
● etl/db
● streaming
● push
● user
Variants:
<app-context>.<message type>.<data name>
<dept>.<region?>.<app-group>.<app-name>.<message type>.<data name>
Model the organization
source: Chris Riccomini
https://riccomini.name/how-paint-bike-shed-kafka-topic-naming-conventions
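
A convention is easier to keep when topic names are built by code rather than typed by hand. The sketch below builds names in the `<app-context>.<message type>.<data name>` variant and rejects malformed segments. `TopicNameSketch` is a hypothetical helper, and the allowed character set (lowercase letters, digits, hyphens) is an assumption chosen for illustration, not a rule from the talk.

```java
import java.util.regex.Pattern;

final class TopicNameSketch {
    // Assumed segment rule: lowercase letters, digits and hyphens only.
    private static final Pattern SEGMENT = Pattern.compile("[a-z0-9-]+");

    // Builds <app-context>.<message type>.<data name>, rejecting invalid segments.
    static String topic(String appContext, String messageType, String dataName) {
        for (String segment : new String[] {appContext, messageType, dataName}) {
            if (segment == null || !SEGMENT.matcher(segment).matches()) {
                throw new IllegalArgumentException("Invalid topic segment: " + segment);
            }
        }
        return String.join(".", appContext, messageType, dataName);
    }

    public static void main(String[] args) {
        System.out.println(topic("kpay", "streaming", "payments-inflight"));
    }
}
```

Centralising the rule in one shared helper also limits the bikeshedding to a single place.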

Cloud Events
https://cloudevents.io/

Example
{
    "specversion" : "0.4-wip",
    "type" : "com.github.pull.create",
    "source" : "https://github.com/cloudevents/spec/pull",
    "subject" : "123",
    "id" : "A234-1234-1234",
    "time" : "2018-04-05T17:31:00Z",
    "comexampleextension1" : "value",
    "comexampleextension2" : {
        "othervalue": 5
    },
    "datacontenttype" : "text/xml",
    "data" : "<much wow=\"xml\"/>"
}
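
The envelope idea can be sketched without any SDK: wrap the payload in a map of context attributes and check that the attributes the CloudEvents specification requires (`specversion`, `type`, `source`, `id`) are present. `EnvelopeSketch` and its methods are hypothetical names, the sample values are made up, and this is not the CloudEvents Java SDK.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class EnvelopeSketch {
    // Context attributes that every CloudEvent must carry.
    static final List<String> REQUIRED = List.of("specversion", "type", "source", "id");

    // Wraps a payload in an envelope of context attributes.
    static Map<String, Object> envelope(String specVersion, String type, String source,
                                        String id, Object data) {
        Map<String, Object> event = new LinkedHashMap<>();
        event.put("specversion", specVersion);
        event.put("type", type);
        event.put("source", source);
        event.put("id", id);
        event.put("data", data);
        return event;
    }

    // Returns the required attributes that are absent or empty.
    static List<String> missing(Map<String, ?> event) {
        List<String> absent = new ArrayList<>();
        for (String attribute : REQUIRED) {
            Object value = event.get(attribute);
            if (value == null || value.toString().isEmpty()) {
                absent.add(attribute);
            }
        }
        return absent;
    }

    public static void main(String[] args) {
        Map<String, Object> event = envelope(
                "1.0", "com.example.payment.confirmed", "/payments", "A234-1234-1234", "{}");
        System.out.println(missing(event).isEmpty() ? "valid envelope" : "missing " + missing(event));
    }
}
```

A check like this belongs at the boundary where events are exposed to external applications, so that consumers can rely on the envelope regardless of the payload format.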

Cloud Events
● Specification (an Event envelope)
● Transport Bindings (Kafka, AMQP, HTTP, others)
● SDKs: Java, .NET, Go etc
● Still early - nearly version 1.0
● Useful for exposing to external apps (Events as APIs)
● Support is coming for Kafka SerDes and CE.clients (sdk)
CloudEvents is a specification for describing event data in common formats
to provide interoperability across services, platforms and systems.
You will adopt CloudEvents

Best practice for scale:
● Organise into Apps
● Apps comprised of dataflows
● Controlling context - Events as APIs
● Build once and share (instrumentation, control, lineage)
● Dataflow comprised of topic context (.../app/proc1)
● Topic naming conventions at Enterprise and App Level (BikeShedding!)
● Enable self-service (discovery, authorisation etc)
● Automation everywhere

What about that software crisis that started in
1968?
“We believe that the major contributor to this complexity
in many systems is the handling of state and the burden
that this adds when trying to analyse and reason about
the system.”
Out of the tar pit, 2006

Our mental model: Abstraction as an Art
Chained/Orchestrated
Bounded contexts
Stream processor
Stream
Event
Pillars
Business function Control plane Instrumentation Operations
Bounded context

Key takeaway (state)
Event-streaming microservices are the atomic unit to:
1. Provide simplicity (and time travel)
2. Handle state (via Kafka Streams)
3. Provide a new paradigm: convergent data and logic processing
Stream
processor

Key takeaway (complexity)
● Event-Streaming apps: model as bounded-context dataflows, handle state & scaling
● Patterns: Build reusable dataflow patterns (instrumentation)
● Composition: Bounded contexts chaining and layering
● Composition: Choreography and Orchestration

Questions?
@avery_neil
“Journey to event driven” blog
1. Event-first thinking
2. Programming models
3. Serverless
4. Pillars of event-streaming microservices
Series linked on the @avery_neil twitter profile

THANK YOU
@avery_neil
neil@confluent.io
cnfl.io/meetups cnfl.io/blog cnfl.io/slack
