Event driven architectures with Kinesis

Mark Harrison
Justin Potter

Agenda
● MONOLITH!
● Background
● Microservice spaghetti
● Microservice eventing
● Kinesis Overview
● (Soon to be) Open source Kinesis Driver
● Join Us

Long ago in a …...
The traditional Oracle-backed monolith architecture
● Tight and ever-increasing coupling
● Difficult to scale with users and features
● Difficult to maintain
● Difficult to onboard new developers
● Lacked modularity

Background
Journal (Tracking) - When a user enters a food, weight, or activity into Weight Watchers, it is sent to Journal.
Program (Points Calculation) - When a user wishes to view their Weight Watchers points, a call is made to Program to calculate and retrieve their point allocation. Program depends on the Journal service for its food tracking.

We needed something different!
Microservices!!
● Scala, Akka, Play, Cassandra
● REST-based services
● Each service represents a single domain concept
○ User Profile, Entitlements, Program …

So… how’d that work out for you???
It turns out magic bullets aren’t magic after all!!
● Features cross service boundaries, a LOT
● New features often increase requests between services
○ So one request now hits two services, that’s a 100% increase!
● Immediate consistency means reduced availability
○ I’m looking at you… REST
● Scaling out worked OK, just add more nodes!
● Broadcasting data to other teams results in a direct dependency
● Not enough emphasis on logging and monitoring
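
The availability point can be made concrete with a little arithmetic. This is a minimal sketch, assuming services fail independently and using a hypothetical 99.9% availability per service (neither figure comes from the talk): a request that must synchronously reach every service in a chain is only as available as the product of their availabilities.

```python
import math


def chained_availability(availabilities):
    """Availability of a request that needs every service in the chain to answer."""
    return math.prod(availabilities)


# Hypothetical figures: each service is up 99.9% of the time.
one_service = chained_availability([0.999])
two_services = chained_availability([0.999, 0.999])
```

Under these assumptions, every extra synchronous hop lowers the availability of the whole request, which is one reason to prefer events over chained REST calls.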

Pros
● Domain-ish Driven Design
● Easier to onboard developers

Cons
● Convoluted JSON responses
● Complicated integration testing
● No way to broadcast events to other teams
● Data duplication between services

Fix all the things!!!
More “Reactive”
● Better monitoring
● Decouple the services
● More concise event payloads
● Services hold their own state
● Backpressure

But… How? What? Um...
Considerations...
● Accept that eventual consistency is inevitable
● Some services do too many things, some should be merged together!
● The APIs will give the latest known state
● Deal with the fact that duplicates will happen
● Did I mention better monitoring??
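
One common way to deal with duplicates is to make the consumer idempotent. The sketch below is an illustration rather than the approach used in the talk: it assumes every event carries a unique `event_id` (a hypothetical field), and the `Event` and `PointsProjection` names are invented for the example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Event:
    event_id: str  # hypothetical unique id carried by every event
    user_id: str
    points: int


@dataclass
class PointsProjection:
    """Holds its own state and ignores events it has already applied."""
    totals: dict = field(default_factory=dict)
    seen: set = field(default_factory=set)

    def apply(self, event: Event) -> bool:
        # A redelivered event is dropped, so applying it twice is harmless.
        if event.event_id in self.seen:
            return False
        self.seen.add(event.event_id)
        self.totals[event.user_id] = self.totals.get(event.user_id, 0) + event.points
        return True


projection = PointsProjection()
tracked = Event("evt-1", "user-42", 7)
projection.apply(tracked)
projection.apply(tracked)  # duplicate delivery, no effect
```

A real service would need to persist and bound the set of seen ids; the in-memory set here only shows the idea.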

Kinesis
Think Kafka, but not :)
● “Real-time” streaming platform
● Multiple applications can publish to and consume from the same stream
● Geared toward higher-latency workloads
● Messages are consumed in batches
● Elastic - easy to scale up and down
● Some interesting constraints (more on that soon!)

Key concepts
● Stream - An ordered sequence of data records. Each stream has a unique name.
● Data Record - Unit of data stored in a Stream. Composed of a Sequence Number, Partition Key and Data Blob.
● Partition Key - Used to control distribution of records
● Sequence Number - Each record has a sequence number. Sequence numbers for the same partition key generally increase over time (non-sequentially).
● SubSequence Number - When records are aggregated, multiple records in the batch share a sequence number. In this case, the SubSequence Number is combined with the sequence number to uniquely identify each record.
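
To show how a partition key controls distribution, here is a small sketch. Kinesis maps each partition key to a 128-bit integer with MD5 and routes the record to the shard whose hash key range contains that value. The even split of the hash space and the function names below are assumptions made for the example.

```python
import hashlib

HASH_SPACE = 2 ** 128  # MD5 yields a 128-bit integer


def hash_key(partition_key: str) -> int:
    """Map a partition key to a 128-bit integer via MD5."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


def shard_for(partition_key: str, shard_count: int) -> int:
    """Pick the index of the shard whose hash range contains the key.

    Assumes the hash space is split evenly across shards.
    """
    range_size = HASH_SPACE // shard_count
    return min(hash_key(partition_key) // range_size, shard_count - 1)
```

Because the mapping is deterministic, every record for the same key (for example a user id) lands on the same shard, which is what keeps that key's records in order.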

Even more key concepts
● Shard - A group of data records in a stream. A stream has one or more Shards. A Shard is a unit of throughput capacity and therefore determines the throughput of the Stream
● Producer - Puts messages onto a Shard
● Consumer - Gets data records from one or more Shards. If multiple consumers share a name, they therefore share a checkpoint position.
● Checkpointing - The per consumer process of tracking the latest consumed record.
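
Checkpointing can be pictured as a small lookup keyed by consumer name and shard. The sketch below is an in-memory illustration only; the `CheckpointStore` class is hypothetical, and sequence numbers are simplified to plain integers.

```python
class CheckpointStore:
    """In-memory stand-in for wherever checkpoints are persisted."""

    def __init__(self):
        self._positions = {}  # (consumer_name, shard_id) -> last sequence number

    def checkpoint(self, consumer_name: str, shard_id: str, sequence_number: int) -> None:
        key = (consumer_name, shard_id)
        # Never move backwards: a late checkpoint must not rewind progress.
        if sequence_number > self._positions.get(key, -1):
            self._positions[key] = sequence_number

    def position(self, consumer_name: str, shard_id: str):
        return self._positions.get((consumer_name, shard_id))


store = CheckpointStore()
store.checkpoint("program-service", "shard-0", 100)
store.checkpoint("program-service", "shard-0", 90)  # stale, ignored
store.checkpoint("audit-service", "shard-0", 40)    # different name, own position
```

Instances that share the name `program-service` see the same position, while `audit-service` reads the same shard at its own pace.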

Constraints
Wait.. it’s not all sunshine and roses?
● Data can be persisted in Kinesis for up to 7 days, with an initial default of 1 day.
● A Shard is a unit of throughput capacity
○ Reads - up to 5 transactions per second, with a maximum total data read rate of 2 MB per second
○ Writes - up to 1,000 records per second, up to a maximum total data write rate of 1 MB per second (including partition keys)
● When one application has multiple consumers, thereby sharing one checkpoint position, you must have at least one shard per instance
○ Think of a database table which tracks the current progress, in which the primary key is a combination of the application name and shard id
● You are charged on a per shard basis
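
The write limits and the one-shard-per-instance rule above translate into a simple sizing calculation. This is a rough sketch under stated assumptions: 1 MB is treated as 10**6 bytes, traffic is assumed to spread evenly across shards, and the function name and parameters are invented for the example.

```python
import math

# Per-shard write limits quoted above.
MAX_RECORDS_PER_SEC = 1_000
MAX_WRITE_BYTES_PER_SEC = 1_000_000  # 1 MB, treated as 10**6 bytes in this sketch


def shards_needed(records_per_sec: int, avg_record_bytes: int, consumer_instances: int = 1) -> int:
    """Smallest shard count that satisfies both write limits and the instance rule."""
    by_count = math.ceil(records_per_sec / MAX_RECORDS_PER_SEC)
    by_bytes = math.ceil(records_per_sec * avg_record_bytes / MAX_WRITE_BYTES_PER_SEC)
    # Each consumer instance of one application needs at least one shard.
    return max(1, by_count, by_bytes, consumer_instances)
```

For example, 2,500 small records per second is bound by the record count, while 500 records of 5,000 bytes each is bound by the byte rate; both need three shards. Since billing is per shard, this number is also a first estimate of cost.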

Interfacing with Kinesis
Out of the box, Amazon provides two libraries for programmatically interfacing with Kinesis:
● KPL - Kinesis Producer Library
● KCL - Kinesis Client Library
Both are available in Java and handle a number of low-level concerns:
● Stream connection and disconnection
● Enumeration of shards
● Parallel processing of the stream: consuming from and producing to a number of shards
● Shard worker allocation and reallocation, balancing shards across workers
● Batching and aggregation of records

So what’s lacking???
Nobody’s perfect, right?
● Java only, usage involves some interesting use of inheritance
● Asynchronous & non-blocking processing on the consumer
● Foolproof and non-blocking checkpointing
● Throttling to reduce memory footprint
● Smarter per message checkpointing
● Hard to prevent the driver code from becoming tangled with your business logic

Introducing...
The Weight Watchers Kinesis client
<Insert cool logo here>
Coming to a GitHub repo near you soon…..

Producer
Scala & (optionally) Akka based producer
● Wraps the KPL driver
● Choice of Scala Future or Akka based interface
● Scala interface
○ Returns a Future for each message
○ Completes when send (batch) is successful
● Actor interface
○ Fire and forget or callback messages
○ Optional throttling to limit the number of unsent messages and therefore Futures
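
The throttling idea can be sketched in a few lines. This is not the client's actual implementation, which is Scala and Akka based; it is a Python illustration in which `ThrottledProducer` and `send_fn` are hypothetical names, and the caller is blocked (rather than messaged) once too many sends are outstanding.

```python
import threading
from concurrent.futures import Future, ThreadPoolExecutor


class ThrottledProducer:
    """Caps the number of in-flight sends; send_fn stands in for the real client."""

    def __init__(self, send_fn, max_outstanding: int):
        self._send_fn = send_fn
        self._slots = threading.BoundedSemaphore(max_outstanding)
        self._pool = ThreadPoolExecutor(max_workers=max_outstanding)

    def send(self, message) -> Future:
        # Blocks the caller once max_outstanding sends are pending.
        self._slots.acquire()
        try:
            future = self._pool.submit(self._send_fn, message)
        except BaseException:
            self._slots.release()
            raise
        # Free the slot when the send finishes, whether it succeeded or failed.
        future.add_done_callback(lambda _: self._slots.release())
        return future

    def close(self) -> None:
        self._pool.shutdown(wait=True)


producer = ThrottledProducer(send_fn=lambda m: f"sent:{m}", max_outstanding=2)
futures = [producer.send(i) for i in range(5)]
results = [f.result() for f in futures]
producer.close()
```

Each call returns a future that completes when the send does, and at most two futures are ever pending, which bounds the memory held by unsent messages.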

Consumer
Scala & Akka based consumer
● Wraps the KCL library
● Provides foolproof checkpointing
○ Allows message failures within a configurable threshold
● Messages sent for processing to provided Actor
● Configurable retries
● Asynchronous processing and checkpointing
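
The interplay of retries, a failure threshold and checkpointing might look roughly like this. It is a synchronous sketch, not the client's code: `process_batch`, `handler`, `max_retries` and `failure_tolerance` are hypothetical names chosen for the example.

```python
def process_batch(records, handler, max_retries: int, failure_tolerance: int):
    """Return (should_checkpoint, failed_records) for one batch.

    handler is a hypothetical callback that raises when processing fails.
    """
    failed = []
    for record in records:
        for attempt in range(max_retries + 1):
            try:
                handler(record)
                break
            except Exception:
                if attempt == max_retries:
                    failed.append(record)
    # Checkpoint only when failures stay within the tolerated threshold.
    return len(failed) <= failure_tolerance, failed


attempts = {}


def handler(record):
    attempts[record] = attempts.get(record, 0) + 1
    # "c" always fails; "b" fails on its first attempt only.
    if record == "c" or (record == "b" and attempts[record] == 1):
        raise RuntimeError("processing failed")


ok, failed = process_batch(["a", "b", "c"], handler, max_retries=2, failure_tolerance=1)
```

Here `b` succeeds on its retry, `c` exhausts its retries, and the batch is still checkpointed because one failure is within the tolerance.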

Performance
Performance scales reasonably well with the number of shards: throughput rises as shards are added, though less than linearly.
● 1 shard, 5,000,000 messages: 42,016 records/sec, 119 seconds elapsed
● 2 shards, 5,000,000 messages: 74,626 records/sec, 67 seconds elapsed
● 5 shards, 10,000,000 messages: 140,845 records/sec, 71 seconds elapsed
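
The scaling can be checked directly from the figures above. The snippet below only does arithmetic on the reported message counts and elapsed times; the `summarize` helper is introduced for the example.

```python
runs = [
    # (shards, messages, seconds elapsed), taken from the figures above
    (1, 5_000_000, 119),
    (2, 5_000_000, 67),
    (5, 10_000_000, 71),
]

baseline = runs[0][1] / runs[0][2]  # records/sec on a single shard


def summarize(shards, messages, seconds):
    rate = messages / seconds
    return {
        "shards": shards,
        "records_per_sec": int(rate),
        "per_shard": int(rate / shards),
        "speedup": round(rate / baseline, 2),
    }


summary = [summarize(*run) for run in runs]
```

Two shards give about 1.78x the single-shard rate and five shards about 3.35x, so per-shard throughput falls from roughly 42,000 to 28,000 records/sec. Note that the five-shard run used twice as many messages, so the runs are not strictly like for like.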

Mark Harrison
@markglh
Justin Potter
We’re Hiring!!
www.weightwatchers.com/us/corporate-careers
Or email: Joanna.mark@weightwatchers.com
