Event-sourced systems
With Kafka, Clojure, and Jackdaw
I’m Bryce Covert
VP Engineering Solutions @ PeloTech
I don’t like mayonnaise
Definition - Event-sourced systems
• event (noun) - something that occurred in a certain place during a particular moment of
time
• sourced (verb) - to give or trace the source for something
State
Event
state = ƒ(events)
Definition - Event-sourced systems
• event (noun) - something that occurred in a certain place during a particular moment of
time
• sourced (verb) - to give or trace the source for something
• event-sourced system - a system that permanently retains traces for everything that
occurs?
• Event-sourced system - a system whose state is entirely traced back to things that have
already occurred ( i.e., state = ƒ(events) )
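The definition state = ƒ(events) can be sketched as a plain reduce. This is a minimal illustration, not code from the talk: the event shapes, `apply-event`, and `current-state` are hypothetical names, loosely based on the speaker's "Bryce ate a sandwich" example.

```clojure
;; Hypothetical events; the keys and values are assumptions for illustration.
(def events
  [{:event-type :ate,        :who "Bryce", :what "sandwich"}
   {:event-type :got-hungry, :who "Bryce"}
   {:event-type :ate,        :who "Bryce", :what "salad"}])

(defn apply-event
  "Folds one event into the state, a map of person -> {:hungry? boolean}."
  [state {:keys [event-type who]}]
  (case event-type
    :ate        (assoc-in state [who :hungry?] false)
    :got-hungry (assoc-in state [who :hungry?] true)
    state)) ; unknown event types leave the state unchanged

(defn current-state
  "state = f(events): rebuild the state from scratch by replaying every event."
  [events]
  (reduce apply-event {} events))
```

Because the state is derived rather than stored, replaying a prefix of the log, for example `(current-state (take 2 events))`, gives the state as of that point in time.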
Why?
• See Datomic
• Limit point-to-point communications / commands
• Simulation
• Replay
• Reconcilable
• Recoverable
Simulation
How to build Event-sourced systems, sort of
• Webhooks
• Queues
• Pull-based query
How to build Event-sourced systems
• Kafka
• Kafka Streams library
• Clojure (w/ jackdaw)
How to build Event-sourced systems
• Other viable approaches
• Flink
• Spark Streaming
• Akka Streaming
What we’ll build
[{:flight "UA1496"}
 {:event-type :passenger-boarded
  :who "Leslie Nielsen"
  :time #inst "2019-03-16T00:00:00.000-00:00"
  :flight "UA1496"}]
[{:flight "UA1496"}
 {:event-type :departed
  :time #inst "2019-03-16T00:00:00.000-00:00"
  :flight "UA1496"
  :scheduled-departure #inst "2019-03-15T00:00:00.000-00:00"}]
[{:flight "UA1496"}
 {:event-type :arrived
  :time #inst "2019-03-17T04:00:00.000-00:00"
  :flight "UA1496"}]
[{:flight "UA1496"}
 {:event-type :passenger-departed
  :who "Leslie Nielsen"
  ;; remaining fields are cut off on the slide
  }]
What we’ll build
• Analytics
• Which flights are delayed and on time?
• How long is a flight in the air?
• Who is on the plane?
• Count passengers while they board the plane
• Query
• Are my friends on the plane?
• Decisions
• Clean the plane when the last passenger leaves
• Simulate (fixing a bug)
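Before any Kafka is involved, the analytics questions above can be sketched as pure functions over the [key value] event tuples. This is only an illustration under assumptions: `flight-events` copies three of the sample events from the slides, while `delayed?`, `hours-in-air`, and `passengers-on-board` are hypothetical helper names, not the talk's implementation. `inst-ms` needs Clojure 1.9 or later.

```clojure
(def flight-events
  [[{:flight "UA1496"} {:event-type :passenger-boarded
                        :who "Leslie Nielsen"
                        :time #inst "2019-03-16T00:00:00.000-00:00"
                        :flight "UA1496"}]
   [{:flight "UA1496"} {:event-type :departed
                        :time #inst "2019-03-16T00:00:00.000-00:00"
                        :flight "UA1496"
                        :scheduled-departure #inst "2019-03-15T00:00:00.000-00:00"}]
   [{:flight "UA1496"} {:event-type :arrived
                        :time #inst "2019-03-17T04:00:00.000-00:00"
                        :flight "UA1496"}]])

(defn delayed?
  "True when a :departed event value happened after its scheduled departure."
  [departed-event]
  (> (inst-ms (:time departed-event))
     (inst-ms (:scheduled-departure departed-event))))

(defn hours-in-air
  "Hours between the :departed and :arrived events, or nil if either is missing."
  [events]
  (let [by-type  (into {} (map (fn [[_k v]] [(:event-type v) v])) events)
        departed (:time (:departed by-type))
        arrived  (:time (:arrived by-type))]
    (when (and departed arrived)
      (/ (- (inst-ms arrived) (inst-ms departed)) 3600000.0))))

(defn passengers-on-board
  "Set of passengers who have boarded and not yet left the plane."
  [events]
  (reduce (fn [on-board [_k {:keys [event-type who]}]]
            (case event-type
              :passenger-boarded  (conj on-board who)
              :passenger-departed (disj on-board who)
              on-board))
          #{}
          events))
```

"Are my friends on the plane?" then becomes a set lookup, for example `(contains? (passengers-on-board flight-events) "Leslie Nielsen")`.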
Kafka
Producers / Consumers
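;; jc is presumably an alias for jackdaw.client; app-config, topic-config,
;; k, v, and process-result are defined elsewhere.

;; Producer: send one record and block until the broker acknowledges it.
(with-open [producer (jc/producer app-config topic-config)]
  @(jc/produce! producer topic-config k v))

;; Consumer: poll forever (200 ms timeout) and process each record.
(with-open [subscription (jc/subscribed-consumer app-config
                                                 topic-config)]
  (loop [results (jc/poll subscription 200)]
    (doseq [result results]
      (process-result result))
    (recur (jc/poll subscription 200))))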
Kafka Streams
Kafka Streams - Prior art
• Transducers - separate “the plan” from execution
• Netflix Reactive Extensions (Rx)
• Apache Samza
Kafka Streams
(-> builder
    (j/kstream (topic-config "flight-events"))
    (j/filter (fn [[k v]]
                (= (:event-type v) :departed)))
    (j/to (topic-config "flight-departed-events")))
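The "separate the plan from execution" idea credited to transducers can be sketched without Kafka. Below, the same :departed filter is written once as a transducer and then run three different ways. The sample `records` and the name `departed-only` are hypothetical; this is an analogy for how a topology is described before it runs, not how Kafka Streams executes it.

```clojure
(def departed-only
  "The plan: keep only [key value] records whose value is a :departed event."
  (filter (fn [[_k v]] (= (:event-type v) :departed))))

;; Hypothetical sample records in the same [key value] shape as the slides.
(def records
  [[{:flight "UA1496"} {:event-type :passenger-boarded :who "Leslie Nielsen"}]
   [{:flight "UA1496"} {:event-type :departed}]
   [{:flight "UA1496"} {:event-type :arrived}]])

;; Execution 1: eagerly into a vector.
(def departed-vec (into [] departed-only records))

;; Execution 2: as a lazy sequence.
(def departed-seq (sequence departed-only records))

;; Execution 3: reduce to a count without building a collection.
(def departed-count
  (transduce departed-only (completing (fn [n _record] (inc n))) 0 records))
```

The filter itself never mentions where the records come from or where they go, which is the property the stream-building DSL shares.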
What we’ll build
• Analytics (Kafka topic → processing → Kafka Topic)
• Which flights are delayed and on time?
• How long is a flight in the air?
• Who is on the plane?
• Count passengers while they board the plane
• Query (Kafka topic → processing → State Store)
• Are my friends on the plane?
• Decisions (Kafka topic → processing → side effect)
• Clean the plane when the last passenger leaves
• Simulate (fixing a bug)
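The "Decisions" category (topic → processing → side effect) can be sketched by keeping the decision logic pure and leaving the side effect to the caller. This is a hedged illustration: `step`, `decisions`, and the `:clean-plane` decision map are hypothetical names, and the events are plain value maps rather than Kafka records.

```clojure
(defn step
  "Takes the set of passengers on board and one event value.
   Returns [new-on-board decisions-triggered-by-this-event]."
  [on-board {:keys [event-type who flight]}]
  (case event-type
    :passenger-boarded  [(conj on-board who) []]
    :passenger-departed (let [remaining (disj on-board who)]
                          [remaining
                           ;; decide only when the plane goes from occupied to empty
                           (if (and (seq on-board) (empty? remaining))
                             [{:decision :clean-plane :flight flight}]
                             [])])
    [on-board []]))

(defn decisions
  "Replays events in order and collects every decision made along the way."
  [events]
  (second
   (reduce (fn [[on-board acc] event]
             (let [[next-on-board new-decisions] (step on-board event)]
               [next-on-board (into acc new-decisions)]))
           [#{} []]
           events)))
```

Because `decisions` is a pure function of the event log, the same log can be replayed against a changed `step` to see what the system would have decided, which is one way to read "Simulate (fixing a bug)".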
Jackdaw
From FundingCircle
Demo
Best Practices
• Determinism
• Corrections
• Topic Hierarchy
• Transactionality
• Event Structure
• Retention
Thanks!
• https://github.com/brycecovert/clojure-event-sourcing
• Slides on slideshare.net
More Resources
• https://docs.confluent.io/current/streams/developer-guide/dsl-api.html
• https://kafka.apache.org/20/javadoc/org/apache/kafka/streams/kstream/KStream.html
• https://github.com/FundingCircle/jackdaw

Editor's Notes

  • #5 In this case, “something” is state
  • #6 State might be “Bryce is not hungry”
  • #7 An event might be “Bryce ate a sandwich at noon”
  • #10 Is it just an append-only ledger?
  • #12 Datomic: “A transactional database with a flexible data model, elastic scaling, and rich queries.” Transactional / Indelible / Chronological. Time travel: you can see what you thought the world looked like as of a particular point in time. Point-to-point (TODO: give example): poor abstraction (I have to know what needs to happen when something happens), brittle, high cost of failure.
  • #18 Simulation - What decisions would my system make if X happened? What if the sandwich has mayo on it? Replay - I fixed a bug; how do I fix the data? Reconcilable - You say X, I say Y. Who’s right, and how do we reach agreement? Recoverable - If my system goes down, how does it get corrected?
  • #19 Webhooks allow consumers to subscribe, but offer no replay. Most queues (e.g., SQS) support only a single consumer. Pull-based - latency, idempotent. None of these are transactional.
  • #20 Show of hands Kafka, Kafka Streams library, jackdaw
  • #22 An event-sourced system that tracks flight departures
  • #23 4 events. An event is a key-value pair, which is represented by this tuple-vector.
  • #30 How long was the flight in the air? This gets transformed into a “topology”.
  • #32 Going back to what we build, we can rethink these categories
  • #36
    * Determinism - You can't guarantee order across different partitions. Spread your keys out as much as possible while ensuring order is respected. The partition key can be separate from the event's key.
    * Determinism - Turn the process off for a week. Do you get the same results as leaving it on?
    * Corrections - You can't delete data. Your systems and downstream systems need to be able to handle corrections. There are at least two ways: an event with the same id is a corrected event, or you can proactively void events.
    * Topic hierarchy - Observations and decisions shouldn't be sourced from the same topic; otherwise you can't simulate. Prefer to merge the streams.
    * Transactionality - Exactly-once delivery.
    * Event structure - The state of the world is a graph; events are changes or updates to that graph. Small event domain = customized events. Large, complex event domain = be explicit about the graph changes for your consumers.
    * Retention - By default, Kafka keeps data for a certain amount of time. If you can't store all history, I recommend using compaction, which makes sure that there is always at least one record per key.
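The two correction styles mentioned in the notes (a later event with the same id replaces the earlier one, or an event is explicitly voided) can be sketched as a fold over the log. The `:id` key, the `:voided` event type, and `effective-events` are assumptions made for illustration; the talk does not specify an event schema for corrections.

```clojure
(defn effective-events
  "Collapses a log so that the latest event per :id wins and voided ids
   disappear. Events keep the order in which their id first appeared."
  [events]
  (let [latest-by-id (reduce (fn [m {:keys [id] :as event}] (assoc m id event))
                             {}
                             events)]
    (->> events
         (map :id)
         distinct
         (map latest-by-id)
         (remove #(= :voided (:event-type %))))))

;; Hypothetical log: event 1 is later corrected, event 2 is later voided.
(def log
  [{:id 1 :event-type :passenger-boarded :who "Lesley Nielsen"}
   {:id 2 :event-type :passenger-boarded :who "Passenger B"}
   {:id 1 :event-type :passenger-boarded :who "Leslie Nielsen"} ; correction
   {:id 2 :event-type :voided}])                                ; void
```

Keeping only the latest record per key is also the guarantee the notes attribute to compaction, so a consumer written this way should see the same result whether or not older records have been compacted away.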