Adding some Rust to your Kafka
About me
Gerard Klijs
● @GKlijs
● https://github.com/gklijs
● Java developer with
● Used Kafka several times
● Author of
schema_registry_converter
● Living in Papendrecht
Contents
● Test setup
● 3 languages
● 4 Implementations
● Results of the tests
● Conclusion
● Questions
Bank simulation with end-to-end performance test
Overview of the whole system
Every yellow component will be explained in detail. We use the Schema Registry
for schema management. All messages have a string as key and Avro as value
type. No d
Meaning of the colors:
● Orange are the Confluent platform parts
● Yellow are the parts of open-bank-mark
● Green is an NginX instance
● Light blue are PostgreSQL databases
Topology
Common dependency of the other parts.
Several functions:
● One way to set the logging for all components.
● All components know the topics and data types without
needing to connect.
● Generates Avro objects for (de)serialization.
● Functions wrapping the (Java) Kafka Consumer and Producer.
● Functions for dealing with IBAN and UUID.
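One typical task when dealing with IBANs is validating the check digits. The sketch below shows the standard mod-97 check in Rust. Whether the topology module performs exactly this check is an assumption, and the function name `iban_checksum_valid` is hypothetical. It only verifies the checksum, not country-specific lengths.

```rust
/// Check the mod-97 check digits of an IBAN (hypothetical helper).
/// Spaces are ignored, letters are treated case-insensitively.
fn iban_checksum_valid(iban: &str) -> bool {
    let cleaned: Vec<char> = iban.chars().filter(|c| !c.is_whitespace()).collect();
    if cleaned.len() < 5 || !cleaned.iter().all(|c| c.is_ascii_alphanumeric()) {
        return false;
    }
    // Move the country code and check digits (first four characters) to the end.
    let rearranged = cleaned[4..].iter().chain(cleaned[..4].iter());
    let mut remainder: u32 = 0;
    for c in rearranged {
        // Digits keep their value; letters map to 10..=35 (A = 10, Z = 35).
        let value = c.to_digit(36).expect("validated as alphanumeric above");
        remainder = if value < 10 {
            (remainder * 10 + value) % 97
        } else {
            (remainder * 100 + value) % 97
        };
    }
    // A valid IBAN leaves a remainder of exactly 1.
    remainder == 1
}
```

Calling `iban_checksum_valid("GB82 WEST 1234 5698 7654 32")` on the commonly used example IBAN returns `true`. Computing the remainder incrementally avoids needing a big-integer type.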
Synchronizer
Makes sure both the correct topics and schemas are set.
Checks whether the replication factor from the config can be used; it takes the
minimum of the number of available nodes and the configured value.
Note that this has only been used on a clean Kafka cluster, and there is currently
no check that topic properties are correct.
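The replication factor rule can be sketched in a few lines of Rust. This is only an illustration of the rule described above, not the synchronizer's actual code; the function and parameter names are hypothetical.

```rust
/// Pick the replication factor to use when creating a topic.
/// `configured` is the desired value from the config and
/// `available_brokers` is the number of nodes currently in the cluster
/// (both names are hypothetical).
fn effective_replication_factor(configured: u16, available_brokers: u16) -> u16 {
    // A topic cannot have more replicas than there are brokers,
    // so the smaller of the two values wins.
    configured.min(available_brokers)
}
```

With a configured factor of 3 on a single-node test cluster, `effective_replication_factor(3, 1)` yields 1, so the same config can be used both locally and on a larger cluster.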
Heartbeat
Just a simple producer for easy debugging.
Uses a simple message with just a long value.
Exposes an nREPL. An nREPL is a network REPL, which can be used to execute code
remotely and get the result back. This is a powerful concept, making it possible to
apply fixes as the code runs, or to solve bugs interactively. With the nREPL the pace at
which messages are sent can be changed.
Command generator
Consumes the heartbeats and generates a command for each received heartbeat.
At first these will be ConfirmAccountCreation commands; as it runs there will be fewer of these. It
randomly creates different kinds of ConfirmMoneyTransfer, which might fail
because they would cause the balance to drop below the limit.
Command handler
Handles the different kinds of commands.
● AccountCreationCommand: generates a new IBAN; if it does not already exist,
creates a balance using the default values; if it does exist, gives back an
AccountCreationFailed.
● ConfirmMoneyTransfer: if the supplied token is correct and there is enough
money, makes the transfer. Updates both to and from if they are ‘open-bank’
IBANs, and creates a BalanceChanged event for each changed balance.
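The money transfer rules can be sketched as follows. This is a simplified in-memory illustration, assuming balances are kept in a map keyed by IBAN and that only ‘open-bank’ IBANs appear in that map. The `Balance` struct, the `TransferResult` enum and `handle_transfer` are hypothetical names; the real handlers use a database and Kafka.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified balance record; amounts are in cents.
#[derive(Debug, Clone, PartialEq)]
struct Balance {
    amount: i64,
    limit: i64, // lowest amount the balance may reach (may be negative)
    token: String,
}

#[derive(Debug, PartialEq)]
enum TransferResult {
    /// IBANs whose balance changed; a BalanceChanged event would be
    /// produced for each of them.
    Confirmed(Vec<String>),
    Failed(String),
}

fn handle_transfer(
    balances: &mut HashMap<String, Balance>,
    from: &str,
    to: &str,
    token: &str,
    amount: i64,
) -> TransferResult {
    let mut changed = Vec::new();
    // Assumption: IBANs missing from the map belong to another bank
    // and are not checked or updated.
    if let Some(source) = balances.get_mut(from) {
        if source.token != token {
            return TransferResult::Failed("invalid token".to_string());
        }
        if source.amount - amount < source.limit {
            return TransferResult::Failed("insufficient funds".to_string());
        }
        source.amount -= amount;
        changed.push(from.to_string());
    }
    if let Some(target) = balances.get_mut(to) {
        target.amount += amount;
        changed.push(to.to_string());
    }
    TransferResult::Confirmed(changed)
}
```

Both checks run before any balance is modified, so a failed transfer leaves the map untouched. A real handler would additionally have to apply both updates in one database transaction.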
GraphQL-endpoint
Exposes a GraphQL endpoint to make it easy to issue commands and get the
results back in the frontend. All services have their own consumer, and share the
producer and the database.
● Transaction service: makes it possible to query or subscribe to balance
changed events.
● Account creation service: used to create an account. Links the username
used to log in with the UUID sent for the account creation, in order to get the
same IBAN back should the user log in at another time.
● Money transfer service: tries to transfer money, and provides feedback.
Frontend
The frontend is built from several parts that all end up in an NginX container to be
served.
● The JavaScript part is built using ClojureScript; an important part is the
re-graph library. For ClojureScript, re-frame is often used, which uses React to
update the DOM depending on a global state. ClojureScript uses the Google
Closure compiler to reduce the size of the resulting JavaScript.
● Bulma is used for the CSS, with just the colors set differently and some
additional animations.
● The output from the tests is added to NginX to make it easily accessible.
Running a test
When the test is run it will do several kinds of transactions that either increase or
lower the money on the balance, in such a way that as much goes in as goes out after
10 runs. It measures the time until the new balance comes in.
During the test the load on the system is increased by using the nREPL of the
heartbeat: increasing the number of heartbeats in turn triggers additional
commands to be processed.
Also during the test, both the CPU and memory usage of parts of the system are
measured using lispyclouds/clj-docker-client.
All the data is written into a file so it can be analyzed later on.
Output of a test
The generated files can be compared to other files to generate graphs.
All the data is combined, and for each point with the same load some statistics are
calculated, most often the mean and the standard error.
For the different values, graphs are generated in the public folder of the frontend so
they can be easily viewed. They are available at the background tab at open-bank.
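As an illustration of the statistics mentioned above, here is a minimal Rust sketch that computes the mean and the standard error of the mean for the measurements taken at one load level. It assumes the sample standard deviation (dividing by n - 1) is used, which the slides do not specify; the function name is hypothetical.

```rust
/// Mean and standard error of the mean for a set of measurements.
/// Returns `None` for fewer than two samples, because the sample
/// standard deviation is undefined in that case.
fn mean_and_standard_error(samples: &[f64]) -> Option<(f64, f64)> {
    let n = samples.len();
    if n < 2 {
        return None;
    }
    let n_f = n as f64;
    let mean = samples.iter().sum::<f64>() / n_f;
    // Sample variance with Bessel's correction (divide by n - 1).
    let variance = samples.iter().map(|x| (x - mean).powi(2)).sum::<f64>() / (n_f - 1.0);
    // Standard error of the mean: s / sqrt(n).
    let standard_error = (variance / n_f).sqrt();
    Some((mean, standard_error))
}
```

For example, latencies collected from the 10 runs at the same load would be passed in as one slice, and the resulting pair gives the point and its error bar in a graph.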
Clojure
Clojure and Kafka
● Rich Hickey is the creator and Benevolent Dictator for Life.
● Runs on the JVM, and has interop with Java.
● Cognitect is the company behind Clojure; it has several products around
Clojure, like Datomic, an elastically scaling transactional database.
● There are multiple recent libraries; besides the consumer and the producer, some
also support streams, the admin client and Avro.
● At the time I started the project the latest Clojure was still Java 6 compatible,
and there was no recent Clojure Kafka client.
● Some fuss with Jackson in combination with other libraries; explicit
Jackson versions were needed to make it work.
Code example producer
Code example start consuming
Kotlin
Kotlin with Spring and Kafka
● Kotlin is closely tied to IntelliJ IDEA.
● Can change Java code to Kotlin automatically.
● A bit more functional than Java, and often has immutable defaults.
● Spring makes it easy to set up and have something working fast.
● Getting the Avro serializers to work was a puzzle; it took a while to find the
right properties for Avro serialization.
● Spring Cloud Stream uses the Kafka Streams API under the hood.
● Easiest is to start with Spring Initializr.
● Make sure to use the kotlin-maven-allopen and kotlin-maven-noarg plugins to
compile.
Code example money transfer
Rust
Rust and Kafka
● Systems programming language with a focus on safety and speed.
● Mozilla was the first investor for Rust and continues to sponsor the work of
the open source project.
● Used by Dropbox in production.
● Two libraries: one that has recently become more active and was bumped to
librdkafka 1.0.0, and another written in pure Rust that has little activity and few features.
● No support for Avro when I started.
● Created a library that uses the Schema Registry to transform bytes to Value and the
other way around, and also to set a schema in the Schema Registry.
● The libraries are more low level than in Java; things like logging have to be set up.
Some examples are available, making it easy.
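To give an idea of what such a conversion involves: messages serialized with the Confluent Schema Registry start with a magic byte 0, followed by the schema id as a 4-byte big-endian integer, followed by the Avro-encoded payload. The sketch below only handles that framing with the standard library. It is not the API of schema_registry_converter, and the function names are hypothetical.

```rust
/// Split a Confluent-framed message into the schema id and the Avro payload.
fn split_confluent_frame(bytes: &[u8]) -> Result<(u32, &[u8]), String> {
    if bytes.len() < 5 {
        return Err("message too short for the 5-byte header".to_string());
    }
    if bytes[0] != 0 {
        return Err(format!("unexpected magic byte {}", bytes[0]));
    }
    // Bytes 1..5 hold the schema id in big-endian order.
    let id = u32::from_be_bytes([bytes[1], bytes[2], bytes[3], bytes[4]]);
    Ok((id, &bytes[5..]))
}

/// Prefix an already Avro-encoded payload with the magic byte and schema id.
fn add_confluent_frame(schema_id: u32, payload: &[u8]) -> Vec<u8> {
    let mut framed = Vec::with_capacity(5 + payload.len());
    framed.push(0);
    framed.extend_from_slice(&schema_id.to_be_bytes());
    framed.extend_from_slice(payload);
    framed
}
```

A consumer would use the extracted id to look up the writer schema in the registry and then decode the remaining bytes with an Avro library; a producer does the reverse after registering or looking up its schema.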
Dockerfile pure rust library
Database update
Code example money transfer
Some results of the 10 runs on TravisCI (2 cpu)
Language                    Clojure   Kotlin   Rust (rdkafka)   Rust (kafka)
Docker image size (MB)          152      206              102              8
Average start (ms)             2988    12878             2222           1929
Max load reached (msg/s)        310      330              260            220
Some graphs, more available at https://open-bank.gklijs.tech/
Latency
● Rust-kafka rises quickly because it only sends one message at a time.
● Rust-rdkafka eventually goes up; it stresses the Kafka broker more than the
JVM languages.
● Both JVM languages are pretty close.
Cpu load Kafka broker
● Rust-kafka causes high CPU load because every message is sent separately.
● Rust-rdkafka eventually goes up; it stresses the Kafka broker more than the
JVM languages, I don’t know why.
● Both JVM languages are pretty close.
Cpu command handler
● Rust-kafka is the lowest; it is pretty simple and uses a bare Docker image.
● Rust-rdkafka only needs slightly more.
● Clojure is pretty close to Rust after the JIT has kicked in.
● The Kotlin JIT seems about as effective, but there is more overhead because of
Spring.
Conclusion, use Rust when:
● Startup time is important, although there are other options for the JVM, such as
GraalVM with Quarkus or Micronaut.
● Memory footprint matters.
● A small Docker image is important.
● Memory safety is important.
But:
● Be sure to test if in your case the broker can keep up.
● Check that what the application needs to do can be done with Rust.
● Development may take a bit longer.
Questions? Code available at open-bank-mark
