
Learning Rust the Hard Way for a Production Kafka + ScyllaDB Pipeline

ScyllaDB
Mar. 7, 2023


  1. Learning Rust the Hard Way for a Production Kafka + ScyllaDB Pipeline Presented by: Peter Corless, Director of Technical Advocacy, ScyllaDB & Alexys Jacob, CTO, Numberly Moderated by: Jared Ruckle, InfoQ Editor
  2. Poll Where are you in your NoSQL adoption?
  3. Poll How much data do you have under management in your transactional database?
  4. Peter Corless 4 Director of Technical Advocacy @ ScyllaDB + Listen to & share user stories + Write blogs & case studies + Play (and design) strategy & roleplaying games + @PeterCorless on Twitter
  5. + InfoWorld 2020 Technology of the Year! + Founded by designers of KVM Hypervisor The Database Built for Gamechangers 5 “ScyllaDB stands apart...It’s the rare product that exceeds my expectations.” – Martin Heller, InfoWorld contributing editor and reviewer “For 99.9% of applications, ScyllaDB delivers all the power a customer will ever need, on workloads that other databases can’t touch – and at a fraction of the cost of an in-memory solution.” – Adrian Bridgewater, Forbes senior contributor + Resolves challenges of legacy NoSQL databases + >5x higher throughput + >20x lower latency + >75% TCO savings + DBaaS/Cloud, Enterprise and Open Source solutions + Proven globally at scale
  6. 6 +400 Gamechangers Leverage ScyllaDB Seamless experiences across content + devices Fast computation of flight pricing Corporate fleet management Real-time analytics 2,000,000 SKU e-commerce management Video recommendation management Threat intelligence service using JanusGraph Real time fraud detection across 6M transactions/day Uber scale, mission critical chat & messaging app Network security threat detection Power ~50M X1 DVRs with billions of reqs/day Precision healthcare via Edison AI Inventory hub for retail operations Property listings and updates Unified ML feature store across the business Cryptocurrency exchange app Geography-based recommendations Global operations: Avon, The Body Shop + more Predictable performance for on sale surges GPS-based exercise tracking Serving dynamic live streams at scale Powering India's top social media platform Personalized advertising to players Distribution of game assets in Unreal Engine Make marketing more relevant, effective and measurable
  7. Alexys Jacob 7 @ultrabug + CTO, Numberly + ScyllaDB awarded Open Source & University contributor + Open Source author & contributor + Apache Avro, Apache Airflow, MongoDB, MkDocs… + Tech speaker & writer + Gentoo Linux developer + Python Software Foundation contributing member Speaker Photo
  8. Numberly, Marketing Technologist 8 Digital native, Media, CRM and data have been at the heart of our business for the past 20 years. A data-driven approach to help impact engagement and sales and turn your marketing spend into a profitable investment. Optimizing your ROI is our priority, both strategic and operational. Numberly is a group with solid financial strength. The company is listed on the stock exchange and operates globally in 53 countries with a team of 33 nationalities. The consistency of our CSR commitments, for over 20 years, means that our group is CSR & gender equitable by design. We are convinced that parity is a key factor of strong performance and success. The recognition and loyalty of our customers is proof of this. Internationally recognized technological expertise, tool agnostic: activation on our tools (CRM, Numberly trading desk, CDP), expertise on third-party tools. R&D investments of up to 10% of our turnover to maximize your performance. The performance quality that we deliver to our customers is manifested by the range of awards received for the projects we have helped put in place. More than 500 employees brought together by a “marketing & tech mindset”. A focus on data and the quality of execution, and a flexible and pragmatic approach. We pass on our passion and our know-how to our clients' teams, but also our commitment in the ecosystem to defending the Open Internet and European digital sovereignty. Digital native & Data driven Robust & International Committed & Responsible Passionate & Collaborative Tech Experts & Agnostic Innovative & Awarded Paris Amsterdam New York Dubai Montréal London Brussels Tel Aviv Lyon
  9. Agenda + The thought process to move from Python to Rust + Context, promises, arguments and decision + Learning Rust the hard way + All the stack components I had to work with in Rust + Tips, Open Source contributions and code samples + What is worth it? + Graphs, production numbers + Personal notes 9
  10. Choosing Rust over Python 10
  11. At Numberly, we move and process (a lot of) data using Kafka streams and pipelines that are enriched using ScyllaDB. processor app processor app Project context at Numberly ScyllaDB processor app raw data enriched data enriched data enriched data client app partner API business app 11
  12. processor app processor app Pipeline reliability = latency + resilience Scylla processor app raw data enriched data enriched data enriched data client app partner API business app If a processor or ScyllaDB is slow or fails, our business, partners & clients are at risk. 12
  13. A major change in our pipeline processors had to be undertaken, giving us the opportunity to redesign them entirely. The (rusted) opportunity ScyllaDB processor app raw data enriched data enriched data enriched data client app partner API business app 13
  14. “Hey, why not rewrite those 3 Python processor apps into 1 Rust app?” 14
  15. The (never tried before) Rust promises 15 A language empowering everyone to build reliable and efficient software. + Secure + Memory and thread safety as first class citizens + No runtime or garbage collector + Easy to deploy + Compiled binaries are self-sufficient + No compromises + Strongly and statically typed + Exhaustivity is mandatory + Built-in error management syntax and primitives + Plays well with Python + PyO3 can be used to run Rust from Python (or vice versa)
  16. Efficient software != Faster software + “Fast” meanings vary depending on your objectives. + Fast to develop? + Fast to maintain? + Fast to prototype? + Fast to process data? + Fast to cover all failure cases? “Selecting a programming language can be a form of premature optimization” 16
  17. Efficient software != Faster software + “Fast” meanings vary depending on your objectives. + Fast to develop? Python is way faster + did that for 15 years + Fast to maintain? Very few people at Numberly know Rust + Fast to prototype? No, code must be complete to compile and run + Fast to process data? Sure: to prove it, measure it + Fast to cover all failure cases? Definitely: mandatory exhaustivity + error handling primitives “I did not choose Rust to be “faster”. Our Python code was fast enough to deliver our pipeline processing.” 17
  18. Innovation cannot exist if you aren’t willing to lose time. The question is knowing when, and on what project. 18
  19. The Reliable software paradigms + What makes me slow will make me stronger. + Low level paradigms (ownership, borrowing, lifetimes). + Strong type safety. + Compilation (debug, release). + Dependency management. + Exhaustive pattern matching. + Error management primitives (Result). + Explicit return values (Option). 19
  20. The Reliable software paradigms + What makes me slow will make me stronger. + Low level paradigms (ownership, borrowing, lifetimes). If it compiles, it’s safe + Strong type safety. Predictable, readable, maintainable + Compilation (debug, release). Compiler is very helpful compared to random Python exceptions + Dependency management. Finally something looking sane vs the Python mess + Exhaustive pattern matching. Confidence that you’re not forgetting something + Error management primitives (Result). Handle failure right from the language syntax + Explicit return values (Option). Clear separation between Some(value) and None “I chose Rust because it provided me with the programming paradigms at the right abstraction level that I needed to finally understand and better explain the reliability and performance of my application.” 20
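As a minimal illustration of these paradigms (not code from the talk), the sketch below shows exhaustive pattern matching, Option for explicit absence, and Result for built-in error handling. All names are hypothetical stand-ins for a pipeline enrichment step:

```rust
// Hypothetical message source in a pipeline processor.
#[derive(Debug, PartialEq)]
enum Source {
    Client,
    Partner,
    Business,
}

// Option makes "no value" explicit: a lookup returns Some(...) or None,
// and the caller cannot silently ignore the None case.
fn datacenter_for(source: &Source) -> Option<&'static str> {
    // The compiler rejects this match if a Source variant is forgotten:
    // that is the mandatory exhaustivity the slide mentions.
    match source {
        Source::Client => Some("dc-paris"),
        Source::Partner => Some("dc-amsterdam"),
        Source::Business => None, // handled locally, no remote datacenter
    }
}

// Result forces failure handling through the language syntax itself.
fn parse_user_id(raw: &str) -> Result<u64, String> {
    raw.trim()
        .parse::<u64>()
        .map_err(|e| format!("bad user id {raw:?}: {e}"))
}

fn main() {
    assert_eq!(datacenter_for(&Source::Client), Some("dc-paris"));
    match parse_user_id("42") {
        Ok(id) => println!("routing user {id}"),
        Err(e) => eprintln!("dropping message: {e}"),
    }
}
```

The point is not the toy logic but that every absent value and every failure path is visible in the type signatures, which is what makes the behavior predictable and reviewable.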
  21. Learning Rust the hard way 21
  22. Production is not a Hello World + Learning the syntax and handling errors everywhere + Confluent Kafka + Schema Registry + Avro + Asynchronous latency-optimized design + ScyllaDB multi-datacenter + MongoDB + Kubernetes deployment + Prometheus exporter + Grafana dashboarding + Sentry Scylla processor app Confluent Kafka 22
  23. Confluent Kafka Schema Registry + Confluent Schema Registry framing breaks vanilla Apache Avro deserialization. + Consider using Gerard Klijs’ schema_registry_converter crate (v3+) + I discovered performance problems, which we worked on together and which have since been addressed! + Latency-overhead-free manual approach: 23
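The “manual approach” the slide alludes to boils down to stripping Confluent’s 5-byte wire-format framing (one magic byte, then a big-endian 4-byte schema id) before handing the remaining bytes to a vanilla Avro deserializer. A minimal std-only sketch, with illustrative names (this is the shape of the technique, not the talk’s actual code):

```rust
// Split a Confluent-framed Kafka message payload into (schema_id, avro bytes).
// Wire format: [0x00 magic byte][4-byte big-endian schema id][Avro datum].
fn split_confluent_frame(payload: &[u8]) -> Result<(u32, &[u8]), String> {
    if payload.len() < 5 {
        return Err("payload shorter than the 5-byte Confluent header".into());
    }
    if payload[0] != 0 {
        return Err(format!("unexpected magic byte {}", payload[0]));
    }
    let schema_id = u32::from_be_bytes([payload[1], payload[2], payload[3], payload[4]]);
    Ok((schema_id, &payload[5..]))
}

fn main() {
    // Fake message: magic byte, schema id 42, then the raw Avro datum bytes.
    let msg = [0u8, 0, 0, 0, 42, b'a', b'v', b'r', b'o'];
    let (schema_id, avro_bytes) = split_confluent_frame(&msg).unwrap();
    assert_eq!(schema_id, 42);
    assert_eq!(avro_bytes, b"avro");
    // From here, avro_bytes would be decoded with the writer schema looked
    // up (and cached) for schema_id, e.g. via the apache-avro crate.
}
```

Doing this split yourself avoids a registry round-trip per message once the schema for a given id is cached locally.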
  24. Apache Avro Rust was broken! + Crate apache-avro (formerly avro-rs) was donated to Apache Avro without an appointed committer. + Deserialization of complex schemas was broken... + I contributed fixes to Apache Avro (AVRO-3232+3240) + Now merged thanks to Martin Grigorov! + Make sure to use apache-avro v0.14+ + Rust compiler optimizations give a hell of a boost! + Deserializing Avro is faster than JSON! 24
  25. + Tricks to make your Kafka consumer strategy more efficient. + Deserialize your consumer messages in the consumer loop, not in green-thread tasks + Spawning a task has performance costs + Control your green-thread parallelism + Defer to green-thread tasks when I/O starts to be required task / msg Asynchronous patterns to optimize latency Kafka consumer + avro deserializer raw data task / msg task / msg task / msg task / msg Scylla enriched data 25
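The pattern above can be sketched in a self-contained way. The talk uses the tokio runtime with green-thread tasks; this illustration substitutes a std thread and a bounded channel to show the same shape: the CPU-bound decode happens inline in the consumer loop, while the I/O-bound enrichment is deferred behind a capped amount of in-flight work. The decode/enrich stand-ins and all names are illustrative:

```rust
use std::sync::mpsc::sync_channel;
use std::thread;

// Stand-in for Avro deserialization: CPU-bound, so it runs inline in the
// consumer loop rather than being spawned as a task.
fn decode(raw: &str) -> String {
    raw.to_uppercase()
}

fn main() {
    // The bounded channel caps in-flight messages at 16, playing the role
    // of controlled green-thread parallelism: the consumer loop blocks
    // instead of spawning unbounded work when downstream I/O is slow.
    let (tx, rx) = sync_channel::<String>(16);

    let worker = thread::spawn(move || {
        let mut enriched = Vec::new();
        for msg in rx {
            // Stand-in for the I/O-bound ScyllaDB lookup + insert.
            enriched.push(format!("{msg}!"));
        }
        enriched
    });

    for raw in ["a", "b", "c"] {
        let decoded = decode(raw); // inline in the loop, not spawned
        tx.send(decoded).unwrap(); // hand off only once I/O is required
    }
    drop(tx); // close the channel so the worker drains and exits

    let enriched = worker.join().unwrap();
    assert_eq!(enriched, vec!["A!", "B!", "C!"]);
}
```

In the real application the worker side would be a bounded set of tokio tasks rather than one thread, but the division of labor is the same: decode eagerly, bound the concurrent I/O.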
  26. Absorbing tail latency spikes with parallelism [Grafana graphs: a x16 ScyllaDB latency spike is absorbed with only a x2 rise in Kafka consuming latency as the parallelism load increases] 26
  27. Scylla Rust (shard-aware) driver + The scylla-rust-driver crate is production-ready. + Use a CachingSession to automatically cache your prepared statements + Beware: prepared queries are NOT paged, use paged queries with execute_iter() instead! + Use the latest optimized version 0.7.0! 27
  28. Exporting metrics properly for Prometheus + Effectively measuring latencies down to microseconds. + Fine tune your histogram buckets to match your expected latencies! ... 28
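To make the bucket-tuning advice concrete, here is a sketch of computing exponential histogram bounds spanning the 50µs-to-15s range the speaker notes mention; it mirrors what client libraries’ exponential-buckets helpers compute. The function and values are illustrative, not the talk’s actual metrics code:

```rust
// Compute `count` exponentially spaced histogram bucket upper bounds,
// starting at `start` and multiplying by `factor` each step (seconds).
fn exponential_buckets(start: f64, factor: f64, count: usize) -> Vec<f64> {
    (0..count).map(|i| start * factor.powi(i as i32)).collect()
}

fn main() {
    // 50µs doubled 18 times reaches ~13.1s; with Prometheus's implicit
    // +Inf bucket on top, this covers the 15s write timeout ceiling while
    // keeping microsecond-level resolution at the low end.
    let buckets = exponential_buckets(50e-6, 2.0, 19);
    assert_eq!(buckets.len(), 19);
    assert!((buckets[0] - 50e-6).abs() < 1e-12);
    assert!(*buckets.last().unwrap() > 10.0);
    println!("{buckets:?}");
}
```

The gotcha the slide warns about: if the bounds do not bracket your real latency distribution, `histogram_quantile` in Grafana will clamp or smear the quantiles, so pick the range from measured expectations, not defaults.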
  29. Grafana dashboarding + Graph your precious metrics right! + ScyllaDB prepared statement cache size + Query and throughput rates + Kafka commits occurrence + Errors by type + Kubernetes pod memory + ... + Visualizing Prom Histograms max by (environment)(histogram_quantile(0.50, processing_latency_seconds_bucket{...})) 29
  30. Was it worth it? 30
  31. Did I really lose time because of Rust? + I spent more time analyzing the latency impacts of code patterns and drivers’ options than struggling with Rust syntax. + Key figures for this application: + Kafka consumer max throughput with processing? 200K msg/s on 20 partitions + Avro deserialization P50 latency? 75µs + Scylla SELECT P50 latency on 1.5B+ rows tables? 250µs + Scylla INSERT P50 latency on 1.5B+ rows tables? 660µs 31
  32. It went better than expected + Rust crates ecosystem is mature, similar to Python Package Index + 3 Python apps totalling 54 pods replaced by 1 Rust app totalling 20 pods + We helped & worked on making the scylla-rust-driver even better + Token aware policy can fallback to non-replicas for higher availability + Optimized partition key calculations for prepared statements + Expose partition key sharding to create shard-aware applications (#ScyllaSummit2023) + More to come! + This feels like the most reliable and efficient software I ever wrote! 32
  33. - Numberly’s journey to choosing ScyllaDB - Evaluating ScyllaDB for production 1/2 - Evaluating ScyllaDB for production 2/2 - Numberly’s use case: ScyllaDB to replace MongoDB+Hive (Scylla Summit 2018) - Numberly’s experience: MongoDB vs Scylla (Scylla Summit 2019) - Numberly’s contributions: Faster ScyllaDB Shard-Aware drivers (Scylla Summit 2021) - Scylla Summit 2023: Building a 100% shard aware application using Rust And of course: - ScyllaDB University Learning More 33
  34. Join our enthusiastic teams and help us face all our challenges with an innovative, caring and community-driven mindset! - Data Engineering & Science - Software Engineering - Infrastructure We are remote friendly, so wherever you are, let’s have a chat! alexys@numberly.com Numberly is hiring 34
  35. Watch now on-demand at scylladb.com/summit 35 Questions?
  36. Thank you for joining us today. @scylladb scylladb/ slack.scylladb.com @scylladb company/scylladb/ scylladb/

Editor's Notes

  1. Welcome, everyone! My name is Peter Corless, Director of Technical Advocacy at ScyllaDB. I’ll be your host for today’s webinar -- “Learning Rust the Hard Way for a Production Kafka + ScyllaDB Pipeline.” Today you will learn about how Numberly moved key parts of their application code to Rust to optimize their real-time operational performance.
  2. Before we begin we are pushing a quick poll question. Where are you in your NoSQL adoption? I currently use ScyllaDB I currently use another NoSQL database I am currently evaluating NoSQL I am interested in learning more about ScyllaDB None of the above Ok, thanks for those responses. Let’s get started.
  3. Hello again everyone! I just want to take a moment for a quick audience poll. For a sense of scale, we’d like to understand: How much data do you have under management in your own transactional database systems? Less than 1 terabyte 1 to 10 terabytes 10-100 terabytes >100 terabytes Pick the answer that best matches your current data set. We’ll leave the poll up for a bit for you to answer.
  4. My name is Peter Corless, Director of Technical Advocacy at ScyllaDB. I listen to and help share user success stories.
  5. For those of you who are not familiar with ScyllaDB yet, it is the monstrously fast and scalable NoSQL database built for gamechangers. Created by the founders of the KVM hypervisor, ScyllaDB was conceived with key design characteristics to power this next tech cycle and resolve many of the challenges posed when operating distributed systems at scale. In particular, ScyllaDB is a high throughput and low latency distributed NoSQL database. Increasing database throughput (operations/second), improving P99 latency, and reducing total cost are principle drivers behind teams like yours for selecting ScyllaDB. In 2020, ScyllaDB received Infoworld’s prestigious Technology of the Year award, and it was truly an honor to be among fellow recipients like Tableau, Databricks, and Snowflake. Recently we launched ScyllaDB 5 with several new innovative features, and we have an on-demand webinar covering what’s new which I highly encourage you to watch… With such consistent innovation the adoption of our database technology has grown to over 400 key players worldwide…
  6. “Many of you will recognize some of the companies among the selection pictured here, such as Starbucks who leverage ScyllaDB for inventory management, Zillow for real-time property listing and updates, and Comcast Xfinity who power all DVR scheduling with ScyllaDB.” As you can see, ScyllaDB is used across many different industries and for entirely different types of use cases. Chat applications, IoT, social networking, e-commerce, fraud detection, and security are some of the examples pictured in this slide. More often than not, your company probably has a use case that is a perfect fit for ScyllaDB, and it may be that you don’t know it yet! If you are interested in knowing how we can help you more, feel free to engage with us! To summarize, if you care about having low latencies while having high throughput for your application, we are certain that ScyllaDB is a good fit for you.
  7. Without any further ado, I have the pleasure of introducing to you our speaker today: Alexys Jacob, CTO at Numberly, who is known to the open source community as “Ultrabug.” A frequent ScyllaDB open source & ScyllaDB University contributor.
  8. Numberly is a digital data marketing technologist and expert helping brands connect and engage with their customers using all digital channels available. We are proud to be an independent company with solid, internationally recognized expertise in both marketing and technology; we just celebrated our 23rd anniversary.
  9. I’ve been doing Python in production for more than 15 years now. That’s as much a marker of my advancing age as it is a shock to my colleagues that I could even consider coding in a language other than Python. So what could trigger such a radical change in me?
  10. As a data company, we operate on a lot of data that is fast moving using an event driven approach that drives our technological choices towards platforms that allow us to process and react to stimulus as close to real time as possible We combine Kafka and Scylla extensively on streams and specialized pipeline applications that we’ll call data processors here Each of those pipeline data processor applications prepare and enrich the incoming data so that it is useful to the downstream business / partner or client applications
  11. The relevance of a data driven decision is at its best when it’s close to the event’s time of occurrence which means that availability and latency are business critical to us Those data processor apps, kafka, and of course Scylla can’t fail, if they do we get angry partners and clients (clic) Latency and resilience are thus the pillars upon which we build our business reliable platforms
  12. The data industry and ecosystems are always changing. Last fall, we had to adapt three of our most demanding data processors written in Python. Those processor applications had been doing the job for more than 5 years; they were battle tested and trustworthy. As you know, I’m not a low-level programmer, as I always felt C and C++ were cumbersome and useless for my needs. But I had been following Rust’s maturation for a while: I was curious and had the feeling that it could find its place in between Python and C++. So when this opportunity came, I went to my colleagues and told them
  13. Hey, why not rewrite those 3 Python applications that we know work very well into one Rust application, in a language we don’t even know? (clic) After the shock, they asked for a rationale rather than just a crazy idea.
  14. Rust makes promises that more and more people seem to agree with. It is supposed to be… (read bullets) But furthermore, their marketing motto speaks to the marketer inside me (read) (Clic) That’s me! (Clic) That’s what my new processor app needs! Careful attendees would ask me: hey Alexys, you did not mention speed in that list. Isn’t Rust supposed to be super fast? Well, their motto mentions efficiency, and it’s not the same as speed.
  15. Efficient software does not always mean faster software. Brett Cannon, a Python core developer, argues that selecting a programming language for being faster on paper is a form of premature optimization. (clic) I agree with him in the sense that the word Fast conveys different meanings depending on your objectives. In my opinion, Rust can be said to be faster as a consequence of being Efficient, which does not cover all the items on the list here. Let’s demonstrate that in my context.
  16. (read)… As we can see in my case, choosing Rust over Python will mean that I will definitely lose time (clic + read) So why would I want to lose time? The short answer is “innovation”
  17. (read) So the gist of my decision was that I was sure this project was the right one at the right time to foster innovation at Numberly
  18. Now what will I gain from losing time other than the pain of using semicolons and brackets everywhere? Supposedly a more reliable software thanks to Rust unique design and paradigms This is to say that what makes me slow is also an opportunity to make my software stronger
  19. (read)... (clic) (read)
  20. Here is an overview of all the aspects and all the technological stacks that I had to deal with (read)... Since our time is limited I will skip through this list to highlight the most insightful parts Let’s start with the first wall I hit right from the start: consuming messages from Kafka
  21. We use Confluent Kafka Community edition with its Schema Registry to structure our Avro encoded messages in our Kafka topics. The bad news is that Confluent Schema Registry adds a magic byte to Kafka message payloads, which breaks vanilla Apache Avro schema deserialization. Luckily for me, Gerard Klijs has worked on a crate to address this problem. We worked together on improving its performance so that it’s production ready. Before we fixed this, I used the manual approach shown here to decode Avro messages myself with respect to their schema.
  22. Then I hit the second wall when, even though reading the Avro payload was possible, I still could not deserialize it… As a total Rust newbie, I blamed myself for days before even daring to suspect Apache Avro was the culprit. I eventually read the Apache Avro source code and discovered that it was broken for complex schemas like ours. Is anyone in the world using Rust Apache Avro in production yet? So here I am contributing fixes to the Apache Avro Rust implementation, which eventually got merged three months later in January thanks to its newly appointed committer Martin. Anyway, another unexpected fact that Rust allowed me to prove is that deserializing Avro is faster than deserializing JSON in our case of rich and complex data structures. I say unexpected, but my colleague Othmane was expecting it, to be fair; I was happy to finally prove him right!
  23. Once I was finally able to consume messages from Kafka, I started looking at the best pattern to process them. I turned to the tokio asynchronous runtime, which was very intuitive coming from Python asyncio. I played a lot with various code patterns to make consuming messages from Kafka latency stable and reliable. One of the interesting findings was to not defer the decoding of Avro messages to a green-thread but to do it right in the consumer loop. Indeed, since deserialization is a CPU bound operation, it benefits from not being cooperative with other green-thread tasks. Similarly, allowing and controlling your parallelism will help stabilize your I/O bound operations; let’s see a real example of that.
  24. Once deserialization is done, deferring the rest of my processing, which is I/O bound, to green-threads helped absorb tail latencies without affecting my Kafka consuming speed. The Grafana dashboard you see here shows that around 9:00 something made Scylla slower than usual: Scylla SELECT and INSERT P95 latencies went up by a factor of 16. At the same time, you can see a bump in my parallelism load as I started having more concurrent green-threads processing messages. But it only hit my Kafka consuming latency by a factor of 2 at P95, effectively absorbing the tail latencies due to this ephemeral overload in Scylla. This is the typical example of something that was harder to pinpoint and demonstrate in Python but became clear with Rust.
  25. Now to our dear Scylla. I found the Scylla Rust driver to be intuitive and well featured; congratulations to the team, which is also very helpful on their dedicated channel on the Scylla Slack service, join us there! The new CachingSession is very handy to cache your prepared statements so you don’t have to do it yourself like I did at first. (read) Beware! I’m showcasing a code example of a production connection function to Scylla, using SSL, multi-datacenter awareness and a caching session. Speaking of multi-datacenter awareness, I hit a bug in the token aware load balancing that promptly got fixed by the team and released in 0.4.2.
  26. Even if it’s described late in the presentation, Prometheus is actually the first thing I set up on my application so that I could measure the latency and throughput impacts of all the experiments I did. For a test to be meaningful, those measurements must be made right and then graphed right. So here is an example of how I measure Scylla query insertion latency. The first and important gotcha is to set up your histogram buckets correctly for your expected graphing finesse. Here I expect Scylla latency to vary between 50µs and 15s, which is the maximum server timeout I’m allowing for writes. Then I use it like this: I start a timer on the histogram and record its duration on success, and drop it on failure so that my metrics are not polluted by possible errors.
  27. Once you measure right, you need to visualize and graph right. I created a detailed and meaningful Grafana dashboard so I could see and compare the results of my Rust application experiments; best time invested ever. Make sure you graph as many things as possible (read). There are gotchas to graphing Prometheus histograms, so I’m linking a great article that the folks at Grafana wrote on how to visualize them right in Grafana.
  28. The syntax was surprisingly simple and intuitive to adopt even coming from Python I absolutely failed to resist the temptation of testing and analyzing everything at a lower level, it was an unexpected new joy for me So in the end most of my time was spent on testing, graphing, analyzing and trying to come up with a decent and insightful explanation This surely does not look like wasted time to me! For the number hungry of you in the audience, here are some numbers taken from the application Kafka consumer max throughput with processing? 200K msg/s on 20 partitions Avro deserialization P50 latency? 75µs Scylla SELECT P50 latency on 1.5B+ rows tables? 250µs Scylla INSERT P50 latency on 1.5B+ rows tables? 2ms
  29. It went way better than expected (read)... Even if it was my first Rust application, I felt confident during the development process which transformed into confidence in a predictable and resilient software. After months of production, the new Rust pipeline processor proves to be very stable and resilient. Sentry is bored. Rust promises are living up to expectations!
  30. I selected a few articles and videos of Numberly’s experience if you want more material to deepen your understanding or widen your scope of knowledge around ScyllaDB database and ecosystem And of course, make sure to check out the excellent content from ScyllaDB University
  31. COME UP 2-3 SEED QUESTIONS FOR Q/A ORGANIC QUESTIONS FROM FIRST PRESENTATION Which resources did you use to learn rust? Great presentation, thanks! What are some resources you recommend to get started on Rust? How did you find the ramp up time for others on your team in their path to becoming proficient in rust? Has it been difficult not having a Rust expert on your team? What challenges did you encounter and any learnings? POSSIBLE SEED QUESTIONS – PROVIDED TO US FROM INFOQ TEAM You mention a few points about introducing new technologies thoughtfully. Clearly, you were convinced that this scenario was the right project and the timing was right. Did you have to go and prove that to skeptical leaders and developers? If so, how did you convince them? Are there any examples that you think would help folks in the audience make a similar case? Similarly, under "was it worth it?" you share some great personal feedback. Is it fair to say that the application telemetry indicated better performance, reliability, and stability as a result? Was the end user experience remarkably better? Is the team able to ship or learn faster as a result? What would the call to action be for folks, as a result of these learnings? To re-examine technology choices for certain scenarios? If so, what are those scenarios? As a side note, MongoDB is indeed still used in this pipeline as write-only output database which another platform still depends on, we have plans to replace it with Scylla in the future
  32. Thank you all very much for attending today. In due time, you will find this presentation available on the InfoQ and ScyllaDB website for on-demand viewing. If you would like to weigh in on what we present in the future, please Contact Us, either via the form on our website, or on Twitter. We’d love to hear your ideas. For now, on behalf of ________ and myself, and all of us at ScyllaDB, enjoy the rest of your day.