Learning Rust the Hard Way for a Production Kafka + ScyllaDB Pipeline
Presented by: Peter Corless, Director of Technical Advocacy, ScyllaDB
& Alexys Jacob, CTO, Numberly
Moderated by: Jared Ruckle, InfoQ Editor
Poll
Where are you in your NoSQL adoption?
Poll
How much data do you have under management in your transactional database?
Peter Corless
Director of Technical Advocacy @ ScyllaDB
+ Listen to & share user stories
+ Write blogs & case studies
+ Play (and design) strategy & roleplaying games
+ @PeterCorless on Twitter
The Database Built for Gamechangers
+ InfoWorld 2020 Technology of the Year!
+ Founded by designers of KVM Hypervisor
“ScyllaDB stands apart...It’s the rare product
that exceeds my expectations.”
– Martin Heller, InfoWorld contributing editor and reviewer
“For 99.9% of applications, ScyllaDB delivers all the power
a customer will ever need, on workloads that other
databases can’t touch – and at a fraction of the cost of an
in-memory solution.”
– Adrian Bridgewater, Forbes senior contributor
+ Resolves challenges of legacy NoSQL databases
+ >5x higher throughput
+ >20x lower latency
+ >75% TCO savings
+ DBaaS/Cloud, Enterprise and Open Source solutions
+ Proven globally at scale
400+ Gamechangers Leverage ScyllaDB
+ Seamless experiences across content + devices
+ Fast computation of flight pricing
+ Corporate fleet management
+ Real-time analytics
+ 2,000,000 SKU e-commerce management
+ Video recommendation management
+ Threat intelligence service using JanusGraph
+ Real-time fraud detection across 6M transactions/day
+ Uber scale, mission critical chat & messaging app
+ Network security threat detection
+ Power ~50M X1 DVRs with billions of reqs/day
+ Precision healthcare via Edison AI
+ Inventory hub for retail operations
+ Property listings and updates
+ Unified ML feature store across the business
+ Cryptocurrency exchange app
+ Geography-based recommendations
+ Global operations: Avon, Body Shop + more
+ Predictable performance for on-sale surges
+ GPS-based exercise tracking
+ Serving dynamic live streams at scale
+ Powering India's top social media platform
+ Personalized advertising to players
+ Distribution of game assets in Unreal Engine
+ Make marketing more relevant, effective and measurable
Alexys Jacob
@ultrabug
+ CTO, Numberly
+ ScyllaDB awarded Open Source & University contributor
+ Open Source author & contributor
+ Apache Avro, Apache Airflow, MongoDB, MkDocs…
+ Tech speaker & writer
+ Gentoo Linux developer
+ Python Software Foundation contributing member
Numberly, Marketing Technologist
Digital native & Data driven
Digital native: media, CRM and data have been at the heart of our business for the past 20 years. A data-driven approach to help impact engagement and sales and turn your marketing spend into a profitable investment. Optimizing your ROI is our priority, both strategic and operational.
Robust & International
Numberly is a group with solid financial strength. The company is listed on the stock exchange, and operates globally in 53 countries with a team of 33 nationalities.
Committed & Responsible
The consistency of our CSR commitments, for over 20 years, means that our group is CSR & gender equitable by design. We are convinced that parity is a key factor of strong performance and success. The recognition and loyalty of our customers is proof of this.
Tech Experts & Agnostic
Internationally recognized technological expertise, and tool agnostic: activation on our tools (CRM, Numberly trading desk, CDP), expertise on third-party tools.
Innovative & Awarded
R&D investments of up to 10% of our turnover to maximize your performance. The performance quality that we deliver to our customers is manifested by the range of awards received for the projects we have helped put in place.
Passionate & Collaborative
More than 500 employees brought together by a “marketing & tech mindset”. A focus on data and the quality of execution, and a flexible and pragmatic approach. We pass on our passion and our know-how to our clients' teams, but also our commitment in the ecosystem to defend the Open Internet and European digital sovereignty.
Locations: Paris, Amsterdam, New York, Dubai, Montréal, London, Brussels, Tel Aviv, Lyon
Agenda
+ The thought process to move from Python to Rust
+ Context, promises, arguments and decision
+ Learning Rust the hard way
+ All the stack components I had to work with in Rust
+ Tips, Open Source contributions and code samples
+ What is worth it?
+ Graphs, production numbers
+ Personal notes
Choosing Rust over Python
Project context at Numberly
At Numberly, we move and process (a lot of) data using Kafka streams and pipelines that are enriched using ScyllaDB.
[Diagram: raw data enters three processor apps backed by ScyllaDB; each emits enriched data, consumed by a client app, a partner API and a business app]
Pipeline reliability = latency + resilience
[Diagram: raw data, three processor apps backed by ScyllaDB, enriched data for the client app, partner API and business app]
If a processor or ScyllaDB is slow or fails, our business, partners & clients are at risk.
The (rusted) opportunity
A major change in our pipeline processors had to be undertaken, giving us the opportunity to redesign them entirely.
[Diagram: raw data, a single processor app backed by ScyllaDB, enriched data for the client app, partner API and business app]
“Hey, why not rewrite
those 3 Python processor apps
into 1 Rust app?”
The (never tried before) Rust promises
A language empowering everyone to build reliable and efficient software.
+ Secure
+ Memory and thread safety as first class citizens
+ No runtime or garbage collector
+ Easy to deploy
+ Compiled binaries are self-sufficient
+ No compromises
+ Strongly and statically typed
+ Exhaustiveness is mandatory
+ Built-in error management syntax and primitives
+ Plays well with Python
+ PyO3 can be used to run Rust from Python (or the reverse)
Efficient software != Faster software
“Selecting a programming language can be a form of premature optimization.”
+ “Fast” meanings vary depending on your objectives.
+ Fast to develop? Python is way faster, and I did that for 15 years
+ Fast to maintain? Very few people at Numberly know Rust
+ Fast to prototype? No, code must be complete to compile and run
+ Fast to process data? Sure: to prove it, measure it
+ Fast to cover all failure cases? Definitely: mandatory exhaustiveness + error handling primitives
“I did not choose Rust to be ‘faster’. Our Python code was fast enough to deliver its pipeline processing.”
Innovation cannot exist if you don’t accept losing time.
The question is knowing when, and on what project.
The Reliable software paradigms
+ What makes me slow will make me stronger.
+ Low level paradigms (ownership, borrowing, lifetimes): if it compiles, it’s safe
+ Strong type safety: predictable, readable, maintainable
+ Compilation (debug, release): the compiler is very helpful compared to random Python exceptions
+ Dependency management: finally something looking sane vs the Python mess
+ Exhaustive pattern matching: confidence that you’re not forgetting something
+ Error management primitives (Result): handle failure right from the language syntax
+ Explicit return values (Option): clear separation between Some(value) and None
“I chose Rust because it provided me with the programming paradigms at the right abstraction level that I needed to finally understand and better explain the reliability and performance of my application.”
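The paradigms listed above can be illustrated with a small sketch. It is not taken from the talk: the `EnrichError` type, the `partner_name` lookup, the `enrich` function and the partner names are all hypothetical, chosen only to show Option, Result with `?`, and exhaustive matching side by side.

```rust
/// Hypothetical failure cases for enriching one message (illustrative names).
#[derive(Debug, PartialEq)]
enum EnrichError {
    MalformedPayload,
    UnknownPartner(u32),
}

/// Option: the lookup either finds a partner name or explicitly returns None.
fn partner_name(partner_id: u32) -> Option<&'static str> {
    match partner_id {
        1 => Some("acme"),
        2 => Some("globex"),
        _ => None,
    }
}

/// Result: failure is part of the signature, and `?` propagates it to the caller.
fn enrich(raw: &str) -> Result<String, EnrichError> {
    // Expect "<partner_id>:<payload>"; anything else becomes a typed error.
    let (id, payload) = raw.split_once(':').ok_or(EnrichError::MalformedPayload)?;
    let id: u32 = id.parse().map_err(|_| EnrichError::MalformedPayload)?;
    let name = partner_name(id).ok_or(EnrichError::UnknownPartner(id))?;
    Ok(format!("{name}:{payload}"))
}

/// Exhaustive matching: removing any arm below is a compile error.
fn describe(outcome: &Result<String, EnrichError>) -> String {
    match outcome {
        Ok(enriched) => format!("ok {enriched}"),
        Err(EnrichError::MalformedPayload) => "drop: malformed".to_string(),
        Err(EnrichError::UnknownPartner(id)) => format!("drop: unknown partner {id}"),
    }
}
```

Adding a third variant to `EnrichError` would make `describe` fail to compile until the new case is handled, which is the "confidence that you're not forgetting something" the slide refers to.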
Learning Rust the hard way
Production is not a Hello World
+ Learning the syntax and handling errors everywhere
+ Confluent Kafka + Schema Registry + Avro
+ Asynchronous latency-optimized design
+ ScyllaDB multi-datacenter
+ MongoDB
+ Kubernetes deployment
+ Prometheus exporter
+ Grafana dashboarding
+ Sentry
[Diagram: Confluent Kafka, processor app, ScyllaDB]
Confluent Kafka Schema Registry
+ Confluent Schema Registry breaks vanilla Apache Avro deserialization: messages carry a Confluent-specific header before the Avro payload.
+ Consider using Gerard Klijs’ schema_registry_converter crate (v3+)
+ I discovered performance problems, which we worked on and which have since been addressed!
+ Latency-overhead-free manual approach:
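The code shown on the original slide did not survive extraction, so what follows is not the author's code. It is a minimal sketch of what a manual approach could look like, assuming the standard Confluent wire format (one magic byte set to 0, a 4-byte big-endian schema ID, then the Avro-encoded body). The `split_confluent_frame` helper and the `FrameError` type are hypothetical names.

```rust
/// Errors for the hypothetical `split_confluent_frame` helper.
#[derive(Debug, PartialEq)]
enum FrameError {
    TooShort,
    BadMagic(u8),
}

/// Splits a Confluent-framed Kafka payload into (schema id, Avro body).
/// Assumed wire format: 1 magic byte (0), 4-byte big-endian schema id, Avro data.
fn split_confluent_frame(payload: &[u8]) -> Result<(u32, &[u8]), FrameError> {
    if payload.len() < 5 {
        return Err(FrameError::TooShort);
    }
    if payload[0] != 0 {
        return Err(FrameError::BadMagic(payload[0]));
    }
    let schema_id = u32::from_be_bytes([payload[1], payload[2], payload[3], payload[4]]);
    // No copy and no registry round-trip: the body is a borrowed slice.
    Ok((schema_id, &payload[5..]))
}
```

With the header stripped, the remaining slice could be handed to an Avro decoder together with a schema that the application already holds, so no schema registry lookup is needed on the hot path.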
Apache Avro Rust was broken!
+ The apache-avro crate (formerly avro-rs) was given to Apache Avro without an appointed committer.
+ Deserialization of complex schemas was broken...
+ I contributed fixes to Apache Avro (AVRO-3232 and AVRO-3240)
+ Now merged thanks to Martin Grigorov!
+ Make sure to use apache-avro v0.14+
+ Rust compiler optimizations give a hell of a boost!
+ Deserializing Avro is faster than JSON!
Asynchronous patterns to optimize latency
+ Tricks to make your Kafka consumer strategy more efficient.
+ Deserialize your consumer messages on the consumer loop, not on green-thread tasks
+ Spawning a task has performance costs
+ Control your green-thread parallelism
+ Defer to green-thread tasks when I/O starts to be required
[Diagram: raw data, Kafka consumer + Avro deserializer, one task per message, ScyllaDB, enriched data]
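The shape of this pattern can be sketched with the standard library alone. This swaps in a different technique from the talk: OS threads and a bounded channel stand in for green-thread tasks on an async runtime, so that the example runs without dependencies. The `decode`, `enrich` and `run_pipeline` functions are hypothetical placeholders for Avro deserialization, a database query and the consumer loop.

```rust
use std::sync::mpsc::{channel, sync_channel};
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical stand-in for deserialization: cheap CPU work kept on the consumer loop.
fn decode(raw: &[u8]) -> u64 {
    raw.iter().map(|b| u64::from(*b)).sum()
}

/// Hypothetical stand-in for an I/O-bound enrichment such as a database query.
fn enrich(value: u64) -> u64 {
    value * 2
}

/// Decodes every message inline, then hands the I/O step to at most
/// `parallelism` workers. OS threads stand in for green-thread tasks here.
fn run_pipeline(messages: &[Vec<u8>], parallelism: usize) -> Vec<u64> {
    let parallelism = parallelism.max(1);
    // The bounded queue applies backpressure: `send` blocks when it is full.
    let (work_tx, work_rx) = sync_channel::<u64>(parallelism);
    let work_rx = Arc::new(Mutex::new(work_rx));
    let (done_tx, done_rx) = channel::<u64>();

    let workers: Vec<_> = (0..parallelism)
        .map(|_| {
            let work_rx = Arc::clone(&work_rx);
            let done_tx = done_tx.clone();
            thread::spawn(move || loop {
                // Hold the lock only while receiving, not while enriching.
                let next = work_rx.lock().unwrap().recv();
                match next {
                    Ok(value) => done_tx.send(enrich(value)).unwrap(),
                    Err(_) => break, // queue closed: the consumer loop is done
                }
            })
        })
        .collect();
    drop(done_tx); // only the workers keep result senders alive

    // Consumer loop: deserialize here, defer only the I/O part.
    for raw in messages {
        work_tx.send(decode(raw)).unwrap();
    }
    drop(work_tx);

    let mut results: Vec<u64> = done_rx.iter().collect();
    for worker in workers {
        worker.join().unwrap();
    }
    results.sort_unstable(); // completion order is not deterministic
    results
}
```

The `parallelism` argument is the knob the slide asks you to control: a larger value lets more I/O calls overlap, while the bounded queue keeps the consumer loop from running arbitrarily far ahead of the workers.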
Absorbing tail latency spikes with parallelism
[Graphs: parallelism load, x16 vs x2]
Scylla Rust (shard-aware) driver
+ The scylla-rust-driver crate is production-ready.
+ Use a CachingSession to automatically cache your prepared statements
+ Beware: prepared queries are NOT paged, use paged queries with execute_iter() instead!
+ Use the latest optimized version 0.7.0!
Exporting metrics properly for Prometheus
+ Effectively measuring latencies down to microseconds.
+ Fine tune your histogram buckets to match your expected latencies!
...
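To make the bucket advice concrete, here is a dependency-free sketch; it is not the exporter code from the talk. Prometheus client defaults typically start at 5 ms, so latencies in the tens or hundreds of microseconds would all land in the first bucket. The `exponential_buckets` helper and the `Histogram` struct below are hypothetical, written only to show how cumulative buckets sized for microseconds behave.

```rust
/// Builds `count` exponentially growing bucket upper bounds, in seconds.
/// Hypothetical helper; metric libraries usually ship an equivalent.
fn exponential_buckets(start: f64, factor: f64, count: usize) -> Vec<f64> {
    assert!(start > 0.0 && factor > 1.0 && count > 0);
    let mut bounds = Vec::with_capacity(count);
    let mut bound = start;
    for _ in 0..count {
        bounds.push(bound);
        bound *= factor;
    }
    bounds
}

/// Cumulative histogram in the Prometheus style: bucket i counts every
/// observation <= bounds[i]; `total` plays the role of the +Inf bucket.
struct Histogram {
    bounds: Vec<f64>,
    counts: Vec<u64>,
    total: u64,
}

impl Histogram {
    fn new(bounds: Vec<f64>) -> Self {
        let counts = vec![0; bounds.len()];
        Histogram { bounds, counts, total: 0 }
    }

    /// Records one latency, expressed in seconds.
    fn observe(&mut self, seconds: f64) {
        self.total += 1;
        for (bound, count) in self.bounds.iter().zip(self.counts.iter_mut()) {
            if seconds <= *bound {
                *count += 1;
            }
        }
    }
}
```

For example, `exponential_buckets(0.000_05, 2.0, 5)` yields bounds at 50, 100, 200, 400 and 800 microseconds, which separates a 75µs observation from a 660µs one instead of merging them.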
Grafana dashboarding
+ Graph your precious metrics right!
+ ScyllaDB prepared statement cache size
+ Query and throughput rates
+ Kafka commits occurrence
+ Errors by type
+ Kubernetes pod memory
+ ...
+ Visualizing Prom Histograms
max by (environment)(histogram_quantile(0.50, processing_latency_seconds_bucket{...}))
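The query above estimates a median from bucket counters, and it helps to see why bucket bounds limit its precision. The sketch below is a simplified take on the linear interpolation that histogram_quantile performs, not Prometheus's actual implementation: edge cases are reduced to `None`, and the function name and signature are assumptions for illustration.

```rust
/// Estimates quantile `q` from cumulative buckets given as
/// (upper bound, cumulative count), sorted by bound, last entry = +Inf bucket.
/// Simplified sketch: linear interpolation inside the bucket that holds the rank.
fn histogram_quantile(q: f64, buckets: &[(f64, f64)]) -> Option<f64> {
    let (_, total) = *buckets.last()?;
    if buckets.len() < 2 || total <= 0.0 || !(0.0..=1.0).contains(&q) {
        return None;
    }
    let rank = q * total;
    // First bucket whose cumulative count reaches the target rank.
    let idx = buckets.iter().position(|&(_, count)| count >= rank)?;
    if idx == buckets.len() - 1 {
        // The rank falls in the +Inf bucket: report the highest finite bound.
        return Some(buckets[idx - 1].0);
    }
    let (upper, count) = buckets[idx];
    // The lowest bucket is assumed to start at 0.
    let (lower, below) = if idx == 0 { (0.0, 0.0) } else { buckets[idx - 1] };
    if count <= below {
        // Degenerate case (q = 0 with an empty first bucket): nothing to interpolate.
        return Some(upper);
    }
    Some(lower + (upper - lower) * (rank - below) / (count - below))
}
```

Because the result is interpolated between two bucket bounds, the estimate can be off by up to the width of that bucket, which is one more reason to size buckets around the latencies you expect.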
Was it worth it?
Did I really lose time because of Rust?
+ I spent more time analyzing the latency impacts of code patterns and drivers’ options than struggling with Rust syntax.
+ Key figures for this application:
+ Kafka consumer max throughput with processing: 200K msg/s on 20 partitions
+ Avro deserialization P50 latency: 75µs
+ Scylla SELECT P50 latency on tables with 1.5B+ rows: 250µs
+ Scylla INSERT P50 latency on tables with 1.5B+ rows: 660µs
It went better than expected
+ The Rust crates ecosystem is mature, similar to the Python Package Index
+ 3 Python apps totalling 54 pods replaced by 1 Rust app totalling 20 pods
+ We helped & worked on making the scylla-rust-driver even better
+ Token aware policy can fall back to non-replicas for higher availability
+ Optimized partition key calculations for prepared statements
+ Expose partition key sharding to create shard-aware applications (#ScyllaSummit2023)
+ More to come!
+ This feels like the most reliable and efficient software I have ever written!
Learning More
- Numberly’s journey to choosing ScyllaDB
- Evaluating ScyllaDB for production 1/2
- Evaluating ScyllaDB for production 2/2
- Numberly’s use case: ScyllaDB to replace MongoDB+Hive (Scylla Summit 2018)
- Numberly’s experience: MongoDB vs Scylla (Scylla Summit 2019)
- Numberly’s contributions: Faster ScyllaDB Shard-Aware drivers (Scylla Summit 2021)
- Scylla Summit 2023: Building a 100% shard aware application using Rust
And of course:
- ScyllaDB University
Numberly is hiring
Join our enthusiastic teams and help us face all our challenges with an innovative, benevolent and community driven mindset!
- Data Engineering & Science
- Software Engineering
- Infrastructure
We are remote friendly, so wherever you are, let’s have a chat!
alexys@numberly.com
Watch now on-demand at scylladb.com/summit
Questions?
Thank you
for joining us today.
@scylladb scylladb/
slack.scylladb.com
@scylladb company/scylladb/
scylladb/

More Related Content

What's hot

Magnet Shuffle Service: Push-based Shuffle at LinkedIn
Magnet Shuffle Service: Push-based Shuffle at LinkedInMagnet Shuffle Service: Push-based Shuffle at LinkedIn
Magnet Shuffle Service: Push-based Shuffle at LinkedIn
Databricks
 
Introduction to Redis
Introduction to RedisIntroduction to Redis
Introduction to Redis
Dvir Volk
 

What's hot (20)

Introduction to Kafka Cruise Control
Introduction to Kafka Cruise ControlIntroduction to Kafka Cruise Control
Introduction to Kafka Cruise Control
 
Cosco: An Efficient Facebook-Scale Shuffle Service
Cosco: An Efficient Facebook-Scale Shuffle ServiceCosco: An Efficient Facebook-Scale Shuffle Service
Cosco: An Efficient Facebook-Scale Shuffle Service
 
Running Apache Spark on a High-Performance Cluster Using RDMA and NVMe Flash ...
Running Apache Spark on a High-Performance Cluster Using RDMA and NVMe Flash ...Running Apache Spark on a High-Performance Cluster Using RDMA and NVMe Flash ...
Running Apache Spark on a High-Performance Cluster Using RDMA and NVMe Flash ...
 
Kafka streams windowing behind the curtain
Kafka streams windowing behind the curtain Kafka streams windowing behind the curtain
Kafka streams windowing behind the curtain
 
Hyperspace for Delta Lake
Hyperspace for Delta LakeHyperspace for Delta Lake
Hyperspace for Delta Lake
 
Magnet Shuffle Service: Push-based Shuffle at LinkedIn
Magnet Shuffle Service: Push-based Shuffle at LinkedInMagnet Shuffle Service: Push-based Shuffle at LinkedIn
Magnet Shuffle Service: Push-based Shuffle at LinkedIn
 
Kernel Recipes 2017: Using Linux perf at Netflix
Kernel Recipes 2017: Using Linux perf at NetflixKernel Recipes 2017: Using Linux perf at Netflix
Kernel Recipes 2017: Using Linux perf at Netflix
 
Building robust CDC pipeline with Apache Hudi and Debezium
Building robust CDC pipeline with Apache Hudi and DebeziumBuilding robust CDC pipeline with Apache Hudi and Debezium
Building robust CDC pipeline with Apache Hudi and Debezium
 
RedisConf17- Using Redis at scale @ Twitter
RedisConf17- Using Redis at scale @ TwitterRedisConf17- Using Redis at scale @ Twitter
RedisConf17- Using Redis at scale @ Twitter
 
Apache Flink internals
Apache Flink internalsApache Flink internals
Apache Flink internals
 
Monitoring Kafka without instrumentation using eBPF with Antón Rodríguez | Ka...
Monitoring Kafka without instrumentation using eBPF with Antón Rodríguez | Ka...Monitoring Kafka without instrumentation using eBPF with Antón Rodríguez | Ka...
Monitoring Kafka without instrumentation using eBPF with Antón Rodríguez | Ka...
 
0-60: Tesla's Streaming Data Platform ( Jesse Yates, Tesla) Kafka Summit SF 2019
0-60: Tesla's Streaming Data Platform ( Jesse Yates, Tesla) Kafka Summit SF 20190-60: Tesla's Streaming Data Platform ( Jesse Yates, Tesla) Kafka Summit SF 2019
0-60: Tesla's Streaming Data Platform ( Jesse Yates, Tesla) Kafka Summit SF 2019
 
Performance Profiling in Rust
Performance Profiling in RustPerformance Profiling in Rust
Performance Profiling in Rust
 
DPDK in Containers Hands-on Lab
DPDK in Containers Hands-on LabDPDK in Containers Hands-on Lab
DPDK in Containers Hands-on Lab
 
Dynamic Rule-based Real-time Market Data Alerts
Dynamic Rule-based Real-time Market Data AlertsDynamic Rule-based Real-time Market Data Alerts
Dynamic Rule-based Real-time Market Data Alerts
 
ClickHouse Deep Dive, by Aleksei Milovidov
ClickHouse Deep Dive, by Aleksei MilovidovClickHouse Deep Dive, by Aleksei Milovidov
ClickHouse Deep Dive, by Aleksei Milovidov
 
Performance Troubleshooting Using Apache Spark Metrics
Performance Troubleshooting Using Apache Spark MetricsPerformance Troubleshooting Using Apache Spark Metrics
Performance Troubleshooting Using Apache Spark Metrics
 
Introduction to Redis
Introduction to RedisIntroduction to Redis
Introduction to Redis
 
Intel dpdk Tutorial
Intel dpdk TutorialIntel dpdk Tutorial
Intel dpdk Tutorial
 
Batch Processing at Scale with Flink & Iceberg
Batch Processing at Scale with Flink & IcebergBatch Processing at Scale with Flink & Iceberg
Batch Processing at Scale with Flink & Iceberg
 

Similar to Learning Rust the Hard Way for a Production Kafka + ScyllaDB Pipeline

Driving Business Transformation with Real-Time Analytics Using Apache Kafka a...
Driving Business Transformation with Real-Time Analytics Using Apache Kafka a...Driving Business Transformation with Real-Time Analytics Using Apache Kafka a...
Driving Business Transformation with Real-Time Analytics Using Apache Kafka a...
confluent
 
IT Six Business Models
IT Six Business Models  IT Six Business Models
IT Six Business Models
dtusaliu
 

Similar to Learning Rust the Hard Way for a Production Kafka + ScyllaDB Pipeline (20)

Approaching risk management with your head in the cloud
Approaching risk management with your head in the cloudApproaching risk management with your head in the cloud
Approaching risk management with your head in the cloud
 
Big Data LDN 2017: The New Dominant Companies Are Running on Data
Big Data LDN 2017: The New Dominant Companies Are Running on DataBig Data LDN 2017: The New Dominant Companies Are Running on Data
Big Data LDN 2017: The New Dominant Companies Are Running on Data
 
Big Data LDN 2017: The New Dominant Companies Are Running on Data
Big Data LDN 2017: The New Dominant Companies Are Running on DataBig Data LDN 2017: The New Dominant Companies Are Running on Data
Big Data LDN 2017: The New Dominant Companies Are Running on Data
 
The new dominant companies are running on data
The new dominant companies are running on data The new dominant companies are running on data
The new dominant companies are running on data
 
Your AI Transformation
Your AI Transformation Your AI Transformation
Your AI Transformation
 
The Value Plus Magazine - October GITEX Issue
The Value Plus Magazine - October GITEX IssueThe Value Plus Magazine - October GITEX Issue
The Value Plus Magazine - October GITEX Issue
 
Patternbuilders Founder Showcase Deck
Patternbuilders Founder Showcase DeckPatternbuilders Founder Showcase Deck
Patternbuilders Founder Showcase Deck
 
Greetings david cutler inform and connect
Greetings   david cutler inform and connectGreetings   david cutler inform and connect
Greetings david cutler inform and connect
 
Before vs After: Redesigning a Website to be Useful and Informative for Devel...
Before vs After: Redesigning a Website to be Useful and Informative for Devel...Before vs After: Redesigning a Website to be Useful and Informative for Devel...
Before vs After: Redesigning a Website to be Useful and Informative for Devel...
 
Greetings david cutler inform and connect
Greetings   david cutler inform and connectGreetings   david cutler inform and connect
Greetings david cutler inform and connect
 
Acctiva: expertise in Business Intelligence, Data Warehousing, Data Governance
Acctiva: expertise in Business Intelligence, Data Warehousing, Data GovernanceAcctiva: expertise in Business Intelligence, Data Warehousing, Data Governance
Acctiva: expertise in Business Intelligence, Data Warehousing, Data Governance
 
Driving Business Transformation with Real-Time Analytics Using Apache Kafka a...
Driving Business Transformation with Real-Time Analytics Using Apache Kafka a...Driving Business Transformation with Real-Time Analytics Using Apache Kafka a...
Driving Business Transformation with Real-Time Analytics Using Apache Kafka a...
 
IT Six Business Models
IT Six Business Models  IT Six Business Models
IT Six Business Models
 
iXora Solution Ltd. Presentation
iXora Solution Ltd. PresentationiXora Solution Ltd. Presentation
iXora Solution Ltd. Presentation
 
About : Radius knowledge Labs
About : Radius knowledge LabsAbout : Radius knowledge Labs
About : Radius knowledge Labs
 
Making Hadoop Ready for the Enterprise
Making Hadoop Ready for the Enterprise Making Hadoop Ready for the Enterprise
Making Hadoop Ready for the Enterprise
 
Gen AI Cognizant & AWS event presentation_12 Oct.pdf
Gen AI Cognizant & AWS event presentation_12 Oct.pdfGen AI Cognizant & AWS event presentation_12 Oct.pdf
Gen AI Cognizant & AWS event presentation_12 Oct.pdf
 
Hashroot Technologies | Server Management | Cloud Management | Security Servi...
Hashroot Technologies | Server Management | Cloud Management | Security Servi...Hashroot Technologies | Server Management | Cloud Management | Security Servi...
Hashroot Technologies | Server Management | Cloud Management | Security Servi...
 
CIO priorities and Data Virtualization: Balancing the Yin and Yang of the IT
CIO priorities and Data Virtualization: Balancing the Yin and Yang of the ITCIO priorities and Data Virtualization: Balancing the Yin and Yang of the IT
CIO priorities and Data Virtualization: Balancing the Yin and Yang of the IT
 
Future of Enterprise PaaS (Cloud Foundry Summit 2014)
 Future of Enterprise PaaS (Cloud Foundry Summit 2014) Future of Enterprise PaaS (Cloud Foundry Summit 2014)
Future of Enterprise PaaS (Cloud Foundry Summit 2014)
 

More from ScyllaDB

More from ScyllaDB (20)

Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
What Developers Need to Unlearn for High Performance NoSQL
What Developers Need to Unlearn for High Performance NoSQLWhat Developers Need to Unlearn for High Performance NoSQL
What Developers Need to Unlearn for High Performance NoSQL
 
Low Latency at Extreme Scale: Proven Practices & Pitfalls
Low Latency at Extreme Scale: Proven Practices & PitfallsLow Latency at Extreme Scale: Proven Practices & Pitfalls
Low Latency at Extreme Scale: Proven Practices & Pitfalls
 
Dissecting Real-World Database Performance Dilemmas
Dissecting Real-World Database Performance DilemmasDissecting Real-World Database Performance Dilemmas
Dissecting Real-World Database Performance Dilemmas
 
Beyond Linear Scaling: A New Path for Performance with ScyllaDB
Beyond Linear Scaling: A New Path for Performance with ScyllaDBBeyond Linear Scaling: A New Path for Performance with ScyllaDB
Beyond Linear Scaling: A New Path for Performance with ScyllaDB
 
Dissecting Real-World Database Performance Dilemmas
Dissecting Real-World Database Performance DilemmasDissecting Real-World Database Performance Dilemmas
Dissecting Real-World Database Performance Dilemmas
 
Database Performance at Scale Masterclass: Workload Characteristics by Felipe...
Database Performance at Scale Masterclass: Workload Characteristics by Felipe...Database Performance at Scale Masterclass: Workload Characteristics by Felipe...
Database Performance at Scale Masterclass: Workload Characteristics by Felipe...
 
Database Performance at Scale Masterclass: Database Internals by Pavel Emelya...
Database Performance at Scale Masterclass: Database Internals by Pavel Emelya...Database Performance at Scale Masterclass: Database Internals by Pavel Emelya...
Database Performance at Scale Masterclass: Database Internals by Pavel Emelya...
 
Database Performance at Scale Masterclass: Driver Strategies by Piotr Sarna
Database Performance at Scale Masterclass: Driver Strategies by Piotr SarnaDatabase Performance at Scale Masterclass: Driver Strategies by Piotr Sarna
Database Performance at Scale Masterclass: Driver Strategies by Piotr Sarna
 
Replacing Your Cache with ScyllaDB
Replacing Your Cache with ScyllaDBReplacing Your Cache with ScyllaDB
Replacing Your Cache with ScyllaDB
 
Powering Real-Time Apps with ScyllaDB_ Low Latency & Linear Scalability
Powering Real-Time Apps with ScyllaDB_ Low Latency & Linear ScalabilityPowering Real-Time Apps with ScyllaDB_ Low Latency & Linear Scalability
Powering Real-Time Apps with ScyllaDB_ Low Latency & Linear Scalability
 
7 Reasons Not to Put an External Cache in Front of Your Database.pptx
7 Reasons Not to Put an External Cache in Front of Your Database.pptx7 Reasons Not to Put an External Cache in Front of Your Database.pptx
7 Reasons Not to Put an External Cache in Front of Your Database.pptx
 
Getting the most out of ScyllaDB
Getting the most out of ScyllaDBGetting the most out of ScyllaDB
Getting the most out of ScyllaDB
 
NoSQL Database Migration Masterclass - Session 2: The Anatomy of a Migration
NoSQL Database Migration Masterclass - Session 2: The Anatomy of a MigrationNoSQL Database Migration Masterclass - Session 2: The Anatomy of a Migration
NoSQL Database Migration Masterclass - Session 2: The Anatomy of a Migration
 
NoSQL Database Migration Masterclass - Session 3: Migration Logistics
NoSQL Database Migration Masterclass - Session 3: Migration LogisticsNoSQL Database Migration Masterclass - Session 3: Migration Logistics
NoSQL Database Migration Masterclass - Session 3: Migration Logistics
 
NoSQL Data Migration Masterclass - Session 1 Migration Strategies and Challenges
NoSQL Data Migration Masterclass - Session 1 Migration Strategies and ChallengesNoSQL Data Migration Masterclass - Session 1 Migration Strategies and Challenges
NoSQL Data Migration Masterclass - Session 1 Migration Strategies and Challenges
 
ScyllaDB Virtual Workshop
ScyllaDB Virtual WorkshopScyllaDB Virtual Workshop
ScyllaDB Virtual Workshop
 
DBaaS in the Real World: Risks, Rewards & Tradeoffs
DBaaS in the Real World: Risks, Rewards & TradeoffsDBaaS in the Real World: Risks, Rewards & Tradeoffs
DBaaS in the Real World: Risks, Rewards & Tradeoffs
 
NoSQL Data Modeling 101
NoSQL Data Modeling 101NoSQL Data Modeling 101
NoSQL Data Modeling 101
 
Top NoSQL Data Modeling Mistakes
Top NoSQL Data Modeling MistakesTop NoSQL Data Modeling Mistakes
Top NoSQL Data Modeling Mistakes
 

Recently uploaded

Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
panagenda
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
Joaquim Jorge
 

Recently uploaded (20)

Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Boost PC performance: How more available memory can improve productivity

Learning Rust the Hard Way for a Production Kafka + ScyllaDB Pipeline

  • 1. Learning Rust the Hard Way for a Production Kafka + ScyllaDB Pipeline Presented by: Peter Corless, Director of Technical Advocacy, ScyllaDB & Alexys Jacob, CTO, Numberly Moderated by: Jared Ruckle, InfoQ Editor
  • 2. Poll Where are you in your NoSQL adoption?
  • 3. Poll How much data do you have under management in your transactional database?
  • 4. Peter Corless 4 Director of Technical Advocacy @ ScyllaDB + Listen to & share user stories + Write blogs & case studies + Play (and design) strategy & roleplaying games + @PeterCorless on Twitter
  • 5. + Infoworld 2020 Technology of the Year! + Founded by designers of KVM Hypervisor The Database Built for Gamechangers 5 “ScyllaDB stands apart...It’s the rare product that exceeds my expectations.” – Martin Heller, InfoWorld contributing editor and reviewer “For 99.9% of applications, ScyllaDB delivers all the power a customer will ever need, on workloads that other databases can’t touch – and at a fraction of the cost of an in-memory solution.” – Adrian Bridgewater, Forbes senior contributor + Resolves challenges of legacy NoSQL databases + >5x higher throughput + >20x lower latency + >75% TCO savings + DBaaS/Cloud, Enterprise and Open Source solutions + Proven globally at scale
  • 6. 6 +400 Gamechangers Leverage ScyllaDB Seamless experiences across content + devices Fast computation of flight pricing Corporate fleet management Real-time analytics 2,000,000 SKU -commerce management Video recommendation management Threat intelligence service using JanusGraph Real time fraud detection across 6M transactions/day Uber scale, mission critical chat & messaging app Network security threat detection Power ~50M X1 DVRs with billions of reqs/day Precision healthcare via Edison AI Inventory hub for retail operations Property listings and updates Unified ML feature store across the business Cryptocurrency exchange app Geography-based recommendations Global operations- Avon, Body Shop + more Predictable performance for on sale surges GPS-based exercise tracking Serving dynamic live streams at scale Powering India's top social media platform Personalized advertising to players Distribution of game assets in Unreal Engine Make marketing more relevant, effective and measurable
  • 7. Alexys Jacob 7 @ultrabug + CTO, Numberly + ScyllaDB awarded Open Source & University contributor + Open Source author & contributor + Apache Avro, Apache Airflow, MongoDB, MkDocs… + Tech speaker & writer + Gentoo Linux developer + Python Software Foundation contributing member Speaker Photo
  • 8. Numberly, Marketing Technologist 8 Digital native, Media, CRM and data has been at the heart of our business for the past 20 years. A data-driven approach to help impact engagement and sales and turn your marketing spend into a profitable investment. Optimizing your ROI is our priority, both strategic and operational. Numberly is a group with a solid financial strength. The company is listed on the stock exchange, and operates globally in 53 countries with a team of 33 nationalities. The consistency of our CSR commitments, for over 20 years, means that our group is CSR & gender equitable by design. We are convinced that parity is a key factor of strong performance and success. The recognition and loyalty of our customers is proof of this. Internationally recognized technological expertise and tool agnostic: activation on our tools (CRM, Numberly trading desk, CDP), expertise on third-party tools. R&D investments of up to 10% of our turnover to maximize your performance. The performance quality that we deliver to our customers is manifested by the range of awards received for the projects we have helped put in place. More than 500 employees brought together by a “marketing & tech mindset”. A focus on data and the quality of execution and a flexible and pragmatic approach. We pass on our passion and our know-how to our clients' teams but also our commitment in the ecosystem to defend Open Internet and the European digital sovereignty. Digital native & Data driven Robust & International Committed & Responsible Passionate & Collaborative Tech Experts & Agnostic Innovative & Awarded Paris Amsterdam New-York Dubai Montréal Londres Bruxelles Tel Aviv Lyon
  • 9. Agenda + The thought process to move from Python to Rust + Context, promises, arguments and decision + Learning Rust the hard way + All the stack components I had to work with in Rust + Tips, Open Source contributions and code samples + What is worth it? + Graphs, production numbers + Personal notes 9
  • 11. At Numberly, we move and process (a lot of) data using Kafka streams and pipelines that are enriched using ScyllaDB. processor app processor app Project context at Numberly ScyllaDB processor app raw data enriched data enriched data enriched data client app partner API business app 11
  • 12. processor app processor app Pipeline reliability = latency + resilience Scylla processor app raw data enriched data enriched data enriched data client app partner API business app If a processor or ScyllaDB is slow or fails, our business, partners & clients are at risk. 12
  • 13. A major change in our pipeline processors had to be undertaken, giving us the opportunity to redesign them entirely. The (rusted) opportunity ScyllaDB processor app raw data enriched data enriched data enriched data client app partner API business app 13
  • 14. “Hey, why not rewrite those 3 Python processor apps into 1 Rust app?” 14
  • 15. The (never tried before) Rust promises 15 A language empowering everyone to build reliable and efficient software. + Secure + Memory and thread safety as first class citizens + No runtime or garbage collector + Easy to deploy + Compiled binaries are self-sufficient + No compromises + Strongly and statically typed + Exhaustivity is mandatory + Built-in error management syntax and primitives + Plays well with Python + PyO3 can be used to run Rust from Python (or the contrary)
  • 16. Efficient software != Faster software + “Fast” meanings vary depending on your objectives. + Fast to develop? + Fast to maintain? + Fast to prototype? + Fast to process data? + Fast to cover all failure cases? “Selecting a programming language can be a form of premature optimization” 16
  • 17. Efficient software != Faster software + “Fast” meanings vary depending on your objectives. + Fast to develop? Python is way faster + did that for 15 years + Fast to maintain? Very few people at Numberly do know Rust + Fast to prototype? No, code must be complete to compile and run + Fast to process data? Sure: to prove it, measure it + Fast to cover all failure cases? Definitely: mandatory exhaustivity + error handling primitives “I did not choose Rust to be “faster”. Our Python code was fast enough to deliver their pipeline processing.” 17
  • 18. Innovation cannot exist if you don’t accept to lose time. The question is to know when and on what project. 18
  • 19. The Reliable software paradigms + What makes me slow will make me stronger. + Low level paradigms (ownership, borrowing, lifetimes). + Strong type safety. + Compilation (debug, release). + Dependency management. + Exhaustive pattern matching. + Error management primitives (Result). + Explicit return values (Option). 19
  • 20. The Reliable software paradigms + What makes me slow will make me stronger. + Low level paradigms (ownership, borrowing, lifetimes). If it compiles, it’s safe + Strong type safety. Predictable, readable, maintainable + Compilation (debug, release). Compiler is very helpful compared to random Python exceptions + Dependency management. Finally something looking sane vs Python mess + Exhaustive pattern matching. Confidence that you’re not forgetting something + Error management primitives (Result). Handle failure right from the language syntax + Explicit return values (Option). Clear separation between Some(value) and None “I chose Rust because it provided me with the programming paradigms at the right abstraction level that I needed to finally understand and better explain the reliability and performance of my application.” 20
  • 21. Learning Rust the hard way 21
  • 22. Production is not a Hello World + Learning the syntax and handling errors everywhere + Confluent Kafka + Schema Registry + Avro + Asynchronous latency-optimized design + ScyllaDB multi-datacenter + MongoDB + Kubernetes deployment + Prometheus exporter + Grafana dashboarding + Sentry Scylla processor app Confluent Kafka 22
  • 23. Confluent Kafka Schema Registry + Confluent Schema Registry breaks vanilla Apache Avro deserialization. + Consider using Gerard Klijs’ schema_registry_converter crate (v3+) + I discovered performance problems which we worked on together and which have been addressed! + Latency-overhead-free manual approach: 23
  • 24. Apache Avro Rust was broken! + Crate apache-avro (former avro-rs) given to Apache Avro without an appointed committer. + Deserialization of complex schemas was broken... + I contributed fixes to Apache Avro (AVRO-3232+3240) + Now merged thanks to Martin Grigorov! + Make sure to use apache-avro v0.14+ + Rust compiler optimizations give a hell of a boost! + Deserializing Avro is faster than JSON! 24
  • 25. + Tricks to make your Kafka consumer strategy more efficient. + Deserialize your consumer messages on the consumer loop, not on green-thread tasks + Spawning a task has performance costs + Control your green-thread parallelism + Defer to green-thread tasks when I/O starts to be required task / msg Asynchronous patterns to optimize latency Kafka consumer + avro deserializer raw data task / msg task / msg task / msg task / msg Scylla enriched data 25
  • 26. Absorbing tail latency spikes with parallelism x16 x2 parallelism load 26
  • 27. Scylla Rust (shard-aware) driver + The scylla-rust-driver crate is production-ready. + Use a CachingSession to automatically cache your prepared statements + Beware: prepared queries are NOT paged, use paged queries with execute_iter() instead! + Use the latest optimized version 0.7.0! 27
  • 28. Exporting metrics properly for Prometheus + Effectively measuring latencies down to microseconds. + Fine tune your histogram buckets to match your expected latencies! ... 28
  • 29. Grafana dashboarding + Graph your precious metrics right! + ScyllaDB prepared statement cache size + Query and throughput rates + Kafka commits occurrence + Errors by type + Kubernetes pod memory + ... + Visualizing Prom Histograms max by (environment)(histogram_quantile(0.50, processing_latency_seconds_bucket{...})) 29
  • 30. Was it worth it? 30
  • 31. Did I really lose time because of Rust? + I spent more time analyzing the latency impacts of code patterns and drivers’ options than struggling with Rust syntax. + Key figures for this application: + Kafka consumer max throughput with processing? 200K msg/s on 20 partitions + Avro deserialization P50 latency? 75µs + Scylla SELECT P50 latency on 1.5B+ rows tables? 250µs + Scylla INSERT P50 latency on 1.5B+ rows tables? 660µs 31
  • 32. It went better than expected + Rust crates ecosystem is mature, similar to Python Package Index + 3 Python apps totalling 54 pods replaced by 1 Rust app totalling 20 pods + We helped & worked on making the scylla-rust-driver even better + Token aware policy can fallback to non-replicas for higher availability + Optimized partition key calculations for prepared statements + Expose partition key sharding to create shard-aware applications (#ScyllaSummit2023) + More to come! + This feels like the most reliable and efficient software I ever wrote! 32
  • 33. - Numberly’s journey to choosing ScyllaDB - Evaluating ScyllaDB for production 1/2 - Evaluating ScyllaDB for production 2/2 - Numberly’s use case: ScyllaDB to replace MongoDB+Hive (Scylla Summit 2018) - Numberly’s experience: MongoDB vs Scylla (Scylla Summit 2019) - Numberly’s contributions: Faster ScyllaDB Shard-Aware drivers (Scylla Summit 2021) - Scylla Summit 2023: Building a 100% shard aware application using Rust And of course: - ScyllaDB University Learning More 33
  • 34. Join our enthusiastic teams and help us face all our challenges with an innovative, benevolent and community driven mindset! - Data Engineering & Science - Software Engineering - Infrastructure We are remote friendly, so wherever you are, let’s have a chat! alexys@numberly.com Numberly is hiring 34
  • 35. Watch now on-demand at scylladb.com/summit 35 Questions?
  • 36. Thank you for joining us today. @scylladb scylladb/ slack.scylladb.com @scylladb company/scylladb/ scylladb/
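
The "Reliable software paradigms" slide lists `Result`, `Option` and exhaustive pattern matching without showing them in code. A minimal sketch of how the three fit together follows. The `EnrichError` type and the `lookup_segment` and `enrich` functions are hypothetical names invented for illustration, not taken from the Numberly code base.

```rust
use std::collections::HashMap;

/// Hypothetical failure modes of an enrichment step (illustration only).
#[derive(Debug, PartialEq)]
enum EnrichError {
    MissingUserId,
    StoreUnavailable,
}

/// Option: a lookup yields either Some(segment) or None, never a hidden null.
fn lookup_segment<'a>(store: &'a HashMap<u64, String>, user_id: u64) -> Option<&'a str> {
    store.get(&user_id).map(String::as_str)
}

/// Result: failures are part of the signature, and `?` propagates them.
fn enrich(
    store: Option<&HashMap<u64, String>>,
    user_id: Option<u64>,
) -> Result<String, EnrichError> {
    let store = store.ok_or(EnrichError::StoreUnavailable)?;
    let id = user_id.ok_or(EnrichError::MissingUserId)?;
    // Exhaustive match: leaving out the None arm is a compile error.
    let segment = match lookup_segment(store, id) {
        Some(s) => s,
        None => "unknown",
    };
    Ok(format!("{id}:{segment}"))
}

fn main() {
    let mut store = HashMap::new();
    store.insert(7, "vip".to_string());
    // Adding a new EnrichError variant breaks this match until it is handled.
    match enrich(Some(&store), Some(7)) {
        Ok(line) => println!("enriched: {line}"),
        Err(EnrichError::MissingUserId) => eprintln!("skipping message without user id"),
        Err(EnrichError::StoreUnavailable) => eprintln!("store down, retry later"),
    }
}
```

The caller cannot ignore a failure path by accident: every variant has to be named or explicitly wildcarded, which is the "confidence that you’re not forgetting something" the slide refers to.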

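The Schema Registry slide ends with "Latency-overhead-free manual approach:", but the code it showed is not part of this transcript. As a rough sketch of what such an approach involves: the Confluent wire format prefixes each message with a zero magic byte and a 4-byte big-endian schema ID, followed by the Avro binary body. The helper below only splits that frame. The names `FrameError` and `split_confluent_frame` are hypothetical, and decoding the Avro body against the schema is left to an Avro library.

```rust
/// Errors that can occur while splitting a Confluent-framed Kafka payload.
#[derive(Debug, PartialEq)]
enum FrameError {
    /// Payload is shorter than the 5-byte header.
    TooShort(usize),
    /// First byte is not the expected magic byte 0x00.
    BadMagic(u8),
}

/// Hypothetical helper: splits a payload into (schema_id, avro_body).
/// The body is borrowed from the input, so no bytes are copied.
fn split_confluent_frame(payload: &[u8]) -> Result<(u32, &[u8]), FrameError> {
    const HEADER_LEN: usize = 5; // 1 magic byte + 4-byte schema ID
    if payload.len() < HEADER_LEN {
        return Err(FrameError::TooShort(payload.len()));
    }
    if payload[0] != 0 {
        return Err(FrameError::BadMagic(payload[0]));
    }
    let schema_id = u32::from_be_bytes([payload[1], payload[2], payload[3], payload[4]]);
    Ok((schema_id, &payload[HEADER_LEN..]))
}

fn main() {
    // Magic byte, schema ID 42, then two bytes standing in for an Avro body.
    let payload = [0u8, 0, 0, 0, 42, 0x02, 0x06];
    match split_confluent_frame(&payload) {
        Ok((id, body)) => println!("schema id {id}, {} Avro bytes", body.len()),
        Err(e) => eprintln!("not a Confluent frame: {e:?}"),
    }
}
```

With the schema ID in hand, a consumer can resolve the schema once, keep it in memory, and decode every later message with the same ID without another registry round trip.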
Editor's Notes

  1. Welcome, everyone! My name is Peter Corless, Director of Technical Advocacy at ScyllaDB. I’ll be your host for today’s webinar -- “Learning Rust the Hard Way for a Production Kafka + ScyllaDB Pipeline.” Today you will learn about how Numberly moved key parts of their application code to Rust to optimize their real-time operational performance.
  2. Before we begin we are pushing a quick poll question. Where are you in your NoSQL adoption? I currently use ScyllaDB I currently use another NoSQL database I am currently evaluating NoSQL I am interested in learning more about ScyllaDB None of the above Ok, thanks for those responses. Let’s get started.
  3. Hello again everyone! I just want to take a moment for a quick audience poll. For a sense of scale, we’d like to understand: How much data do you have under management in your own transactional database systems? Less than 1 terabyte 1 to 10 terabytes 10-100 terabytes >100 terabytes Pick the answer that best matches your current data set. We’ll leave the poll up for a bit for you to answer.
  4. My name is Peter Corless, Director of Technical Advocacy at ScyllaDB. I listen to and help share user success stories.
  5. For those of you who are not familiar with ScyllaDB yet, it is the monstrously fast and scalable NoSQL database built for gamechangers. Created by the founders of the KVM hypervisor, ScyllaDB was conceived with key design characteristics to power this next tech cycle and resolve many of the challenges posed when operating distributed systems at scale. In particular, ScyllaDB is a high throughput and low latency distributed NoSQL database. Increasing database throughput (operations/second), improving P99 latency, and reducing total cost are principal drivers behind teams like yours selecting ScyllaDB. In 2020, ScyllaDB received Infoworld’s prestigious Technology of the Year award, and it was truly an honor to be among fellow recipients like Tableau, Databricks, and Snowflake. Recently we launched ScyllaDB 5 with several new innovative features, and we have an on-demand webinar covering what’s new which I highly encourage you to watch… With such consistent innovation the adoption of our database technology has grown to over 400 key players worldwide…
  6. “Many of you will recognize some of the companies among the selection pictured here, such as Starbucks who leverage ScyllaDB for inventory management, Zillow for real-time property listing and updates, and Comcast Xfinity who power all DVR scheduling with ScyllaDB.” As you can see, ScyllaDB is used across many different industries and for entirely different types of use cases. Chat applications, IoT, social networking, e-commerce, fraud detection and security are some of the examples pictured in this slide. More often than not, your company probably has a use case that is a perfect fit for ScyllaDB, and it may be that you don’t know it yet! If you are interested in knowing more about how we can help you, feel free to engage with us! To summarize, if you care about having low latencies while having high throughput for your application, we are certain that ScyllaDB is a good fit for you.
  7. Without any further ado, I have the pleasure of introducing to you our speaker today: Alexys Jacob, CTO at Numberly, who is known to the open source community as “Ultrabug.” A frequent ScyllaDB open source & ScyllaDB University contributor.
  8. Numberly is a digital data marketing technologist and expert helping brands connect and engage with their customers using all available digital channels. We are proud to be an independent company with solid, internationally recognized expertise in both marketing and technology. We just celebrated our 23rd anniversary.
  9. I’ve been doing Python in production for more than 15 years now. That’s as much a marker of my advancing age as a shock to my colleagues that I could even consider coding in a language other than Python. So what could trigger such a radical change in me?
  10. As a data company, we operate on a lot of fast-moving data using an event-driven approach, which drives our technological choices towards platforms that allow us to process and react to stimuli as close to real time as possible. We combine Kafka and Scylla extensively on streams and specialized pipeline applications that we’ll call data processors here. Each of those pipeline data processor applications prepares and enriches the incoming data so that it is useful to the downstream business, partner or client applications.
  11. The relevance of a data-driven decision is at its best when it’s close to the event’s time of occurrence, which means that availability and latency are business critical to us. Those data processor apps, Kafka, and of course Scylla can’t fail; if they do, we get angry partners and clients. (click) Latency and resilience are thus the pillars upon which we build our reliable business platforms.
  12. The data industry and ecosystems are always changing. Last fall, we had to adapt three of the most demanding data processors written in Python. Those processor applications had been doing the job for more than 5 years; they were battle tested and trustworthy. As you know, I’m not a low-level programmer, as I always found C or C++ cumbersome and useless for my needs. But I had been following Rust’s maturation for a while: I was curious and had the feeling that it could find its place in between Python and C++. So when this opportunity came, I went to my colleagues and told them:
  13. Hey, why not rewrite those 3 Python applications that we know work very well into one Rust application, in a language we don’t even know? (click) After the shock, they asked for a rationale rather than just a crazy idea.
  14. Rust makes promises that more and more people seem to agree with. It is supposed to be… (read bullets) But furthermore, their marketing motto speaks to the marketer inside me (read). (click) That’s me! (click) That’s what my new processor app needs! Careful attendees would ask me: hey Alexys, you did not mention speed in that list. Isn’t Rust supposed to be super fast? Well, their motto mentions efficiency, and that’s not the same as speed.
  15. Efficient software does not always mean faster software. Brett Cannon, a Python core developer, argues that selecting a programming language for being faster on paper is a form of premature optimization. (click) I agree with him in the sense that the word “fast” conveys different meanings depending on your objectives. In my opinion, Rust can be said to be faster as a consequence of being efficient, which does not cover all the items on the list here. Let’s demonstrate that in my context.
  16. (read)… As we can see in my case, choosing Rust over Python means that I will definitely lose time. (click + read) So why would I want to lose time? The short answer is “innovation”.
  17. (read) So the gist of my decision was that I was sure this project was the right one at the right time to foster innovation at Numberly.
  18. Now what will I gain from losing time, other than the pain of using semicolons and brackets everywhere? Supposedly, more reliable software thanks to Rust’s unique design and paradigms. This is to say that what makes me slow is also an opportunity to make my software stronger.
  19. (read)... (click) (read)
  20. Here is an overview of all the aspects and all the technological stacks that I had to deal with (read)... Since our time is limited, I will skip through this list to highlight the most insightful parts. Let’s start with the first wall I hit right from the start: consuming messages from Kafka.
  21. We use Confluent Kafka Community edition with its Schema Registry to structure our Avro-encoded messages in our Kafka topics. The bad news is that Confluent Schema Registry adds a magic byte to Kafka message payloads, which breaks vanilla Apache Avro schema deserialization. Luckily for me, Gerard Klijs has worked on a crate to address this problem. We worked together on improving its performance so that it’s production ready. Before we fixed this, I used the manual approach shown here to decode Avro messages myself according to their schema.
  22. Then I hit the second wall: even though I could now read the Avro payloads, I still could not deserialize them… As a total Rust newbie, I blamed myself for days before even daring to suspect Apache Avro of being the culprit. I eventually read the Apache Avro source code and discovered that it was broken for complex schemas like ours. Is anyone in the world using Rust Apache Avro in production yet? So here I am contributing fixes to the Apache Avro Rust implementation, which eventually got merged three months later, in January, thanks to its newly appointed committer Martin. Anyway, another unexpected fact that Rust allowed me to prove is that deserializing Avro is faster than deserializing JSON in our case of rich and complex data structures. I say unexpected, but to be fair my colleague Othmane was expecting it, and I was happy to finally prove him right!
  23. Once I was finally able to consume messages from Kafka, I started looking for the best pattern to process them. I turned to the tokio asynchronous runtime, which was very intuitive coming from Python asyncio. I played a lot with various code patterns to make consuming messages from Kafka latency-stable and reliable. One of the interesting findings was to not defer the decoding of Avro messages to a green thread but to do it right in the consumer loop. Indeed, since deserialization is a CPU-bound operation, it benefits from not being cooperative with other green-thread tasks. Similarly, allowing and controlling your parallelism will help stabilize your I/O-bound operations. Let’s see a real example of that.
  24. Once deserialization is done, deferring the rest of my processing, which is I/O bound, to green threads helped absorb tail latencies without affecting my Kafka consuming speed. The Grafana dashboard you see here shows that around 9:00 something made Scylla slower than usual: Scylla SELECT and INSERT P95 latencies went up by a factor of 16. At the same time, you can see a bump in my parallelism load as I started having more concurrent green threads processing messages. But it only hit my Kafka consuming latency by a factor of 2 at P95, effectively absorbing tail latencies due to this ephemeral overload in Scylla. This is a typical example of something that was harder to pinpoint and demonstrate in Python but became clear with Rust.
  25. Now to our dear Scylla. I found the Scylla Rust driver to be intuitive and well featured. Congratulations to the team, which is also very helpful on its dedicated channel on the Scylla Slack service; join us there! The new CachingSession is very handy to cache your prepared statements so you don’t have to do it yourself like I did at first. (read the “Beware” bullet) I’m showcasing a code example of a production connection function to Scylla, using SSL, multi-datacenter awareness and a caching session. Speaking of multi-datacenter awareness, I hit a bug in the token-aware load balancing that promptly got fixed by the team and released in 0.4.2.
  26. Even if it’s described late in the presentation, Prometheus is actually the first thing I set up in my application so that I could measure the latency and throughput impacts of all the experiments I did. For a test to be meaningful, those measurements must be made right and then graphed right. So here is an example of how I measure Scylla insert query latency. The first and important gotcha is to set up your histogram buckets correctly for the graphing precision you expect. Here I expect Scylla latency to vary between 50µs and 15s, which is the maximum server timeout I’m allowing for writes. Then I use it like this: I start a timer on the histogram, record its duration on success, and drop it on failure so that my metrics are not polluted by possible errors.
  27. Once you measure right, you need to visualize and graph right. I created a detailed and meaningful Grafana dashboard so I could see and compare the results of my Rust application experiments: best time investment ever. Make sure you graph as many things as possible (read). There are gotchas in graphing Prometheus histograms, so I’m linking a great article that the folks at Grafana wrote on how to visualize them right in Grafana.
  28. The syntax was surprisingly simple and intuitive to adopt, even coming from Python. I absolutely failed to resist the temptation of testing and analyzing everything at a lower level; it was an unexpected new joy for me. So in the end most of my time was spent on testing, graphing, analyzing and trying to come up with a decent and insightful explanation. This surely does not look like wasted time to me! For the number-hungry among you in the audience, here are some numbers taken from the application: Kafka consumer max throughput with processing? 200K msg/s on 20 partitions. Avro deserialization P50 latency? 75µs. Scylla SELECT P50 latency on 1.5B+ rows tables? 250µs. Scylla INSERT P50 latency on 1.5B+ rows tables? 2ms.
  29. It went way better than expected (read)... Even if it was my first Rust application, I felt confident during the development process which transformed into confidence in a predictable and resilient software. After months of production, the new Rust pipeline processor proves to be very stable and resilient. Sentry is bored. Rust promises are living up to expectations!
  30. I selected a few articles and videos of Numberly’s experience if you want more material to deepen your understanding or widen your scope of knowledge around ScyllaDB database and ecosystem And of course, make sure to check out the excellent content from ScyllaDB University
  31. COME UP 2-3 SEED QUESTIONS FOR Q/A ORGANIC QUESTIONS FROM FIRST PRESENTATION Which resources did you use to learn rust? Great presentation, thanks! What are some resources you recommend to get started on Rust? How did you find the ramp up time for others on your team in their path to becoming proficient in rust? Has it been difficult not having a Rust expert on your team? What challenges did you encounter and any learnings? POSSIBLE SEED QUESTIONS – PROVIDED TO US FROM INFOQ TEAM You mention a few points about introducing new technologies thoughtfully. Clearly, you were convinced that this scenario was the right project and the timing was right. Did you have to go and prove that to skeptical leaders and developers? If so, how did you convince them? Are there any examples that you think would help folks in the audience make a similar case? Similarly, under "was it worth it?" you share some great personal feedback. Is it fair to say that the application telemetry indicated better performance, reliability, and stability as a result? Was the end user experience remarkably better? Is the team able to ship or learn faster as a result? What would the call to action be for folks, as a result of these learnings? To re-examine technology choices for certain scenarios? If so, what are those scenarios? As a side note, MongoDB is indeed still used in this pipeline as write-only output database which another platform still depends on, we have plans to replace it with Scylla in the future
  32. Thank you all very much for attending today. In due time, you will find this presentation available on the InfoQ and ScyllaDB website for on-demand viewing. If you would like to weigh in on what we present in the future, please Contact Us, either via the form on our website, or on Twitter. We’d love to hear your ideas. For now, on behalf of ________ and myself, and all of us at ScyllaDB, enjoy the rest of your day.
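
The Prometheus note describes two ideas whose slide code is missing from this transcript: histogram buckets sized for an expected 50µs to 15s latency range, and a timer that is recorded on success and dropped on failure. A minimal sketch of both follows, under stated assumptions. It uses hand-rolled types instead of a metrics crate so that it stays self-contained; the doubling factor is an arbitrary choice, and `exponential_buckets`, `LatencyHistogram` and `time_success` are hypothetical names rather than the talk's actual code.

```rust
use std::time::Instant;

/// Upper bounds in seconds, growing by `factor` from `start` until `max` is covered.
fn exponential_buckets(start: f64, factor: f64, max: f64) -> Vec<f64> {
    assert!(start > 0.0 && factor > 1.0 && max >= start);
    let mut bounds = vec![start];
    while *bounds.last().unwrap() < max {
        let next = bounds.last().unwrap() * factor;
        bounds.push(next);
    }
    bounds
}

/// Minimal cumulative histogram, for illustration only.
struct LatencyHistogram {
    bounds: Vec<f64>,
    counts: Vec<u64>, // counts[i] = observations <= bounds[i]
    total: u64,
}

impl LatencyHistogram {
    fn new(bounds: Vec<f64>) -> Self {
        let counts = vec![0; bounds.len()];
        Self { bounds, counts, total: 0 }
    }

    fn observe(&mut self, seconds: f64) {
        self.total += 1;
        for (bound, count) in self.bounds.iter().zip(self.counts.iter_mut()) {
            if seconds <= *bound {
                *count += 1;
            }
        }
    }

    /// Times `op` and records the duration only when it succeeds,
    /// so failed queries do not pollute the latency distribution.
    fn time_success<T, E>(&mut self, op: impl FnOnce() -> Result<T, E>) -> Result<T, E> {
        let started = Instant::now();
        let result = op();
        if result.is_ok() {
            self.observe(started.elapsed().as_secs_f64());
        }
        result
    }
}

fn main() {
    // Start at 50µs and double until the 15s write timeout is covered.
    let mut hist = LatencyHistogram::new(exponential_buckets(50e-6, 2.0, 15.0));
    let _ = hist.time_success(|| Ok::<_, String>("row inserted"));
    let _ = hist.time_success(|| Err::<(), _>("write timeout".to_string()));
    println!("{} buckets, {} recorded", hist.bounds.len(), hist.total);
}
```

Exponential spacing keeps resolution fine where most observations fall (the microsecond range) while still covering the timeout with a small number of buckets. A smaller factor gives smoother quantile estimates at the cost of more time series.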