SlideShare a Scribd company logo
1 of 31
High-load storage of users’ actions
with Scylla and HDDs
Kirill Alekseev
Kirill Alekseev
Software Engineering Team Lead, Mail.Ru Group
■ Software Engineering Team Lead @ Mail Service @ Mail.Ru Group
■ Master’s degree in Computer Science in 2019 @ Lomonosov Moscow State
University
■ Love coding, music and parties
YOUR COMPANY
LOGO HERE
2
19 million
unique real users DAU
47 million
unique real users MAU
3
1 000 000
letters per minute
4
High-load storage of
users’ actions
Service overview
5
6
Service overview
Service overview
Basically, actions history is a time series
of actions stored by email:
7
user | system.totimestamp(time) | ip | project_id | action_id
-----------------+--------------------------------------+---------------+------------+------------
test@mail.ru | 2020-11-15 15:22:46.000000+0000 | 172.27.56.34 | 3 | 4
test@mail.ru | 2020-11-15 15:22:45.000000+0000 | 172.27.56.34 | 3 | 13
test@mail.ru | 2020-11-15 15:22:41.000000+0000 | 172.27.56.34 | 3 | 20
test@mail.ru | 2020-11-15 15:22:23.000000+0000 | 172.27.56.34 | 3 | 4
test@mail.ru | 2020-11-15 15:22:22.000000+0000 | 172.27.56.34 | 3 | 120
8
HTTP API
Mail Service Cloud Service Calendar Service
write action by user read a list of actions by user
65000
peak API write RPS
50
peak API read RPS
9
Problems of previous storage
The previous storage had the following problems:
▪ poor scalability
▪ difficult to maintain
▪ lack of must-have DBMS features (secondary indexes, tunable
replication, query language etc)
10
Scylla as a storage
for users’ actions
Cluster and data model overview, hardware specs
11
12
HTTP API
Mail Service Cloud Service Calendar Service
write action by user read a list of actions by user
Cluster overview
▪ 2 DCs, 4+5 nodes, RF=1 inside each DC
▪ CL=ONE for writes/reads
▪ Bare metal
• 2 x Intel Xeon Gold 6230
• 6 x 32GB DDR4 2666 MHz
• 2 x SATA SSD 1TB RAID 1 for clogs, 10 x HDD 16TB RAID 10 for data
• 10 Gb/s Network
13
CREATE TABLE becca.events (
user text, year smallint, week tinyint,
time timeuuid,
project_id smallint, event_id smallint,
ip inet, args map<text, text>,
PRIMARY KEY ((user, year, week, project_id), time)
) WITH CLUSTERING ORDER BY (time DESC)
Data model
▪ Partition is a list of actions sorted by time
▪ Partition is identified by user, year, week and project
▪ Thanks to promoted index, large partitions can be iterated using the ‘time’
column
▪ We use Time Window Compaction Strategy with size of 1 week
14
Reading by a secondary key
15
▪ Out-of-the-box secondary indexes give unpredictable
performance and lots of random IO
▪ Materialized views require a read-before-update for every
write operation (not gonna work with HDDs)
▪ Duplicating writes to a separate table by a different partition
key
CREATE TABLE becca.events_by_ip (
ip text, year smallint, week tinyint,
user text, time timeuuid,
project_id smallint, event_id smallint,
ip inet, args map<text, text>,
PRIMARY KEY ((ip, year, week, project_id), time, user)
) WITH CLUSTERING ORDER BY (time DESC)
Secondary key data model
▪ Requires 2x space and 2x write load
▪ Gives predictable performance on reads
16
240 000
writes per second
95% ~1.5ms, 99.9% ~22ms
17
+4TB
of compressed
data every week
18
10 (100 peak)
reads per second
Avg ~120ms, 95% ~400ms,
99.9% ~650ms
19
Using Scylla with HDDs
Potential problems and possible solutions to them
20
num-io-queues
21
▪ num-io-queues stands for a number of threads
that interact with disks
▪ You have to find your sweet spot so that throughput is
optimal and latencies are ok (Little’s Law)
▪ 10 HDDs in RAID 10 provide the maximum concurrency of
5 for writes, set num-io-queues to 4-5
Cluster repairs
22
▪ nodetool repair does not finish in acceptable
time (months)
▪ nodetool repair overloads cluster (read
latencies grow 4 times)
▪ We came up with a more IO-efficient way to repair a
cluster in our case
Cluster repairs
23
week
42
week
43
week
44
week
45
week
46
time
week
45
week
45
week
42
week
43
week
44
week
45
week
46
time
week
45
Cluster repairs
24
▪ nodetool refresh will finish quickly
▪ compactions of new data will be triggered, but the
cluster will not be overloaded
▪ compactions will finish in a couple of hours
Problems yet to be solved
25
The following problems are yet to be solved:
▪ latencies grow during compactions, cleanup, bootstrap
▪ latencies grow when a node is down
▪ slow bootstrapping
$150 000
CAPEX saved per 1PB
26
Conclusion
27
Results
We have achieved the following results:
▪ we have built a high-load service for storing users’
actions with Scylla and HDDs
▪ the given service is able to handle 240 000 writes per second
with 95% of timing equal to 1.5ms with just a few Scylla nodes
▪ we have implemented an approach to serve reads by a
secondary key with predictable performance
28
Future work
In 2021:
▪ third DC
▪ optimize Scylla and clients to get even better latencies
▪ integrate Scylla into more projects
29
Special Thanks
I would like to give special thanks to:
▪ Dmitry Pavlov, Pavel Buchinchik, Igor Platonov
▪ Vladislav Zolotarov, Avi Kivity, Raphael Carvalho
▪ The whole ScyllaDB team
31
Thank You
gibsn@mail.ru
Kirill Alekseev
32

More Related Content

What's hot

How Workload Prioritization Reduces Your Datacenter Footprint
How Workload Prioritization Reduces Your Datacenter FootprintHow Workload Prioritization Reduces Your Datacenter Footprint
How Workload Prioritization Reduces Your Datacenter FootprintScyllaDB
 
How to be Successful with Scylla
How to be Successful with ScyllaHow to be Successful with Scylla
How to be Successful with ScyllaScyllaDB
 
Using ScyllaDB with JanusGraph for Cyber Security
Using ScyllaDB with JanusGraph for Cyber SecurityUsing ScyllaDB with JanusGraph for Cyber Security
Using ScyllaDB with JanusGraph for Cyber SecurityScyllaDB
 
Building a Distributed Data Streaming Architecture for Modern Hardware with S...
Building a Distributed Data Streaming Architecture for Modern Hardware with S...Building a Distributed Data Streaming Architecture for Modern Hardware with S...
Building a Distributed Data Streaming Architecture for Modern Hardware with S...ScyllaDB
 
Sizing Your Scylla Cluster
Sizing Your Scylla ClusterSizing Your Scylla Cluster
Sizing Your Scylla ClusterScyllaDB
 
Scylla’s Journey Towards Being an Elastic Cloud Native Database
Scylla’s Journey Towards Being an Elastic Cloud Native DatabaseScylla’s Journey Towards Being an Elastic Cloud Native Database
Scylla’s Journey Towards Being an Elastic Cloud Native DatabaseScyllaDB
 
FireEye & Scylla: Intel Threat Analysis Using a Graph Database
FireEye & Scylla: Intel Threat Analysis Using a Graph DatabaseFireEye & Scylla: Intel Threat Analysis Using a Graph Database
FireEye & Scylla: Intel Threat Analysis Using a Graph DatabaseScyllaDB
 
How SkyElectric Uses Scylla to Power Its Smart Energy Platform
How SkyElectric Uses Scylla to Power Its Smart Energy PlatformHow SkyElectric Uses Scylla to Power Its Smart Energy Platform
How SkyElectric Uses Scylla to Power Its Smart Energy PlatformScyllaDB
 
Scylla Summit 2018: Scylla 3.0 and Beyond
Scylla Summit 2018: Scylla 3.0 and BeyondScylla Summit 2018: Scylla 3.0 and Beyond
Scylla Summit 2018: Scylla 3.0 and BeyondScyllaDB
 
Performance Testing: Scylla vs. Cassandra vs. Datastax
Performance Testing: Scylla vs. Cassandra vs. DatastaxPerformance Testing: Scylla vs. Cassandra vs. Datastax
Performance Testing: Scylla vs. Cassandra vs. DatastaxScyllaDB
 
Introducing Scylla Open Source 4.0
Introducing Scylla Open Source 4.0Introducing Scylla Open Source 4.0
Introducing Scylla Open Source 4.0ScyllaDB
 
How ReversingLabs Serves File Reputation Service for 10B Files
How ReversingLabs Serves File Reputation Service for 10B FilesHow ReversingLabs Serves File Reputation Service for 10B Files
How ReversingLabs Serves File Reputation Service for 10B FilesScyllaDB
 
GPS Insight on Using Presto with Scylla for Data Analytics and Data Archival
GPS Insight on Using Presto with Scylla for Data Analytics and Data ArchivalGPS Insight on Using Presto with Scylla for Data Analytics and Data Archival
GPS Insight on Using Presto with Scylla for Data Analytics and Data ArchivalScyllaDB
 
ScyllaDB's Avi Kivity on UDF, UDA, and the Future
ScyllaDB's Avi Kivity on UDF, UDA, and the FutureScyllaDB's Avi Kivity on UDF, UDA, and the Future
ScyllaDB's Avi Kivity on UDF, UDA, and the FutureScyllaDB
 
Scylla Summit 2019 Keynote - Avi Kivity
Scylla Summit 2019 Keynote - Avi KivityScylla Summit 2019 Keynote - Avi Kivity
Scylla Summit 2019 Keynote - Avi KivityScyllaDB
 
Scylla Summit 2022: How ScyllaDB Powers This Next Tech Cycle
Scylla Summit 2022: How ScyllaDB Powers This Next Tech CycleScylla Summit 2022: How ScyllaDB Powers This Next Tech Cycle
Scylla Summit 2022: How ScyllaDB Powers This Next Tech CycleScyllaDB
 
Captial One: Why Stream Data as Part of Data Transformation?
Captial One: Why Stream Data as Part of Data Transformation?Captial One: Why Stream Data as Part of Data Transformation?
Captial One: Why Stream Data as Part of Data Transformation?ScyllaDB
 
Scylla Summit 2019 Keynote - Dor Laor - Beyond Cassandra
Scylla Summit 2019 Keynote - Dor Laor - Beyond CassandraScylla Summit 2019 Keynote - Dor Laor - Beyond Cassandra
Scylla Summit 2019 Keynote - Dor Laor - Beyond CassandraScyllaDB
 
Seastar Summit 2019 vectorized.io
Seastar Summit 2019   vectorized.ioSeastar Summit 2019   vectorized.io
Seastar Summit 2019 vectorized.ioScyllaDB
 
Powering a Graph Data System with Scylla + JanusGraph
Powering a Graph Data System with Scylla + JanusGraphPowering a Graph Data System with Scylla + JanusGraph
Powering a Graph Data System with Scylla + JanusGraphScyllaDB
 

What's hot (20)

How Workload Prioritization Reduces Your Datacenter Footprint
How Workload Prioritization Reduces Your Datacenter FootprintHow Workload Prioritization Reduces Your Datacenter Footprint
How Workload Prioritization Reduces Your Datacenter Footprint
 
How to be Successful with Scylla
How to be Successful with ScyllaHow to be Successful with Scylla
How to be Successful with Scylla
 
Using ScyllaDB with JanusGraph for Cyber Security
Using ScyllaDB with JanusGraph for Cyber SecurityUsing ScyllaDB with JanusGraph for Cyber Security
Using ScyllaDB with JanusGraph for Cyber Security
 
Building a Distributed Data Streaming Architecture for Modern Hardware with S...
Building a Distributed Data Streaming Architecture for Modern Hardware with S...Building a Distributed Data Streaming Architecture for Modern Hardware with S...
Building a Distributed Data Streaming Architecture for Modern Hardware with S...
 
Sizing Your Scylla Cluster
Sizing Your Scylla ClusterSizing Your Scylla Cluster
Sizing Your Scylla Cluster
 
Scylla’s Journey Towards Being an Elastic Cloud Native Database
Scylla’s Journey Towards Being an Elastic Cloud Native DatabaseScylla’s Journey Towards Being an Elastic Cloud Native Database
Scylla’s Journey Towards Being an Elastic Cloud Native Database
 
FireEye & Scylla: Intel Threat Analysis Using a Graph Database
FireEye & Scylla: Intel Threat Analysis Using a Graph DatabaseFireEye & Scylla: Intel Threat Analysis Using a Graph Database
FireEye & Scylla: Intel Threat Analysis Using a Graph Database
 
How SkyElectric Uses Scylla to Power Its Smart Energy Platform
How SkyElectric Uses Scylla to Power Its Smart Energy PlatformHow SkyElectric Uses Scylla to Power Its Smart Energy Platform
How SkyElectric Uses Scylla to Power Its Smart Energy Platform
 
Scylla Summit 2018: Scylla 3.0 and Beyond
Scylla Summit 2018: Scylla 3.0 and BeyondScylla Summit 2018: Scylla 3.0 and Beyond
Scylla Summit 2018: Scylla 3.0 and Beyond
 
Performance Testing: Scylla vs. Cassandra vs. Datastax
Performance Testing: Scylla vs. Cassandra vs. DatastaxPerformance Testing: Scylla vs. Cassandra vs. Datastax
Performance Testing: Scylla vs. Cassandra vs. Datastax
 
Introducing Scylla Open Source 4.0
Introducing Scylla Open Source 4.0Introducing Scylla Open Source 4.0
Introducing Scylla Open Source 4.0
 
How ReversingLabs Serves File Reputation Service for 10B Files
How ReversingLabs Serves File Reputation Service for 10B FilesHow ReversingLabs Serves File Reputation Service for 10B Files
How ReversingLabs Serves File Reputation Service for 10B Files
 
GPS Insight on Using Presto with Scylla for Data Analytics and Data Archival
GPS Insight on Using Presto with Scylla for Data Analytics and Data ArchivalGPS Insight on Using Presto with Scylla for Data Analytics and Data Archival
GPS Insight on Using Presto with Scylla for Data Analytics and Data Archival
 
ScyllaDB's Avi Kivity on UDF, UDA, and the Future
ScyllaDB's Avi Kivity on UDF, UDA, and the FutureScyllaDB's Avi Kivity on UDF, UDA, and the Future
ScyllaDB's Avi Kivity on UDF, UDA, and the Future
 
Scylla Summit 2019 Keynote - Avi Kivity
Scylla Summit 2019 Keynote - Avi KivityScylla Summit 2019 Keynote - Avi Kivity
Scylla Summit 2019 Keynote - Avi Kivity
 
Scylla Summit 2022: How ScyllaDB Powers This Next Tech Cycle
Scylla Summit 2022: How ScyllaDB Powers This Next Tech CycleScylla Summit 2022: How ScyllaDB Powers This Next Tech Cycle
Scylla Summit 2022: How ScyllaDB Powers This Next Tech Cycle
 
Captial One: Why Stream Data as Part of Data Transformation?
Captial One: Why Stream Data as Part of Data Transformation?Captial One: Why Stream Data as Part of Data Transformation?
Captial One: Why Stream Data as Part of Data Transformation?
 
Scylla Summit 2019 Keynote - Dor Laor - Beyond Cassandra
Scylla Summit 2019 Keynote - Dor Laor - Beyond CassandraScylla Summit 2019 Keynote - Dor Laor - Beyond Cassandra
Scylla Summit 2019 Keynote - Dor Laor - Beyond Cassandra
 
Seastar Summit 2019 vectorized.io
Seastar Summit 2019   vectorized.ioSeastar Summit 2019   vectorized.io
Seastar Summit 2019 vectorized.io
 
Powering a Graph Data System with Scylla + JanusGraph
Powering a Graph Data System with Scylla + JanusGraphPowering a Graph Data System with Scylla + JanusGraph
Powering a Graph Data System with Scylla + JanusGraph
 

Similar to High-Load Storage of Users’ Actions with ScyllaDB and HDDs

Building a High Performance Analytics Platform
Building a High Performance Analytics PlatformBuilding a High Performance Analytics Platform
Building a High Performance Analytics PlatformSantanu Dey
 
Testing Persistent Storage Performance in Kubernetes with Sherlock
Testing Persistent Storage Performance in Kubernetes with SherlockTesting Persistent Storage Performance in Kubernetes with Sherlock
Testing Persistent Storage Performance in Kubernetes with SherlockScyllaDB
 
KSCOPE 2013: Exadata Consolidation Success Story
KSCOPE 2013: Exadata Consolidation Success StoryKSCOPE 2013: Exadata Consolidation Success Story
KSCOPE 2013: Exadata Consolidation Success StoryKristofferson A
 
Oracle real application_cluster
Oracle real application_clusterOracle real application_cluster
Oracle real application_clusterPrabhat gangwar
 
QCon2016--Drive Best Spark Performance on AI
QCon2016--Drive Best Spark Performance on AIQCon2016--Drive Best Spark Performance on AI
QCon2016--Drive Best Spark Performance on AILex Yu
 
MySQL Cluster 7.3 Performance Tuning - Severalnines Slides
MySQL Cluster 7.3 Performance Tuning - Severalnines SlidesMySQL Cluster 7.3 Performance Tuning - Severalnines Slides
MySQL Cluster 7.3 Performance Tuning - Severalnines SlidesSeveralnines
 
Black friday logs - Scaling Elasticsearch
Black friday logs - Scaling ElasticsearchBlack friday logs - Scaling Elasticsearch
Black friday logs - Scaling ElasticsearchSylvain Wallez
 
Taming Oracle EBS R12.x Accrual Reconciliation Load
Taming Oracle EBS R12.x Accrual Reconciliation LoadTaming Oracle EBS R12.x Accrual Reconciliation Load
Taming Oracle EBS R12.x Accrual Reconciliation LoadJoshua Johnson, MIS
 
High Performance & High Throughput Computing - EUDAT Summer School (Giuseppe ...
High Performance & High Throughput Computing - EUDAT Summer School (Giuseppe ...High Performance & High Throughput Computing - EUDAT Summer School (Giuseppe ...
High Performance & High Throughput Computing - EUDAT Summer School (Giuseppe ...EUDAT
 
Logs @ OVHcloud
Logs @ OVHcloudLogs @ OVHcloud
Logs @ OVHcloudOVHcloud
 
Apache Spark Performance Troubleshooting at Scale, Challenges, Tools, and Met...
Apache Spark Performance Troubleshooting at Scale, Challenges, Tools, and Met...Apache Spark Performance Troubleshooting at Scale, Challenges, Tools, and Met...
Apache Spark Performance Troubleshooting at Scale, Challenges, Tools, and Met...Databricks
 
Alto Desempenho com Java
Alto Desempenho com JavaAlto Desempenho com Java
Alto Desempenho com Javacodebits
 
Performance Scenario: Diagnosing and resolving sudden slow down on two node RAC
Performance Scenario: Diagnosing and resolving sudden slow down on two node RACPerformance Scenario: Diagnosing and resolving sudden slow down on two node RAC
Performance Scenario: Diagnosing and resolving sudden slow down on two node RACKristofferson A
 
Big Lab Problems Solved with Spectrum Scale: Innovations for the Coral Program
Big Lab Problems Solved with Spectrum Scale: Innovations for the Coral ProgramBig Lab Problems Solved with Spectrum Scale: Innovations for the Coral Program
Big Lab Problems Solved with Spectrum Scale: Innovations for the Coral Programinside-BigData.com
 
MySQL Performance - Best practices
MySQL Performance - Best practices MySQL Performance - Best practices
MySQL Performance - Best practices Ted Wennmark
 
MySQL 5.7 in a Nutshell
MySQL 5.7 in a NutshellMySQL 5.7 in a Nutshell
MySQL 5.7 in a NutshellEmily Ikuta
 
Managing your Black Friday Logs - Antonio Bonuccelli - Codemotion Rome 2018
Managing your Black Friday Logs - Antonio Bonuccelli - Codemotion Rome 2018Managing your Black Friday Logs - Antonio Bonuccelli - Codemotion Rome 2018
Managing your Black Friday Logs - Antonio Bonuccelli - Codemotion Rome 2018Codemotion
 
NoCOUG_201411_Patel_Managing_a_Large_OLTP_Database
NoCOUG_201411_Patel_Managing_a_Large_OLTP_DatabaseNoCOUG_201411_Patel_Managing_a_Large_OLTP_Database
NoCOUG_201411_Patel_Managing_a_Large_OLTP_DatabaseParesh Patel
 

Similar to High-Load Storage of Users’ Actions with ScyllaDB and HDDs (20)

Apache Cassandra at Macys
Apache Cassandra at MacysApache Cassandra at Macys
Apache Cassandra at Macys
 
Galaxy Big Data with MariaDB
Galaxy Big Data with MariaDBGalaxy Big Data with MariaDB
Galaxy Big Data with MariaDB
 
Building a High Performance Analytics Platform
Building a High Performance Analytics PlatformBuilding a High Performance Analytics Platform
Building a High Performance Analytics Platform
 
Testing Persistent Storage Performance in Kubernetes with Sherlock
Testing Persistent Storage Performance in Kubernetes with SherlockTesting Persistent Storage Performance in Kubernetes with Sherlock
Testing Persistent Storage Performance in Kubernetes with Sherlock
 
KSCOPE 2013: Exadata Consolidation Success Story
KSCOPE 2013: Exadata Consolidation Success StoryKSCOPE 2013: Exadata Consolidation Success Story
KSCOPE 2013: Exadata Consolidation Success Story
 
Oracle real application_cluster
Oracle real application_clusterOracle real application_cluster
Oracle real application_cluster
 
QCon2016--Drive Best Spark Performance on AI
QCon2016--Drive Best Spark Performance on AIQCon2016--Drive Best Spark Performance on AI
QCon2016--Drive Best Spark Performance on AI
 
MySQL Cluster 7.3 Performance Tuning - Severalnines Slides
MySQL Cluster 7.3 Performance Tuning - Severalnines SlidesMySQL Cluster 7.3 Performance Tuning - Severalnines Slides
MySQL Cluster 7.3 Performance Tuning - Severalnines Slides
 
Black friday logs - Scaling Elasticsearch
Black friday logs - Scaling ElasticsearchBlack friday logs - Scaling Elasticsearch
Black friday logs - Scaling Elasticsearch
 
Taming Oracle EBS R12.x Accrual Reconciliation Load
Taming Oracle EBS R12.x Accrual Reconciliation LoadTaming Oracle EBS R12.x Accrual Reconciliation Load
Taming Oracle EBS R12.x Accrual Reconciliation Load
 
High Performance & High Throughput Computing - EUDAT Summer School (Giuseppe ...
High Performance & High Throughput Computing - EUDAT Summer School (Giuseppe ...High Performance & High Throughput Computing - EUDAT Summer School (Giuseppe ...
High Performance & High Throughput Computing - EUDAT Summer School (Giuseppe ...
 
Logs @ OVHcloud
Logs @ OVHcloudLogs @ OVHcloud
Logs @ OVHcloud
 
Apache Spark Performance Troubleshooting at Scale, Challenges, Tools, and Met...
Apache Spark Performance Troubleshooting at Scale, Challenges, Tools, and Met...Apache Spark Performance Troubleshooting at Scale, Challenges, Tools, and Met...
Apache Spark Performance Troubleshooting at Scale, Challenges, Tools, and Met...
 
Alto Desempenho com Java
Alto Desempenho com JavaAlto Desempenho com Java
Alto Desempenho com Java
 
Performance Scenario: Diagnosing and resolving sudden slow down on two node RAC
Performance Scenario: Diagnosing and resolving sudden slow down on two node RACPerformance Scenario: Diagnosing and resolving sudden slow down on two node RAC
Performance Scenario: Diagnosing and resolving sudden slow down on two node RAC
 
Big Lab Problems Solved with Spectrum Scale: Innovations for the Coral Program
Big Lab Problems Solved with Spectrum Scale: Innovations for the Coral ProgramBig Lab Problems Solved with Spectrum Scale: Innovations for the Coral Program
Big Lab Problems Solved with Spectrum Scale: Innovations for the Coral Program
 
MySQL Performance - Best practices
MySQL Performance - Best practices MySQL Performance - Best practices
MySQL Performance - Best practices
 
MySQL 5.7 in a Nutshell
MySQL 5.7 in a NutshellMySQL 5.7 in a Nutshell
MySQL 5.7 in a Nutshell
 
Managing your Black Friday Logs - Antonio Bonuccelli - Codemotion Rome 2018
Managing your Black Friday Logs - Antonio Bonuccelli - Codemotion Rome 2018Managing your Black Friday Logs - Antonio Bonuccelli - Codemotion Rome 2018
Managing your Black Friday Logs - Antonio Bonuccelli - Codemotion Rome 2018
 
NoCOUG_201411_Patel_Managing_a_Large_OLTP_Database
NoCOUG_201411_Patel_Managing_a_Large_OLTP_DatabaseNoCOUG_201411_Patel_Managing_a_Large_OLTP_Database
NoCOUG_201411_Patel_Managing_a_Large_OLTP_Database
 

More from ScyllaDB

Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
What Developers Need to Unlearn for High Performance NoSQL
What Developers Need to Unlearn for High Performance NoSQLWhat Developers Need to Unlearn for High Performance NoSQL
What Developers Need to Unlearn for High Performance NoSQLScyllaDB
 
Low Latency at Extreme Scale: Proven Practices & Pitfalls
Low Latency at Extreme Scale: Proven Practices & PitfallsLow Latency at Extreme Scale: Proven Practices & Pitfalls
Low Latency at Extreme Scale: Proven Practices & PitfallsScyllaDB
 
Dissecting Real-World Database Performance Dilemmas
Dissecting Real-World Database Performance DilemmasDissecting Real-World Database Performance Dilemmas
Dissecting Real-World Database Performance DilemmasScyllaDB
 
Beyond Linear Scaling: A New Path for Performance with ScyllaDB
Beyond Linear Scaling: A New Path for Performance with ScyllaDBBeyond Linear Scaling: A New Path for Performance with ScyllaDB
Beyond Linear Scaling: A New Path for Performance with ScyllaDBScyllaDB
 
Dissecting Real-World Database Performance Dilemmas
Dissecting Real-World Database Performance DilemmasDissecting Real-World Database Performance Dilemmas
Dissecting Real-World Database Performance DilemmasScyllaDB
 
Database Performance at Scale Masterclass: Workload Characteristics by Felipe...
Database Performance at Scale Masterclass: Workload Characteristics by Felipe...Database Performance at Scale Masterclass: Workload Characteristics by Felipe...
Database Performance at Scale Masterclass: Workload Characteristics by Felipe...ScyllaDB
 
Database Performance at Scale Masterclass: Database Internals by Pavel Emelya...
Database Performance at Scale Masterclass: Database Internals by Pavel Emelya...Database Performance at Scale Masterclass: Database Internals by Pavel Emelya...
Database Performance at Scale Masterclass: Database Internals by Pavel Emelya...ScyllaDB
 
Database Performance at Scale Masterclass: Driver Strategies by Piotr Sarna
Database Performance at Scale Masterclass: Driver Strategies by Piotr SarnaDatabase Performance at Scale Masterclass: Driver Strategies by Piotr Sarna
Database Performance at Scale Masterclass: Driver Strategies by Piotr SarnaScyllaDB
 
Replacing Your Cache with ScyllaDB
Replacing Your Cache with ScyllaDBReplacing Your Cache with ScyllaDB
Replacing Your Cache with ScyllaDBScyllaDB
 
Powering Real-Time Apps with ScyllaDB_ Low Latency & Linear Scalability
Powering Real-Time Apps with ScyllaDB_ Low Latency & Linear ScalabilityPowering Real-Time Apps with ScyllaDB_ Low Latency & Linear Scalability
Powering Real-Time Apps with ScyllaDB_ Low Latency & Linear ScalabilityScyllaDB
 
7 Reasons Not to Put an External Cache in Front of Your Database.pptx
7 Reasons Not to Put an External Cache in Front of Your Database.pptx7 Reasons Not to Put an External Cache in Front of Your Database.pptx
7 Reasons Not to Put an External Cache in Front of Your Database.pptxScyllaDB
 
Getting the most out of ScyllaDB
Getting the most out of ScyllaDBGetting the most out of ScyllaDB
Getting the most out of ScyllaDBScyllaDB
 
NoSQL Database Migration Masterclass - Session 2: The Anatomy of a Migration
NoSQL Database Migration Masterclass - Session 2: The Anatomy of a MigrationNoSQL Database Migration Masterclass - Session 2: The Anatomy of a Migration
NoSQL Database Migration Masterclass - Session 2: The Anatomy of a MigrationScyllaDB
 
NoSQL Database Migration Masterclass - Session 3: Migration Logistics
NoSQL Database Migration Masterclass - Session 3: Migration LogisticsNoSQL Database Migration Masterclass - Session 3: Migration Logistics
NoSQL Database Migration Masterclass - Session 3: Migration LogisticsScyllaDB
 
NoSQL Data Migration Masterclass - Session 1 Migration Strategies and Challenges
NoSQL Data Migration Masterclass - Session 1 Migration Strategies and ChallengesNoSQL Data Migration Masterclass - Session 1 Migration Strategies and Challenges
NoSQL Data Migration Masterclass - Session 1 Migration Strategies and ChallengesScyllaDB
 
ScyllaDB Virtual Workshop
ScyllaDB Virtual WorkshopScyllaDB Virtual Workshop
ScyllaDB Virtual WorkshopScyllaDB
 
DBaaS in the Real World: Risks, Rewards & Tradeoffs
DBaaS in the Real World: Risks, Rewards & TradeoffsDBaaS in the Real World: Risks, Rewards & Tradeoffs
DBaaS in the Real World: Risks, Rewards & TradeoffsScyllaDB
 
Build Low-Latency Applications in Rust on ScyllaDB
Build Low-Latency Applications in Rust on ScyllaDBBuild Low-Latency Applications in Rust on ScyllaDB
Build Low-Latency Applications in Rust on ScyllaDBScyllaDB
 
NoSQL Data Modeling 101
NoSQL Data Modeling 101NoSQL Data Modeling 101
NoSQL Data Modeling 101ScyllaDB
 

More from ScyllaDB (20)

Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
What Developers Need to Unlearn for High Performance NoSQL
What Developers Need to Unlearn for High Performance NoSQLWhat Developers Need to Unlearn for High Performance NoSQL
What Developers Need to Unlearn for High Performance NoSQL
 
Low Latency at Extreme Scale: Proven Practices & Pitfalls
Low Latency at Extreme Scale: Proven Practices & PitfallsLow Latency at Extreme Scale: Proven Practices & Pitfalls
Low Latency at Extreme Scale: Proven Practices & Pitfalls
 
Dissecting Real-World Database Performance Dilemmas
Dissecting Real-World Database Performance DilemmasDissecting Real-World Database Performance Dilemmas
Dissecting Real-World Database Performance Dilemmas
 
Beyond Linear Scaling: A New Path for Performance with ScyllaDB
Beyond Linear Scaling: A New Path for Performance with ScyllaDBBeyond Linear Scaling: A New Path for Performance with ScyllaDB
Beyond Linear Scaling: A New Path for Performance with ScyllaDB
 
Dissecting Real-World Database Performance Dilemmas
Dissecting Real-World Database Performance DilemmasDissecting Real-World Database Performance Dilemmas
Dissecting Real-World Database Performance Dilemmas
 
Database Performance at Scale Masterclass: Workload Characteristics by Felipe...
Database Performance at Scale Masterclass: Workload Characteristics by Felipe...Database Performance at Scale Masterclass: Workload Characteristics by Felipe...
Database Performance at Scale Masterclass: Workload Characteristics by Felipe...
 
Database Performance at Scale Masterclass: Database Internals by Pavel Emelya...
Database Performance at Scale Masterclass: Database Internals by Pavel Emelya...Database Performance at Scale Masterclass: Database Internals by Pavel Emelya...
Database Performance at Scale Masterclass: Database Internals by Pavel Emelya...
 
Database Performance at Scale Masterclass: Driver Strategies by Piotr Sarna
Database Performance at Scale Masterclass: Driver Strategies by Piotr SarnaDatabase Performance at Scale Masterclass: Driver Strategies by Piotr Sarna
Database Performance at Scale Masterclass: Driver Strategies by Piotr Sarna
 
Replacing Your Cache with ScyllaDB
Replacing Your Cache with ScyllaDBReplacing Your Cache with ScyllaDB
Replacing Your Cache with ScyllaDB
 
Powering Real-Time Apps with ScyllaDB_ Low Latency & Linear Scalability
Powering Real-Time Apps with ScyllaDB_ Low Latency & Linear ScalabilityPowering Real-Time Apps with ScyllaDB_ Low Latency & Linear Scalability
Powering Real-Time Apps with ScyllaDB_ Low Latency & Linear Scalability
 
7 Reasons Not to Put an External Cache in Front of Your Database.pptx
7 Reasons Not to Put an External Cache in Front of Your Database.pptx7 Reasons Not to Put an External Cache in Front of Your Database.pptx
7 Reasons Not to Put an External Cache in Front of Your Database.pptx
 
Getting the most out of ScyllaDB
Getting the most out of ScyllaDBGetting the most out of ScyllaDB
Getting the most out of ScyllaDB
 
NoSQL Database Migration Masterclass - Session 2: The Anatomy of a Migration
NoSQL Database Migration Masterclass - Session 2: The Anatomy of a MigrationNoSQL Database Migration Masterclass - Session 2: The Anatomy of a Migration
NoSQL Database Migration Masterclass - Session 2: The Anatomy of a Migration
 
NoSQL Database Migration Masterclass - Session 3: Migration Logistics
NoSQL Database Migration Masterclass - Session 3: Migration LogisticsNoSQL Database Migration Masterclass - Session 3: Migration Logistics
NoSQL Database Migration Masterclass - Session 3: Migration Logistics
 
NoSQL Data Migration Masterclass - Session 1 Migration Strategies and Challenges
NoSQL Data Migration Masterclass - Session 1 Migration Strategies and ChallengesNoSQL Data Migration Masterclass - Session 1 Migration Strategies and Challenges
NoSQL Data Migration Masterclass - Session 1 Migration Strategies and Challenges
 
ScyllaDB Virtual Workshop
ScyllaDB Virtual WorkshopScyllaDB Virtual Workshop
ScyllaDB Virtual Workshop
 
DBaaS in the Real World: Risks, Rewards & Tradeoffs
DBaaS in the Real World: Risks, Rewards & TradeoffsDBaaS in the Real World: Risks, Rewards & Tradeoffs
DBaaS in the Real World: Risks, Rewards & Tradeoffs
 
Build Low-Latency Applications in Rust on ScyllaDB
Build Low-Latency Applications in Rust on ScyllaDBBuild Low-Latency Applications in Rust on ScyllaDB
Build Low-Latency Applications in Rust on ScyllaDB
 
NoSQL Data Modeling 101
NoSQL Data Modeling 101NoSQL Data Modeling 101
NoSQL Data Modeling 101
 

Recently uploaded

How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?XfilesPro
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphNeo4j
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitecturePixlogix Infotech
 
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxMaking_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxnull - The Open Security Community
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptxLBM Solutions
 
Next-generation AAM aircraft unveiled by Supernal, S-A2
Next-generation AAM aircraft unveiled by Supernal, S-A2Next-generation AAM aircraft unveiled by Supernal, S-A2
Next-generation AAM aircraft unveiled by Supernal, S-A2Hyundai Motor Group
 
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your Budget
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your BudgetHyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your Budget
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your BudgetEnjoy Anytime
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
Artificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraArtificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraDeakin University
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksSoftradix Technologies
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...HostedbyConfluent
 

Recently uploaded (20)

How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxMaking_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptx
 
The transition to renewables in India.pdf
The transition to renewables in India.pdfThe transition to renewables in India.pdf
The transition to renewables in India.pdf
 
Next-generation AAM aircraft unveiled by Supernal, S-A2
Next-generation AAM aircraft unveiled by Supernal, S-A2Next-generation AAM aircraft unveiled by Supernal, S-A2
Next-generation AAM aircraft unveiled by Supernal, S-A2
 
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your Budget
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your BudgetHyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your Budget
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your Budget
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
Artificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraArtificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning era
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other Frameworks
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
 

High-Load Storage of Users’ Actions with ScyllaDB and HDDs

  • 1. High-load storage of users’ actions with Scylla and HDDs Kirill Alekseev
  • 2. Kirill Alekseev Software Engineering Team Lead, Mail.Ru Group ■ Software Engineering Team Lead @ Mail Service @ Mail.Ru Group ■ Master’s degree in Computer Science in 2019 @ Lomonosov Moscow State University ■ Love coding, music and parties YOUR COMPANY LOGO HERE 2
  • 3. 19 million unique real users DAU 47 million unique real users MAU 3
  • 4. 1 000 000 letters per minute 4
  • 5. High-load storage of users’ actions Service overview 5
  • 7. Service overview Basically, actions history is a time series of actions stored by email: 7 user | system.totimestamp(time) | ip | project_id | action_id -----------------+--------------------------------------+---------------+------------+------------ test@mail.ru | 2020-11-15 15:22:46.000000+0000 | 172.27.56.34 | 3 | 4 test@mail.ru | 2020-11-15 15:22:45.000000+0000 | 172.27.56.34 | 3 | 13 test@mail.ru | 2020-11-15 15:22:41.000000+0000 | 172.27.56.34 | 3 | 20 test@mail.ru | 2020-11-15 15:22:23.000000+0000 | 172.27.56.34 | 3 | 4 test@mail.ru | 2020-11-15 15:22:22.000000+0000 | 172.27.56.34 | 3 | 120
  • 8. 8 HTTP API Mail Service Cloud Service Calendar Service write action by user read a list of actions by user
  • 9. 65000 peak API write RPS 50 peak API read RPS 9
  • 10. Problems of previous storage The previous storage had the following problems: ▪ poor scalability ▪ difficult to maintain ▪ lack of must-have DBMS features (secondary indexes, tunable replication, query language etc) 10
  • 11. Scylla as a storage for users’ actions Cluster and data model overview, hardware specs 11
  • 12. 12 HTTP API Mail Service Cloud Service Calendar Service write action by user read a list of actions by user
  • 13. Cluster overview ▪ 2 DCs, 4+5 nodes, RF=1 inside each DC ▪ CL=ONE for writes/reads ▪ Bare metal • 2 x Intel Xeon Gold 6230 • 6 x 32GB DDR4 2666 MHz • 2 x SATA SSD 1TB RAID 1 for clogs, 10 x HDD 16TB RAID 10 for data • 10 Gb/s Network 13
  • 14. CREATE TABLE becca.events ( user text, year smallint, week tinyint, time timeuuid, project_id smallint, event_id smallint, ip inet, args map<text, text>, PRIMARY KEY ((user, year, week, project_id), time) ) WITH CLUSTERING ORDER BY (time DESC) Data model ▪ Partition is a list of actions sorted by time ▪ Partition is identified by user, year, week and project ▪ Thanks to promoted index, large partitions can be iterated using the ‘time’ column ▪ We use Time Window Compaction Strategy with size of 1 week 14
  • 15. Reading by a secondary key 15 ▪ Out-of-the-box secondary indexes give unpredictable performance and lots of random IO ▪ Materialized views require a read-before-update for every write operation (not gonna work with HDDs) ▪ Duplicating writes to a separate table by a different partition key
  • 16. CREATE TABLE becca.events_by_ip ( ip text, year smallint, week tinyint, user text, time timeuuid, project_id smallint, event_id smallint, ip inet, args map<text, text>, PRIMARY KEY ((ip, year, week, project_id), time, user) ) WITH CLUSTERING ORDER BY (time DESC) Secondary key data model ▪ Requires 2x space and 2x write load ▪ Gives predictable performance on reads 16
  • 17. 240 000 writes per second 95% ~1.5ms, 99.9% ~22ms 17
  • 19. 10 (100 peak) reads per second Avg ~120ms, 95% ~400ms, 99.9% ~650ms 19
  • 20. Using Scylla with HDDs Potential problems and possible solutions to them 20
  • 21. num-io-queues 21 ▪ num-io-queues stands for a number of threads that interact with disks ▪ You have to find your sweet spot so that throughput is optimal and latencies are ok (Little’s Law) ▪ 10 HDDs in RAID 10 provide the maximum concurrency of 5 for writes, set num-io-queues to 4-5
  • 22. Cluster repairs 22 ▪ nodetool repair does not finish in acceptable time (months) ▪ nodetool repair overloads cluster (read latencies grow 4 times) ▪ We came up with a more IO-efficient way to repair a cluster in our case
  • 24. Cluster repairs 24 ▪ nodetool refresh will finish quickly ▪ compactions of new data will be triggered, but the cluster will not be overloaded ▪ compactions will finish in a couple of hours
  • 25. Problems yet to be solved 25 The following problems are yet to be solved: ▪ latencies grow during compactions, cleanup, bootstrap ▪ latencies grow when a node is down ▪ slow bootstrapping
  • 26. $150 000 CAPEX saved per 1PB 26
  • 28. Results We have achieved the following results: ▪ we have built a high-load service for storing users’ actions with Scylla and HDDs ▪ the given service is able to handle 240 000 writes per second with 95% of timing equal to 1.5ms with just a few Scylla nodes ▪ we have implemented an approach to serve reads by a secondary key with predictable performance 28
  • 29. Future work In 2021: ▪ third DC ▪ optimize Scylla and clients to get even better latencies ▪ integrate Scylla into more projects 29
  • 30. Special Thanks I would like to give special thanks to: ▪ Dmitry Pavlov, Pavel Buchinchik, Igor Platonov ▪ Vladislav Zolotarov, Avi Kivity, Raphael Carvalho ▪ The whole ScyllaDB team 31

Editor's Notes

  1. Let’s talk numbers Does not include bots, only real users
  2. We store every action User may want to see what happened in his mailbox Another examples: investigating possible attacks, sorting out user complaints
  3. The thing that we wanted to replace in this scheme was the storage
  4. Writes prevail 1000 times
  5. The thing that we wanted to replace in this scheme was the storage
  6. Tell why we have different amount of nodes in different dcs Think what to answer to questions about CL=ONE We want to be available when a DC goes down, it’s ok for us to serve inconsistent read requests
  7. All user data is split by weeks and projects
  8. Ambiguous number of network requests to other nodes Can’t trasform all those writes to reads We create another table and duplicate all writes there from the app
  9. All user data is split by weeks and projects
  10. Latencies are measured from client RPS == API rps + RF + secondary index
  11. Remind that we are talking about HDDs
  12. They do not recommend hdds It is reasonable
  13. In ssd setups it will be probably set to some large value like number of shards The most accurate way is to run benchmarks with different values for num-io-queue
  14. Lets say one node failed and we know the exact moment of time when it happened Normally nodetool repair would run full scan but we now the exact moment when problem happened We need to go to nodes from a different DC, transfer data to the affected node and run nodetool repair
  15. Refresh will finish soon, then go compactions that do not overload cluster and in our case finished in 6 hours
  16. Latencies stay in a reasonable range Resharding is slow but faster than repair and does not overaload cluster
  17. Dedicated a whole section for problems with HDDs, what for