Eyal Gutkind, VP of Solutions
Maheedhar Gunturu, Solutions Architect
Sizing your Scylla Cluster
A walk through the process
Presenters
Maheedhar Gunturu, Solutions Architect
Prior to ScyllaDB, Maheedhar held senior roles both in engineering and sales
organizations. He has over a decade of experience designing & developing server-side
applications in the cloud and working on big data and ETL frameworks in companies
such as Samsung, MapR, Apple, VoltDB, Zscaler and Qualcomm. He holds a master's in
Electrical and Computer Engineering from the University of Texas at San Antonio.
Eyal Gutkind, VP of Solutions
Eyal Gutkind is VP of Solutions at ScyllaDB. Prior to joining ScyllaDB, Eyal held product
management roles at Mirantis and DataStax, and spent 12 years with Mellanox
Technologies in various engineering management and product marketing roles. Eyal
holds a BSc. degree in Electrical and Computer Engineering from Ben Gurion
University in Israel and an MBA from Fuqua School of Business at Duke University.
Agenda
+ Understand your workload
+ The machines
+ Let’s build a system
+ How it will look at different IaaS
+ Our sizing process
Thinking about database cluster sizing?!
Make sure you have all requirements set
■ Business
■ Application
■ Infrastructure
■ Resiliency
■ Developer and Operator Friendliness
The obvious...
■ Data volume ingested per second/hour/day/year
■ Data retention (attrition) policy
■ Data format: text, binary blob
■ Required replication factor
■ What’s in your storage system?
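These inputs translate directly into a first storage estimate. A minimal sketch (the ingest rate, retention period, and replication factor below are hypothetical placeholders, and compaction/compression overheads are ignored):

```python
def raw_storage_bytes(ingest_bytes_per_sec, retention_days, replication_factor):
    """Estimate on-disk data volume from ingest rate, retention, and RF."""
    retention_seconds = retention_days * 24 * 3600
    return ingest_bytes_per_sec * retention_seconds * replication_factor

# Hypothetical workload: 1 MB/s ingest, 90-day retention, RF=3
total = raw_storage_bytes(1_000_000, 90, 3)
print(f"{total / 1e12:.2f} TB")  # ~23.33 TB
```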
The shape of your workload
https://www.scylladb.com/2019/05/23/workload-prioritization-running-oltp-and-olap-traffic-on-the-same-superhighway/
Impact of data model
■ Materialized View, Secondary Index
● Increased CPU and I/O
■ Partition size and per-operation payload
■ Tables and keyspaces
Impact of auxiliary systems
■ Consider the workload from
● Spark
● Kafka/Nifi
● KairosDB
● JanusGraph
The Machines
Someone needs to do the job!
Memory and data volume
■ Keep the ratio of disk space to memory reasonable
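As a rough illustration, assuming a guideline of roughly 30:1 disk-to-RAM for NVMe-backed nodes (a commonly cited ballpark; verify against current ScyllaDB guidance), you can sanity-check a node shape like this (the i3.4xlarge figures are assumed specs):

```python
def disk_to_ram_ratio(disk_gb, ram_gb):
    """Ratio of local disk capacity to RAM for one node."""
    return disk_gb / ram_gb

# i3.4xlarge-class node: ~3,800 GB NVMe, ~122 GB RAM (assumed specs)
ratio = disk_to_ram_ratio(3800, 122)
print(f"{ratio:.0f}:1")  # ~31:1, close to the ~30:1 ballpark
```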
Connectivity
■ Enable enough bandwidth for client-server and server-server communication
■ Geo-replication benefits
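A back-of-the-envelope bandwidth estimate, using the workload figures from the worked example in this deck (150,000 ops/s, 3KB records, 30% writes, RF=3) and ignoring protocol overhead:

```python
def client_bandwidth_mb_s(ops_per_sec, payload_bytes):
    """Client <-> server traffic, payload only (protocol overhead ignored)."""
    return ops_per_sec * payload_bytes / 1e6

def replication_bandwidth_mb_s(write_ops_per_sec, payload_bytes, rf):
    """Server <-> server traffic: each write is forwarded to RF-1 replicas."""
    return write_ops_per_sec * payload_bytes * (rf - 1) / 1e6

# 150,000 ops/s total, 3 KB records, 30% writes, RF=3
client = client_bandwidth_mb_s(150_000, 3_000)          # 450.0 MB/s
replica = replication_bandwidth_mb_s(45_000, 3_000, 3)  # 270.0 MB/s
print(client, replica)
```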
Let’s build a system
Business requirements:
■ 1st year customers: 30M profiles
■ Number of records per profile: 12
■ Avg. size of each record: 3KB
■ Data types: text
Application requirements:
■ 99% write response time: 15ms
■ 99% read response time: 10ms
■ Peak throughput: 150,000 operations/sec
■ Read:Write ratio: 70:30
Infrastructure:
■ AWS
■ Available instance types: all
■ Multi-DC: Oregon and N. Virginia
■ OS: CentOS 7.6
■ Replication Factor: 3
Auxiliary applications:
■ Spark
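The business requirements above pin down the first-year raw data volume:

```python
profiles = 30_000_000        # 1st-year customers
records_per_profile = 12
record_bytes = 3_000         # 3 KB average record

raw_bytes = profiles * records_per_profile * record_bytes
print(f"{raw_bytes / 1e12:.2f} TB")  # ~1.08 TB of raw (unreplicated) data
```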
How it looks at different IaaS
Let’s build a system
■ End-of-year total raw data size: 2TB, starting at ~1TB
■ Typical record size read/written by the application: 3KB
■ Data model: 20-30 tables, up to 20 columns per row, 10-50 rows per
partition, mainly text
■ Latency requirements: 15ms write, 10ms read, at the 99th percentile
■ Read:Write ratio: 70:30
■ Throughput: 150,000 database op/s
■ IaaS: AWS, multi-region, multi-availability-zone, N. Virginia and
Oregon
■ Replication Factor: 3
■ Spark for analytics
Let’s build a system- recap
■ ~10,000 operations per second per physical core
■ We use STCS, so plan for 2x the disk space of the replicated data
■ Weigh latency requirements against the amount of RAM needed to raise the cache hit ratio
■ Make sure the cluster uses a reasonable number of tables/keyspaces
■ Any usage of Materialized Views, secondary indexes, or auxiliary systems?
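Putting the recap's rules of thumb together for the worked example (150,000 ops/s peak, 2TB end-of-year raw data, RF=3):

```python
OPS_PER_PHYSICAL_CORE = 10_000   # rule of thumb from the recap
STCS_HEADROOM = 2                # STCS can need up to 2x the replicated data

peak_ops = 150_000
raw_tb = 2.0                     # end-of-year raw data size
rf = 3

cores_needed = peak_ops / OPS_PER_PHYSICAL_CORE   # 15 physical cores (30 threads)
disk_tb_needed = raw_tb * rf * STCS_HEADROOM      # 12 TB per data center
print(cores_needed, disk_tb_needed)               # 15.0 12.0
```

These two numbers are exactly the "Requires 30 threads, 15 physical cores" and "Needed disk space: 12TB" figures repeated on the per-IaaS slides that follow.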
Let’s build a system, Amazon Web Services
■ Per Data Center
● Needed disk space: 12TB
● Media type: NVMe drives to meet latency SLA
● Requires 30 threads, 15 physical cores
■ Per Data center instance options
● 3 x i3.4xlarge → Total disk: 11.5TB
and
● 3 x i3.2xlarge for the Spark cluster
● 1 x i3.2xlarge for Scylla Monitoring and Scylla Manager
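A quick check of the two AWS options against the 12TB-per-DC target. The per-node NVMe figures below are assumptions based on AWS's published specs for these instance types; verify them against current AWS documentation:

```python
# (node count, per-node NVMe capacity in TB) - assumed AWS specs
options = {
    "3 x i3.4xlarge":   (3, 3.8),   # 2 x 1.9 TB NVMe per node
    "3 x i3en.3xlarge": (3, 7.5),   # 2 x 3.75 TB NVMe per node
}

needed_tb = 12
for name, (nodes, per_node_tb) in options.items():
    total_tb = nodes * per_node_tb
    print(f"{name}: {total_tb:.1f} TB total (target: {needed_tb} TB)")
```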
Let’s build a system with i3en, AWS
■ Per Data Center
● Needed disk space: 12TB
● Media type: NVMe drives to meet latency SLA
● Requires 30 threads, 15 physical cores
■ Per Data center instance options
● 3 x i3en.3xlarge → Total disk: 22.5TB!
and
● 3 x i3.2xlarge for the Spark cluster
● 1 x i3.2xlarge for Scylla Monitoring and Scylla Manager
Let’s build a system, Azure
■ Per Data Center
● Needed disk space: 12TB
● Media type: NVMe drives to meet latency SLA
● Requires 30 threads, 15 physical cores
■ Per Data center instance options
● 3 x standard L16 v2 → Total disk: 11.5TB
and
● 3 x standard L8 v2 for the Spark cluster
● 1 x standard L8 v2 for Scylla Monitoring and Scylla Manager
Let’s build a system, Google Cloud
■ Per Data Center
● Needed disk space: 12TB
● Media type: NVMe drives to meet latency SLA
● Requires 30 threads, 15 physical cores
■ Per Data center instance options
● 6 x n1-standard-16 + 5x NVMe based
direct attached, 375GB drives
and
● 3 x n1-standard-8 for the Spark cluster
● 1 x n1-standard-8 for Scylla Monitoring and Scylla Manager
Let’s build a system, Scylla Cloud
■ Per Data Center
● Needed disk space: 12TB
● Media type: NVMe drives to meet latency SLA
● Requires 30 threads, 15 physical cores
■ Per Data center instance options
● 3 x i3.4xlarge → Total disk: 11.5TB
and
● 3 x i3.2xlarge for the Spark cluster
Let’s build a system, on premise
■ Per Data Center
● Needed disk space: 12TB
● Media type: NVMe drives to meet latency SLA
● Requires 30 threads, 15 physical cores
■ Per Data center instance options
● 3 machines with 8 physical cores each and at least 4TB of SSD direct attached drives
● 3 machines with 4 or more physical cores for Spark
● 1 x machine for Scylla Monitoring and Scylla Manager
● Scylla nodes and Spark nodes: 128GB RAM
● Network: 10GbE
Our sizing process
Scylla’s sizing sheet
Summary
■ Do not think only about storage!
■ Gather application and business requirements
● Throughput and SLAs
● Growth expectations
● Security and compliance needs
■ Select the right infrastructure
■ Think about resiliency and high availability
Ask us questions!
Thank you! Stay in touch
Any questions?
Eyal Gutkind
eyal@scylladb.com
@gutkinde
Maheedhar Gunturu
maheedhar@scylladb.com
@vanguard_space
