How to be Successful with Scylla

Glauber Costa, VP Field Engineering

Presenter
Glauber Costa, VP Field Engineering
Glauber Costa is VP of Field Engineering at ScyllaDB. He splits
his time between the engineering department, working on
upcoming Scylla features, and helping customers succeed.
Before ScyllaDB, Glauber worked on virtualization in the Linux
kernel for 10 years, with contributions ranging from the Xen
hypervisor to all sorts of guest functionality and containers.

Scylla is compatible with
other databases

Welcome Cassandra users
What to remember, what to forget?
■ Remember: Data model and consistency issues.
■ Forget: Operational best practices.
● Users often insist on tuning the system exactly the way they did before.
● Many times that tuning was done to work around an issue.
● The issue may not exist in Scylla.
■ Example: compactions being too slow, compactions being too fast, etc.

Corollary: comparing databases
Wrong way to compare databases:
■ I will now run Scylla the same way I ran Cassandra for the past 5 years
● It will work.
● It will be suboptimal.
Right way to compare databases:
■ This is the work that I need to do, and this is how much it costs me
● Run each offering in its operational sweet spot.

Use an updated version
■ Policy is that only two versions receive updates.
■ For enterprise:
● 2019.1 and 2018.1 supported
■ For Open Source (3.1 is released):
● 3.0 and 3.1 are supported
● 2.3 is end-of-life (EOL)
■ It’s fine to be conservative, but:
● Running 3.0 is conservative
● Running 2.3 is dangerous.
■ Patchlevel updates are very safe, do them.
● but don’t stream between minor versions.

Hardware Selection
What is your bottleneck?
■ CPU
● Understand the per-core capacity of your workload
■ Storage
● Latency: NVMe
● Throughput: SSD
● Forget HDDs.
■ Network
● Forget anything below 1Gbps.

Storage layout
How to best organize many disks
■ RAID0
● Database is replicated, why mirror disks?
■ LVM (in striping mode)
Split commitlog and data?
■ Generally not worth it
● It may help with overwrite-heavy workloads
■ If you have super fast disks lying around, using them for the commitlog is fine

Hardware Sizing
If you knew you’d need 1,000 USD on a trip, would you take exactly 1,000 USD?
■ See Eyal’s presentation on how to size
● But then remember to add spare capacity!
● Test your performance under bootstrap and decommission
● Make sure you know how long it takes to bootstrap and decommission
■ Asking us or relying on estimates is not good enough: it is very data-model dependent

Packing the iron
In which situations should I run more, smaller nodes?

Packing the iron
■ Never
Time to ingest. Dataset grows 2x as machine grows 2x.

Packing the iron
■ Never - if choice is the same amount of resources
■ In practice, ok to smooth out expansion
[Figure: expanding a cluster of 32-core nodes (total 64 cores) versus a cluster of 16-core nodes (total 48 cores); the node added in each case is labeled "expansion".]

Rack awareness
Run as many racks as you have replicas.
■ RF=3 and 3 Racks:
● Scylla will place a full copy in each rack. Perfect balancing, perfect resiliency
[Figure: the data replicated across Rack 1, Rack 2, and Rack 3, holding Replica 1, Replica 2, and Replica 3.]

Rack awareness
Rack failure: 1 copy gone. QUORUM maintained.
[Figure: Rack 1, Rack 2, and Rack 3 holding Replica 1, Replica 2, and Replica 3 of the data; one rack has failed.]

Rack awareness
■ RF=3 and 2 Racks:
● Scylla will place a full copy in each rack and spread the third copy across both racks
■ Rack failure can lead to decreased HA: the needed quorum is down
■ Rack failure can lead to data loss: two copies are lost.
[Figure: Rack 1 and Rack 2 holding Replica 1, Replica 2, and Replica 3 of the data between them.]

Rack awareness
Rack failure, Scenario 1: QUORUM maintained
[Figure: Rack 1 and Rack 2 holding Replica 1, Replica 2, and Replica 3 of the data; one rack has failed.]

Rack awareness
Rack failure, Scenario 2: You may have lost your job.
[Figure: Rack 1 and Rack 2 holding Replica 1, Replica 2, and Replica 3 of the data; one rack has failed.]

Rack awareness
■ All of this depends on using NetworkTopologyStrategy
■ When expanding the cluster, add 3 nodes (one per rack)
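
The rack scenarios above come down to counting how many replicas of a partition remain after a rack fails. A minimal sketch of that arithmetic, assuming QUORUM means a strict majority of the replication factor; the rack names and function names are illustrative and not part of Scylla:

```python
from typing import Dict


def quorum(rf: int) -> int:
    """Replicas that must respond for QUORUM: a strict majority of RF."""
    return rf // 2 + 1


def survives_rack_failure(replicas_per_rack: Dict[str, int]) -> Dict[str, bool]:
    """For each rack, report whether QUORUM is still reachable if it fails.

    replicas_per_rack maps a rack name to the number of replicas of one
    given partition stored in that rack.
    """
    rf = sum(replicas_per_rack.values())
    needed = quorum(rf)
    return {rack: (rf - lost) >= needed for rack, lost in replicas_per_rack.items()}


# RF=3 over 3 racks: one replica of the partition in each rack.
three_racks = survives_rack_failure({"rack1": 1, "rack2": 1, "rack3": 1})
# RF=3 over 2 racks: for this partition, rack1 happens to hold two replicas.
two_racks = survives_rack_failure({"rack1": 2, "rack2": 1})

print(three_racks)
print(two_racks)
```

With three racks every single-rack failure leaves two of three replicas, so QUORUM holds. With two racks the outcome depends on which rack fails, which is the difference between Scenario 1 and Scenario 2.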

Run the setup tool
scylla_setup is constantly updated with knowledge of what’s important
■ If I had to choose one configuration to always enforce:
● SET_NIC_AND_DISKS=yes
● SET_NIC in older versions
■ What does that do:
[Figure: CPU time diagrams labeled "Scylla CPU time" and "Linux SoftIRQ time", shown as two alternatives joined by "OR".]

■ Not yet in setup:
● Clocksource selection (cost of taking a timestamp):
■ TSC clocksource: 26ns
■ Xen clocksource: 100ns
$ cat /sys/devices/system/clocksource/clocksource0/available_clocksource
xen
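
The same check can be scripted. A minimal sketch, assuming a Linux host that exposes the standard sysfs clocksource directory (the `current_clocksource` file sits next to `available_clocksource`); the function names are illustrative:

```python
from pathlib import Path

# Standard Linux sysfs location; the command above reads available_clocksource here.
CLOCKSOURCE_DIR = Path("/sys/devices/system/clocksource/clocksource0")


def read_clocksources(base=CLOCKSOURCE_DIR):
    """Return (current, available) clocksources as reported by the kernel."""
    current = (base / "current_clocksource").read_text().strip()
    available = (base / "available_clocksource").read_text().split()
    return current, available


def tsc_status(base=CLOCKSOURCE_DIR):
    """Summarize whether the cheaper TSC clocksource is in use or available."""
    current, available = read_clocksources(base)
    if current == "tsc":
        return "tsc is already in use"
    if "tsc" in available:
        return f"tsc is available but {current} is in use"
    return f"tsc is not offered; available: {', '.join(available)}"


try:
    print(tsc_status())
except OSError:
    # Not a Linux host, or sysfs is not readable here.
    print("clocksource information is not available on this host")
```

On the host shown above, where only `xen` is listed, this would report that tsc is not offered.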

Prepare your statements
Ad-hoc, rare queries are the only excuse not to prepare statements.
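
A minimal sketch of the prepare-once, execute-many pattern. It assumes `session` behaves like a session object from the Python driver, with `prepare(query)` and `execute(statement, parameters)`; the `sensor_readings` table and its columns are hypothetical:

```python
def insert_readings(session, readings):
    """Prepare the statement once, then bind and execute it for every row.

    `readings` is an iterable of (sensor_id, ts, value) tuples.
    The table and column names are made up for illustration.
    """
    insert = session.prepare(
        "INSERT INTO sensor_readings (sensor_id, ts, value) VALUES (?, ?, ?)"
    )
    count = 0
    for sensor_id, ts, value in readings:
        # Only the bound values travel with each request; the query text
        # was parsed once, at prepare time.
        session.execute(insert, (sensor_id, ts, value))
        count += 1
    return count
```

In a real application the prepared statement would normally be created once at startup and reused across calls, rather than prepared again on every invocation of a helper like this one.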

Partition sizing
Large partitions per se are (almost) not a problem anymore.
■ SELECT * from table where pk = ? and ck = ?; (OK)
■ SELECT * from table where pk = ?; (That’s the issue)

Partition distribution
Bad partition distribution is not an issue unique to Scylla
■ But it can show up sooner in Scylla due to shards.
● That’s a good thing

Understand the caching basics
■ Cache is LRU on rows
● Use BYPASS CACHE for analytical workloads
■ Careful with moving range queries
● SELECT * from time_series where pk = ? and time >= past and time <= now; BAD
● SELECT * from time_series where pk = ? and time >= past; GOOD

Control parallelism
■ Low parallelism hurts Scylla
● Fewer units will be working, database will not be efficient
■ Is there such a thing as too high?

Control parallelism
■ Infinite parallelism is asking for trouble
● I don’t mean ∞
● 4 trillion concurrent requests is infinite in the real world
■ No need to guess:
● C = T x L (concurrency = throughput x latency)
● Example: 200,000 requests/s at 10ms average latency:
■ C = 200,000 * 0.010
■ C = 2,000.
■ Driver settings:
● Number of connections x maximum requests per connection
● Remaining requests will be queued on the application side.
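
The concurrency estimate and the driver settings can be worked out together. A minimal sketch, assuming latency is given in milliseconds; the per-connection limit of 1,024 in-flight requests is a made-up figure for illustration, so check your driver's actual setting:

```python
import math


def required_concurrency(throughput_per_s, latency_ms):
    """C = T x L, with latency converted from milliseconds to seconds."""
    return throughput_per_s * latency_ms / 1000


def connections_needed(concurrency, max_requests_per_connection):
    """Smallest number of connections whose combined capacity covers C."""
    return math.ceil(concurrency / max_requests_per_connection)


concurrency = required_concurrency(200_000, 10)
# 1,024 is an assumed per-connection limit, not a driver default.
connections = connections_needed(concurrency, 1024)

print(f"target concurrency: {concurrency:.0f}")
print(f"connections needed: {connections}")
```

Requests beyond connections x maximum requests per connection wait in the application, so setting the product close to C keeps the database busy without flooding it.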

Be very careful with retries
■ If client timeout < server timeout:
● Retries effectively increase parallelism
● Know the difference between reads, range reads, writes, etc.
■ Be very careful with speculative retry
● Remember it will be retried in the same replica set that just took a long time to respond
● Now it will take even longer because of the extra request
■ And more speculative retries.
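
A rough way to see how a short client timeout multiplies load. This is a simplified worst-case model, not a description of Scylla internals: it assumes the server keeps working on a request until its own timeout expires, and that the client retries immediately each time its timeout fires. The function name is illustrative:

```python
import math


def worst_case_attempts_in_flight(client_timeout_s, server_timeout_s, max_retries):
    """Upper bound on simultaneous server-side attempts for ONE logical request.

    Simplified model: every attempt stays alive on the server for the full
    server timeout, and the client retries as soon as its own timeout fires.
    """
    if client_timeout_s >= server_timeout_s:
        # The server gives up first, so attempts never overlap.
        return 1
    overlapping = math.ceil(server_timeout_s / client_timeout_s)
    # The client cannot have more attempts than the original plus its retries.
    return min(overlapping, max_retries + 1)


# Client gives up after 1s, server after 5s, up to 10 retries allowed.
print(worst_case_attempts_in_flight(1, 5, 10))
# Same timeouts, but the retry policy allows a single retry.
print(worst_case_attempts_in_flight(1, 5, 1))
```

Under this model, the concurrency computed from C = T x L is multiplied by the returned factor exactly when the cluster is already slow, which is why the retry policy deserves as much care as the timeout itself.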

Batching writes
Should writes be batched?
■ Latency of the operation is the latency of the batch
● Therefore it may fail.
■ Scylla is token aware, shard aware
● Batches may break that
● Summary: only batch updates to the same partition
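
A minimal sketch of grouping pending writes so that each batch touches a single partition. It only does the grouping; how each group is then sent as a batch depends on the driver. The row layout and the `partition_key` callable are assumptions for illustration:

```python
from collections import defaultdict


def group_by_partition(rows, partition_key):
    """Group rows by partition key so each group can become one batch.

    `partition_key` is a callable that extracts the partition key from a row.
    """
    groups = defaultdict(list)
    for row in rows:
        groups[partition_key(row)].append(row)
    return dict(groups)


# Hypothetical rows: (sensor_id, timestamp, value), partitioned by sensor_id.
pending = [
    ("sensor-a", 1, 0.5),
    ("sensor-b", 1, 0.9),
    ("sensor-a", 2, 0.7),
]
batches = group_by_partition(pending, partition_key=lambda row: row[0])

for key, rows in batches.items():
    print(key, len(rows))
```

Each resulting group targets one partition, so a token-aware, shard-aware driver can route the batch to a replica and shard that own it instead of relying on a coordinator to fan it out.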

Thank you. Stay in touch.
Any questions?
Glauber Costa
glauber@scylladb.com
@glcst
